Illinois Data Bank
Illinois Data Bank Dataset Search Results


published: 2025-07-11
 
This dataset includes experimental data supporting the findings in the manuscript "Magnetostriction and Temperature Dependent Gilbert Damping in Boron Doped Fe80Ga20 Thin Films". It contains raw data for X-Ray diffraction, high resolution transmission electron microscopy, magnetic hysteresis loop measurement, magnetostriction measurement, and temperature dependent magnetic damping measurement.
keywords: magnetostriction; magnetic damping; magnetoelasticity; magnon-phonon coupling
published: 2025-07-09
 
This dataset contains the raw transmission electron microscopy (TEM) and scanning electron microscopy (SEM) images used to calculate the synthesis yield of patchy nanoparticles (NPs), as described in Supplementary Table 1 of the paper "Patchy Nanoparticles by Atomic 'Stencilling'" (2025). All images were taken at the Materials Research Laboratory, University of Illinois at Urbana-Champaign, by the Qian Chen group.
1. The dataset has 21 subfolders, each with a name corresponding to one of the 21 patchy NPs listed in Supplementary Table 1 of the paper.
2. In TEM images, the bright and dark regions indicate the polymer patches and NP cores, respectively.
3. In SEM images, the bright and dark regions indicate the NP cores and polymer patches, respectively.
4. Each subfolder contains a "readme (subfolder name).txt" file with more detailed information about each sample.
keywords: Patchy nanoparticle; polymer; synthesis; self-assembly
published: 2025-04-29
 
This page contains the data for the publication "The pioneer transcription factor Zelda controls the exit from regeneration and restoration of patterning in Drosophila" published in the journal Science Advances.
keywords: Drosophila; regeneration; wing imaginal disc; Zelda
published: 2025-05-10
 
This dataset provides instructions for procedures to use heat transfer analyses to estimate thermal conditions in artificial roosts for bats. The dataset contains scripts to employ in the program GNU Octave, example meteorology data, and example text files specifying roost dimensions and material properties.
keywords: Bat box; design; heat storage; heat transfer analysis; insulation; temperature
published: 2025-06-16
 
Data for the publication "Magnetic Fields in the Pillars of Creation" (Sarkar et al.). Contains the FITS files and Python scripts.
keywords: HAWC+; SOFIA; Pillars of Creation; M16; Eagle Nebula; Dust Polarization
published: 2019-10-27
 
This dataset accompanies the paper "STREETS: A Novel Camera Network Dataset for Traffic Flow" at Neural Information Processing Systems (NeurIPS) 2019. Included are:
* Over four million still images from publicly accessible cameras in Lake County, IL. The images were collected across 2.5 months in 2018 and 2019.
* Directed graphs describing the camera network structure in two communities in Lake County.
* Documented non-recurring traffic incidents in Lake County coinciding with the 2018 data.
* Traffic counts for each day of images in the dataset. These counts track the volume of traffic in each community.
* Other annotations and files useful for computer vision systems.
Refer to the accompanying "readme.txt" or "readme.pdf" for further details.
keywords: camera network; suburban vehicular traffic; roadways; computer vision
published: 2025-04-17
 
This dataset includes the code used to analyze data from swapping photons between superconducting qubits in separate modules through a superconducting coaxial cable bus. The dataset includes Python code to model and plot the data, CAD designs of the modules that hold the superconducting qubits, and high-frequency simulation software files to model the electric fields of the superconducting circuits.
keywords: superconducting qubits; quantum information; modular architecture
published: 2025-06-30
 
This dataset contains measurements of water loss as white-tailed deer (Odocoileus virginianus) retropharyngeal lymph nodes air-dried in a refrigerator for 31 days. Lymph node weights were recorded every 24 hours, as were the variables "firmness" and "surface wetness". "Firmness" is a categorical variable measuring how much the tissue deforms to the touch (soft, medium, or hard). "Surface wetness" is the amount of visible moisture on the outside of the lymph node (all, some, or none). Lymph node weights were measured until they stabilized for 3 consecutive days at two decimal places (e.g., 3.02, 3.02, 3.02) or until they fluctuated by only 0.01 (e.g., 3.02, 3.03, 3.02). Lymph nodes were from northern Illinois white-tailed deer collected as part of the Illinois Department of Natural Resources' ongoing chronic wasting disease (CWD) management efforts.
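The stopping rule in this description can be sketched in a few lines of Python. This is only one reading of the rule, not code from the dataset: it assumes the 0.01 fluctuation allowance applies to the same 3-day window as the exact-match criterion, and the function names and the longer example series are made up for illustration.

```python
from decimal import Decimal


def weights_stabilized(daily_weights, window=3, tolerance="0.01"):
    """Return True if the last `window` weights, rounded to two decimals,
    differ by no more than `tolerance` (assumed reading of the rule)."""
    if len(daily_weights) < window:
        return False
    # Decimal avoids binary floating-point noise when comparing hundredths.
    recent = [Decimal(str(w)).quantize(Decimal("0.01"))
              for w in daily_weights[-window:]]
    return max(recent) - min(recent) <= Decimal(tolerance)


def first_stable_day(daily_weights, window=3):
    """Return the 1-based day on which the rule is first met, or None."""
    for day in range(window, len(daily_weights) + 1):
        if weights_stabilized(daily_weights[:day], window):
            return day
    return None


# The two triplets come from the description; the longer series is invented.
print(weights_stabilized([3.02, 3.02, 3.02]))
print(weights_stabilized([3.02, 3.03, 3.02]))
print(first_stable_day([3.40, 3.21, 3.10, 3.02, 3.03, 3.02]))
```

Treating both criteria as a single "range within 0.01" check keeps the sketch short; a stricter reading would test them separately.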
keywords: cervid; lymph node; chronic wasting disease; cwd; diagnostic testing; desiccation; drying; tissue
published: 2025-06-30
 
This dataset is associated with the manuscript "Residual tau-fluvalinate, a beehive acaricide, disrupts growth and metabolism in the greater wax moth, Galleria mellonella". This dataset includes 2 Excel files:
1) raw_data_bioassay.xlsx: this file contains the raw data for the waxworm bioassay. There are 2 worksheets within this file:
- LC50: raw data for measuring the LC50 of Galleria mellonella (greater wax moth) in laboratory and field strains exposed to tau-fluvalinate.
- RGR: Relative Growth Rate, raw data for measuring body weight of field strain of Galleria mellonella exposed to tau-fluvalinate.
2) raw-data_RT-qPCR.xlsx: this file contains raw data (Ct value) of RT-qPCR.
keywords: Apis mellifera; cytochrome P450; tau-fluvalinate; detoxification genes; waxworm
published: 2025-06-26
 
This dataset contains experimental results supporting the upcoming journal paper "Laboratory-scale assessment of CO2 sealing potential for heterogeneous caprock". It includes the measurements and analyses conducted under controlled laboratory conditions, capturing sealing-potential properties such as permeability and breakthrough pressure.
keywords: Heterogeneity; CO2 breakthrough pressure; Intrinsic permeability; Capillary pressure curve
published: 2025-05-05
 
The dataset includes responses from approximately 550 participants to survey questions about trust in images labeled with AI-related tags, compared to other images found online. The questions also explore how the type of label influences their trust.
keywords: Artificial intelligence (AI); Trust in AI; AI labeling; AI ethics
published: 2025-06-05
 
There are two files in this dataset.

File 1: AffiNorm
AffiNorm contains 1,001 rows, including one header row, randomly sampled from the MapAffil 2018 Dataset (https://doi.org/10.13012/B2IDB-2556310_V1, https://databank.illinois.edu/datasets/IDB-2556310). Each row in the file corresponds to a particular author on a particular PubMed record, and contains the following 26 columns, comma-delimited. All columns are ASCII, except city which contains Latin-1.

COLUMN DESCRIPTION
1. PMID: the PubMed identifier. int.
2. ORDER: the position of the author. int.
3. YEAR: the year of publication. int(4), e.g.: 1975.
4. affiliation: affiliation string of the author. e.g.: Department of Pathology, University of Chicago, Illinois 60637.
5. annotation_type: the number of institutions annotated, denoted by S, M, O, or Z, where "S" (single) indicates 1 institution was annotated; "M" (multiple) indicates more than one institution was annotated; "O" (out of vocabulary or none) indicates no institution was annotated, but an institution was apparently mentioned; "Z" indicates no institution was mentioned.
6. Institution: the standard name(s) of the annotated institution(s), according to ROR. If "S" (single institution), it is saved as a string, e.g.: University of Chicago; if "M", it is saved as a string that looks like a python list, e.g.: ['Public Health Laboratory Service'; 'Centre for Applied Microbiology and Research']; if "O" or "Z", then blank.
7. inst_type: the type of institution, according to ROR. The potential values are: education, funder, healthcare, company, archive, nonprofit, government, facility, other. An institution may have more than one type, e.g.: ['Education', 'Funder']
8. type_edu: TRUE if the inst_type contains "Education"; FALSE otherwise.
9. RORid: ROR identifier(s), e.g.: https://ror.org/05hs6h993. When multiple, the order corresponds to institution (column 6).
10. RORid_label: the standard name(s) of the annotated institution(s) according to ROR. Same as institution (column 6).
11. GRIDid: GRID identifier(s). e.g.: grid.170205.1
12. GRIDid_label: the standard name(s) of the annotated institution(s) according to GRID. e.g.: University of Chicago.
13. WikiDataid: WikiData identifier(s). e.g.: Q131252
14. WikiDataid_label: the standard name(s) of the annotated institution(s) according to WikiData. e.g.: University of Chicago
15. synonyms: a comma-separated list of variant names from InsVar (file 2), formatted as a string. e.g.: University of Chicago, Chicago University, U of C, UChicago, uchicago.edu, U Chicago, ...
16. MapAffil-grid: GRID from the MapAffil 2018 Dataset.
17. MapAffil-grid_label: The standard name of institution from MapAffil 2018 Dataset.
18. judge_mapA: TRUE if GRIDid (column 11) contains MapAffil-grid (column 16); FALSE otherwise.
19. MapAffiltemporal-grid: GRID from the temporal version of MapAffil, http://abel.ischool.illinois.edu/data/MapAffilTempo2018.tsv.gz
20. MapAffiltemporal-grid_label: The standard name of institution from MapAffilTemporal 2018 Dataset.
21. judge_mapT: TRUE if GRIDid (column 11) contains MapAffiltemporal-grid (column 19); FALSE otherwise.
22. RORapi_query_id: ROR from ROR api tool (query endpoint)
23. RORapi_query_id_label: The standard name of institution from ROR api tool (query endpoint), formatted as a string.
24. judge_rorapi_affiliation: TRUE if RORid (column 9) contains RORapi_query_id (column 22); FALSE otherwise.
25. rorapi_affiliation_id: ROR from ROR api tool (affiliation endpoint).
26. judge_rorapi_affiliation: TRUE if RORid (column 9) contains RORapi_affiliation (column 25); FALSE otherwise.

File 2: insVar.json
InsVar is a supplementary dataset for AffiNorm, which includes the institution ID and its redirected aliases from wikidata. The institution ID list is from GRID; the redirected aliases are from the wiki api, for example: https://en.wikipedia.org/wiki/Special:WhatLinksHere?target=University+of+Illinois+Urbana-Champaign&namespace=&hidetrans=1&hidelinks=1&limit=100
In InsVar, the data is saved in a python dictionary format. The key is the GRID identifier, for example: "grid.1001.0" (Australian National University), and the value is a list of redirected alias strings.
{"grid.1001.0": ["ANU", "ANU College", "ANU College of Arts and Social Sciences", "ANU College of Asia and the Pacific", "ANU Union", "ANUSA", "Asia Pacific Week",    "Australia National University", "Australian Forestry School", "the Australian National University", ...], "grid.1002.3": ...}
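Because InsVar maps each GRID identifier to a list of aliases, looking up an institution by a variant name needs the mapping inverted. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the lowercase matching is a choice made for this example rather than something the dataset prescribes, and the sample is a three-alias excerpt of the entry shown above. Whether the real insVar.json parses with `json.load` as-is has not been checked here.

```python
import json
from collections import defaultdict


def build_alias_index(ins_var):
    """Invert {grid_id: [alias, ...]} into {normalized alias: {grid_id, ...}}.

    Aliases are stripped and case-folded so that "ANU" and "anu" match;
    that normalization is an assumption made for this sketch.
    """
    index = defaultdict(set)
    for grid_id, aliases in ins_var.items():
        for alias in aliases:
            index[alias.strip().casefold()].add(grid_id)
    return index


# Tiny excerpt of the example entry from the description.
sample = json.loads(
    '{"grid.1001.0": ["ANU", "Australia National University", '
    '"the Australian National University"]}'
)
index = build_alias_index(sample)
print(sorted(index["anu"]))
```

An alias can map to several GRID identifiers, which is why the index stores sets rather than single values.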
keywords: PubMed; MEDLINE; Digital Libraries; Bibliographic Databases; Institution Names; Author Affiliations; Institution Name Ambiguity; Authority files
published: 2021-08-27
 
The dataset shows all poison frogs (superfamily Dendrobatoidea) in private U.S. collections during 1990–2020. For each species and color morph, there is a date of arrival, the way it arrived in U.S. collections, and detailed notes related to its presence in the pet trade.
keywords: pet trade; amphibians; Dendrobatidae
published: 2025-06-06
 
The materials used to provide Continuing Medical Education on ticks and tick-borne diseases in Illinois on February 1, 2023 at Carle Hospital, along with the pre- and post-quiz and deidentified data of the quiz takers. Files: "Ticks and Tick-borne Diseases of Illinois_Final_w_speaker_notes.pptx": Presentation slides used for CME course, with notes to indicate verbal commentary "CME assessment_final.docx": Pre- and post-CME quiz questions and answers, annotated to indicate correct answers and reasoning for incorrect answers "CME_prequiz_data_for_sharing.csv": De-identified data from pre-CME quiz "CME_postquiz_data_for_sharing.csv": De-identified data from post-CME quiz, including demographics "DataCleaning_forSharing.R": R file used to clean the raw data and calculate the scores "ReadMe.txt":
keywords: tick-borne disease; CME
published: 2025-06-03
 
GIS data and geoprocessing tools associated with the White and Lambert (2025) modeling paper that assesses the potential impact of development on the archaeological resources of Illinois.
keywords: development; archaeology; climate change; GIS
published: 2025-06-04
 
These datasets contain the complete output from a Monte Carlo simulation of the number of wild cervids to test for chronic wasting disease (CWD) depending on true prevalence. Five CSVs of the simulation results are provided, split due to limitations in file size. The R code used to run the simulation and process the data is included. The data to replicate Table 1 and the data used to compare the simulation results to the CWD surveillance efforts of the Illinois Department of Natural Resources (IDNR) are also provided.
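For readers unfamiliar with this kind of simulation, the general idea can be illustrated with a generic Python sketch. It is not the authors' R code and makes simplifying assumptions that the dataset may not share: a perfect diagnostic test, independent sampling from a very large population, and "success" defined as finding at least one positive animal. All function names and parameter values are hypothetical.

```python
import random


def detection_probability(sample_size, prevalence, trials=2000, seed=0):
    """Monte Carlo estimate of the chance that a sample of `sample_size`
    animals contains at least one positive at the given true prevalence."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(trials):
        # Each sampled animal is positive with probability `prevalence`.
        if any(rng.random() < prevalence for _ in range(sample_size)):
            detected += 1
    return detected / trials


def analytic_detection_probability(sample_size, prevalence):
    """Closed-form counterpart under the same assumptions: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - prevalence) ** sample_size


estimate = detection_probability(100, 0.01)
exact = analytic_detection_probability(100, 0.01)
print(f"simulated={estimate:.3f} analytic={exact:.3f}")
```

Comparing the simulated value with the closed form is a quick sanity check before varying the sample size or prevalence.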
keywords: chronic wasting disease; cwd; cervid; test; sample size; diagnostic testing; surveillance
published: 2025-06-03
 
These are peptide imaging datasets obtained by matrix-assisted laser desorption/ionization trapped ion mobility (MALDI-TIMS) from the central nervous system and select ganglia of Aplysia californica.
keywords: Neuropeptides, Isomerization, D-amino acids, MALDI-TIMS
published: 2019-06-13
 
This lexicon is the expanded/enhanced version of the Moral Foundation Dictionary created by Graham and colleagues (Graham et al., 2013). Our Enhanced Morality Lexicon (EML) contains a list of 4,636 morality-related words. This lexicon was used in the following paper - please cite this paper if you use this resource in your work. Rezapour, R., Shah, S., & Diesner, J. (2019). Enhancing the measurement of social effects by capturing morality. Proceedings of the 10th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA). Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, MN. In addition, please consider citing the original MFD paper: Graham, J., Haidt, J., Koleva, S., Motyl, M., Iyer, R., Wojcik, S. P., & Ditto, P. H. (2013). Moral foundations theory: The pragmatic validity of moral pluralism. In Advances in experimental social psychology (Vol. 47, pp. 55-130). https://doi.org/10.1016/B978-0-12-407236-7.00002-4
keywords: lexicon; morality
published: 2025-02-14
 
This dataset includes the original data (including photographs as .jpg files and sound recordings as .wav files) and detailed descriptions of workflows for analyses of acoustic and morphometric data for the Neoaliturus tenellus (beet leafhopper) species complex. Files needed for different parts of the two analytical workflows are included in the "Acoustics.zip" and "PCA.zip" archives. The "Folder Structure.png" file contains a diagram of the folder structure of the two archives. Each archive contains a "ReadMe" file with instructions for repeating the analyses. File and folder names including the two-letter abbreviations TB, TD, TN and TP refer to four different putative species (operational taxonomic units, or OTUs) of the Neoaliturus tenellus complex.
keywords: Hemiptera; Cicadellidae; integrative taxonomy; courtship; morphology
published: 2025-04-23
 
These data files were used for phylogenomic analyses of Darnini and related Membracidae (Hemiptera: Auchenorrhyncha) in the referenced article by Gonzalez-Mozo et al.
- The "mem_50p_alignment.fas" file contains the aligned, concatenated nucleotide sequence data for 51 species and 492 genetic loci included in the phylogenetic analyses ("N" indicates missing data and "-" indicates an alignment gap).
- The file "Table1.rtf" lists the included species, country of origin and GenBank accession number. Species newly sequenced for this study have a Sample ID with prefix "DAR"; previously sequenced species for which data were downloaded from GenBank have "NCBI" indicated in the same column of the table.
- The file "partition_def.txt" lists the 492 genetic loci included in the alignment with their exact positions indicated by the range of numbers given at the end of each line (e.g., locus "uce-1" occupies positions 1-280 in the alignment).
- The substitution model file "mem_50p.model" contains information on the substitution models used in the partitioned maximum likelihood analysis, including the models used for different data partitions and parameter values, as output by the phylogenetic software IQ-TREE.
- Individual tree files in Newick format (plain text) are provided for the phylogeny from concatenated analysis with the best likelihood score ("mem_50p_bestLikelihoodScore"), concatenated likelihood analysis with gene concordance factors ("mem_50p_gcf") and site concordance factors ("mem_50p_scf").
- The tree file from the ASTRAL analysis is "mem_50p_astral".
- The zip archive entitled “IQ-TREE analysis results.zip” includes output from the maximum likelihood analysis of the concatenated nucleotide sequence data, including the following: (1) main output file “mem_50p.iqtree” summarizing model selection, partitioning schemes, likelihood scores, and run parameters; (2) “mem_50p.mldist” including pairwise ML distances between taxa; (3) “mem_50p.best_scheme.nex” with the best partitioning scheme identified by ModelFinder in NEXUS format and (4) “mem_50p.best_scheme”, the RAxML-compatible version of the same file.
- The “Ultrafast bootstrap results.zip” zip archive contains: (1) “mem_50p.ufboot” with the bootstrap replicate trees; (2) “mem_50p.contree” with the majority-rule consensus tree with support values; (3) “mem_50p.splits.nex”, with split support values across the replicates; (4) “mem_50p.log”, the log file.
- The “gene_trees.zip” zip archive contains the individual gene trees as input for subsequent coalescent gene tree analysis in the phylogenetic program ASTRAL.
- The file "DarniniAHE_Character Matrix.csv" contains the data for 6 morphological characters for which the ancestral states were reconstructed using the phylogenetic results from analysis of anchored-hybrid data (see article text for details).
- The file "scriptACRDarnini.txt" contains the commands used to reconstruct ancestral morphological character states using the corHMM 2.8 R package. See the Methods section of the article for more details.
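The partition file described above pairs each locus with a position range in the concatenated alignment, so individual loci can be sliced back out of an aligned sequence. The sketch below assumes a line shape such as `uce-1 = 1-280`, optionally preceded by a data-type prefix like `DNA, `; the exact layout of "partition_def.txt" has not been verified, and both function names are hypothetical.

```python
import re

# Assumed line shape: optional "DNA, " style prefix, locus name, "=", "start-end".
_PARTITION_RE = re.compile(
    r"^\s*(?:\w+\s*,\s*)?(?P<locus>[^\s=]+)\s*=\s*"
    r"(?P<start>\d+)\s*-\s*(?P<end>\d+)\s*;?\s*$"
)


def parse_partition_line(line):
    """Return (locus, start, end) with 1-based inclusive positions,
    or None if the line does not match the assumed shape."""
    match = _PARTITION_RE.match(line)
    if match is None:
        return None
    return match["locus"], int(match["start"]), int(match["end"])


def extract_locus(sequence, start, end):
    """Slice one locus out of an aligned sequence (1-based, inclusive)."""
    return sequence[start - 1:end]


locus, start, end = parse_partition_line("uce-1 = 1-280")
print(locus, end - start + 1)
```

Lines that do not fit the assumed shape return None rather than raising, so a differently formatted file fails visibly at the first line instead of producing wrong coordinates.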
keywords: Insecta; Hemiptera; anchored-hybrid enrichment; phylogeny; treehopper
published: 2025-06-03
 
This dataset comprises image files used in the analysis for "Analysis of Nematode Ventral Nerve Cords Suggests Multiple Instances of Evolutionary Addition and Loss of Neurons" by Han et al. (bioRxiv, 2025; doi: https://doi.org/10.1101/2025.03.20.644414). It is separated into two folders. The first comprises data using DAPI staining to quantify the number of VNC nuclei in diverse nematodes. The second includes dye-filling data of Mononchus aquaticus.
keywords: C. elegans; Mononchus; neuroanatomy; nematode nervous system; ventral nerve cord; secondary simplification
published: 2025-03-13
 
ALMA Band 4 and 7 observations of the dust continuum in the Class 0 protostellar system L1448 IRS3B. We include the selfcal script, imaging scripts, FITS files, and the Python scripts for the figures in the paper.
keywords: ALMA; Band 4; Band 6; polarization; L1448 IRS3B
published: 2025-05-28
 
This dataset captures 'Hype' and 'Diversity', including article-level (pmid) and author-level (auid) data within biomedical abstracts sourced from PubMed. The selection is 'journal articles' written in English, published between 1991 and 2014, totaling 421,580 (merged_df). The classification of hype relies on the presence of specific candidate 'hype words' and their abstract location. Therefore, each article (PMID) might have multiple instances in the dataset due to the presence of multiple hype words in different abstract sentences. Diversity is classified for ethnicity, gender, academic age, and topical expertise for authors based on the Rao-Stirling diversity index.

File1: merged_auids.csv (Important columns defined)
• AUID: a unique ID for each author
• Genni: gender prediction
• Ethnea: ethnicity prediction

File2: merged_df.csv (Important columns defined)
- pmid: unique paper
- auid: all unique auids (author-name unique identification)
- year: Year of paper publication
- no_authors: Author count
- journal: Journal name
- years: first year of publication for every author
- Country-temporal: Country of affiliation for every author
- h_index: Journal h-index
- TimeNovelty: Paper Time novelty
- nih_funded: Binary variable indicating funding for any author
- prior_cites_mean: Mean of all authors' prior citation rate
- insti_impact: All unique institutions' citation rate
- mesh_vals: Top MeSH values for every author of that paper
- hype_word: Candidate hype word, such as 'novel'
- hype_value: Propensity of hype based on the hype word, the sentence, and the abstract location
- hype_percentile: Abstract relative position of hype word
- relative_citation_ratio: RCR
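The diversity measure mentioned in the description is commonly written as D = Σ_{i≠j} p_i · p_j · d_ij, where p_i is the share of category i and d_ij is a dissimilarity between categories. The Python sketch below shows that general form only. How the dataset's authors defined categories and distances is not stated here, so the author labels, the all-or-nothing distance, and the function names are assumptions. Some formulations sum over unordered pairs instead, which halves the value.

```python
from collections import Counter


def proportions_from_labels(labels):
    """Turn a list of category labels into {category: share}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def rao_stirling(proportions, distance):
    """Sum p_i * p_j * d_ij over all ordered pairs of distinct categories."""
    categories = list(proportions)
    return sum(
        proportions[i] * proportions[j] * distance(i, j)
        for i in categories
        for j in categories
        if i != j
    )


def unit_distance(a, b):
    """Assumed dissimilarity: every pair of distinct categories is maximally far apart."""
    return 1.0


# Hypothetical author labels for one paper.
shares = proportions_from_labels(["F", "M", "M", "F"])
print(rao_stirling(shares, unit_distance))
```

With a unit distance the index reduces to 1 minus the sum of squared shares, which is a convenient check on any implementation.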
keywords: Hype; Diversity; PubMed; Abstracts; Scientometrics; Biomedicine
published: 2025-05-21
 
SUMMARY
This dataset contains derivative data from concurrent fMRI and scalp EEG recordings used in: Mostame Parham, Wirsich Jonathan, Alderson Thomas H, Ridley Ben, Giraud Anne-Lise, Carmichael David W, Vulliemoz Serge, Guye Maxime, Lemieux Louis, Sadaghiani Sepideh (2024) A multiplex of connectome trajectories enables several connectivity patterns in parallel. eLife 13:RP98777. doi: https://doi.org/10.7554/eLife.98777.3

RAW DATA
The data were originally published and described as part of other studies (Morillon et al., 2010; Sadaghiani et al., 2012). Briefly, 10 minutes of eyes-closed resting state were analyzed from 26 healthy subjects (average age = 24.39 years; range: 18-31 years; 8 females) with no history of psychiatric or neurological disorders. Informed consent was given by each participant and the study was approved by the local Research Ethics Committee (CPP Ile de France III). FMRI was acquired using a 3T Siemens Tim Trio scanner with a GE-EPI pulse sequence (TR = 2 s; TE = 50 ms; 40 slices; 300 volumes; field of view: 192×192; voxel size: 3×3×3 mm3). Structural T1-weighted scans were acquired using the MPRAGE pulse sequence (176 slices; field of view: 256×256; voxel size: 1×1×1 mm3). 62-channel scalp EEG (Easycap, with an additional EOG and an ECG channel) was recorded using an MR-compatible amplifier (BrainAmp MR, Brain Products) at a 5 kHz sampling rate.

PREPROCESSING
fMRI and EEG data were preprocessed with standard preprocessing steps as explained in detail elsewhere (Wirsich et al., 2020). In brief, fMRI underwent standard slice-time correction and spatial realignment (SPM12, http://www.fil.ion.ucl.ac.uk/spm/software/spm12). Structural T1-weighted images were processed using Freesurfer (recon-all, v6.0.0, https://surfer.nmr.mgh.harvard.edu/) in order to perform non-uniformity and intensity correction, skull stripping and gray/white matter segmentation. The cortex was parcellated into 68 regions of the Desikan-Killiany atlas (Desikan et al., 2006). This atlas was chosen because, as an anatomical parcellation, it avoids biases towards one or the other functional data modality. The T1 images of each subject and the Desikan-Killiany atlas were co-registered to the fMRI images (FSL-FLIRT 6.0.2, https://fsl.fmrib.ox.ac.uk/fsl/fslwiki). We extracted signals of no interest such as the average signals of cerebrospinal fluid (CSF) and white matter from manually defined regions of interest (ROI, 5 mm sphere, Marsbar Toolbox 0.44, http://marsbar.sourceforge.net) and regressed them out of the BOLD timeseries along with 6 rotation and translation motion parameters and the global gray matter signal (Wirsich et al., 2017a). Then we bandpass-filtered the timeseries at 0.009–0.08 Hz. The average timeseries of each region was then used to calculate connectivity. EEG underwent gradient and cardio-ballistic artifact removal using Brain Vision Analyzer software (Allen et al., 1998, 2000) and was down-sampled to 250 Hz. EEG was projected into source space using the Tikhonov-regularized minimum norm in Brainstorm software (Baillet et al., 2001; Tadel et al., 2011). Source activity was then averaged to the 68 regions of the Desikan-Killiany atlas. Band-limited EEG signals in each canonical frequency band and every atlas region were then used to calculate frequency-specific connectome dynamics. Note that the MEG-ROI-nets toolbox in the OHBA Software Library (OSL; https://ohba-analysis.github.io/osl-docs/) was used to minimize source leakage in the band-limited source-localized EEG data (Colclough et al., 2015).

FOLDER STRUCTURE
The dataset includes five separate folders as described below:
1) EEGfMRI_dFC folder: connectome dynamics of scalp data
This folder contains 26 MATLAB (.mat) files, one per subject. Inside each `.mat` is a structure with fields `A`, `B`, and `C`, corresponding to fMRI, amplitude-coupling, and phase-coupling connectome dynamics, respectively. The fMRI data are 3-dimensional (ROI × ROI × timepoints). The EEG data are stored in a 1×5 cell array (Delta, Theta, Alpha, Beta, Gamma), each cell containing a 3-D ROI × ROI × timepoints matrix.
2) EEGfMRI_dFC_SourceOrtho folder: connectome dynamics of source-orthogonalized scalp data
Same format as above, except that EEG connectome dynamics are derived from source-orthogonalized signals. The MEG-ROI-nets toolbox in the OHBA Software Library (OSL; https://ohba-analysis.github.io/osl-docs/) was used to minimize source leakage in the band-limited, source-localized EEG data (Colclough et al., 2015).
3-5) Cross-modal Recurrence Plot (CRP) data
Each subject has an Excel file with five sheets (Delta through Gamma), corresponding to the five frequency bands. Each sheet contains a 2-D CRP matrix (rows = fMRI timepoints, columns = band-limited EEG timepoints).
- Scalp EEG–fMRI CRPs (CRP_EEGfMRI and CRP_EEGfMRI_SourceOrtho folders): two versions (with and without source-orthogonalization), each with 52 Excel files, including amplitude- and phase-coupling CRPs.
- Intracranial EEG–fMRI CRPs (CRP_iEEGfMRI folder): one version, 27 Excel files, containing three cases: amplitude coupling, HRF-convolved amplitude coupling, and phase coupling.
keywords: Connectome; fMRI-EEG; Intracranial; Multiplex
published: 2018-04-19
 
MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. Prepared by Vetle Torvik 2018-04-05

The dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.

• How was the dataset created? The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016. See https://www.nlm.nih.gov/databases/download/pubmed_medline.html for information on obtaining PubMed/MEDLINE and for NLM's data Terms and Conditions.
• Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only. However, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).
• Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.
• All affiliation strings were processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in: Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p
• Look for Fig. 4 in the following article for coverage statistics over time: Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 2017 Dec;45(1):33. https://doi.org/10.1186/s41182-017-0073-6 Expect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.
• The code and back-end data is periodically updated and made available for query by PMID at the Torvik Research Group site (http://abel.ischool.illinois.edu/).
• What is the format of the dataset? The dataset contains 37,406,692 rows. Each row (line) in the file has a unique PMID and author position (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited. All columns are ASCII, except city which contains Latin-1.
1. PMID: positive non-zero integer; int(10) unsigned
2. au_order: positive non-zero integer; smallint(4)
3. lastname: varchar(80)
4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed
5. year of publication:
6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK
7. city: varchar(200); typically 'city, state, country' but could include further subdivisions; unresolved ambiguities are concatenated by '|'
8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)
9. country
10. journal
11. lat: at most 3 decimals (only available when city is not a country or state)
12. lon: at most 3 decimals (only available when city is not a country or state)
13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find
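A file laid out this way can be streamed row by row with Python's standard `csv` module. The sketch below is illustrative only: it assumes the file has no header row, the dictionary keys are shorthand chosen for this example (the description calls column 5 "year of publication"), and the function names are hypothetical. It splits unresolved city ambiguities on '|' and converts coordinates to floats where present.

```python
import csv

# Shorthand keys for the thirteen columns listed in the description.
COLUMNS = [
    "PMID", "au_order", "lastname", "firstname", "year", "type", "city",
    "state", "country", "journal", "lat", "lon", "fips",
]


def parse_mapaffil_lines(lines):
    """Yield one dict per tab-delimited line (assumes no header row)."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        row = dict(zip(COLUMNS, fields))
        city = row.get("city", "")
        # Unresolved place-name ambiguities are concatenated with '|'.
        row["city_candidates"] = city.split("|") if city else []
        for key in ("lat", "lon"):
            value = row.get(key, "")
            row[key] = float(value) if value else None
        yield row


def read_mapaffil(path):
    """Stream rows from a Latin-1 encoded file without loading it all."""
    with open(path, encoding="latin-1", newline="") as handle:
        yield from parse_mapaffil_lines(handle)
```

Streaming matters here because the description puts the uncompressed file at about 3.5GB; `read_mapaffil` yields rows lazily, so a filter such as "rows with more than one city candidate" can run in constant memory.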
keywords: PubMed, MEDLINE, Digital Libraries, Bibliographic Databases; Author Affiliations; Geographic Indexing; Place Name Ambiguity; Geoparsing; Geocoding; Toponym Extraction; Toponym Resolution