Illinois Data Bank Dataset Search Results
Results
published:
2022-10-22
Madhavan, Vidya; Aishwarya, Anuva
(2022)
This dataset consists of all the files that are part of the manuscript titled "Evidence for a robust sign-changing s-wave order parameter in monolayer films of superconducting Fe(Se,Te)/Bi2Te3". For detailed information on the individual files refer to the readme file.
keywords:
thin film; mbe; topology; superconductivity; topological insulator; stm; spectroscopy; qpi
published:
2023-10-16
Rasoarimanana, Tantely; Edmonds, Devin; Marquis, Olivier
(2023)
This dataset provides microhabitat and environmental variables collected in the habitat of the poison frog Mantella baroni from 155 1-meter square quadrats in Vohimana Reserve along forest valleys, on slopes, and on ridgelines. We also provide data from photographic capture-recapture surveys used for estimating abundance.
keywords:
occupancy; abundance; amphibian; Madagascar; microhabitat; capture-recapture
published:
2025-08-28
Purba, Denissa Sari Darmawi; Pei, Xingrui; Kontou, Eleftheria
(2025)
This dataset contains both processed and raw data that were used for the analysis presented fully in the report "Community Vulnerability Assessment for Electric Vehicle Travelers Responsive to Extreme Flooding" and partially in the paper under review, "Vulnerability Assessment of Electric Vehicles and their Charging Station Network during Evacuations".
keywords:
electric vehicles; vulnerability assessment; flooding events; evacuation; charging infrastructure
published:
2024-08-29
Li, Shuai; Montes, Christopher; Aspray, Elise; Ainsworth, Elizabeth
(2024)
Over the past 15 years, soybean seed yield response to season-long elevated O3 concentrations [O3] and to year-to-year weather conditions was studied using free-air O3 concentration enrichment (O3-FACE) in the field at the SoyFACE facility in Central Illinois. Elevated [O3] significantly reduced seed yield across cultivars and years. However, our results quantitatively demonstrate that weather conditions, including soil water availability and air temperature, did not alter yield sensitivity to elevated [O3] in soybean.
keywords:
drought; elevated O3; heat; O3-FACE; soybean; yield
published:
2021-06-08
Jones, Todd; Ward, Michael
(2021)
Dataset associated with Jones and Ward JAE-2020-0031.R1 submission: Pre- to post-fledging carryover effects and the adaptive significance of variation in wing development for juvenile songbirds. Excel CSV files with data used in analyses and a file with descriptions of each column. The flight ability variable in this dataset was derived from fledgling drop tests, examples of which can be found in the related dataset: Jones, Todd M.; Benson, Thomas J.; Ward, Michael P. (2019): Flight Ability of Juvenile Songbirds at Fledgling: Examples of Fledgling Drop Tests. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2044905_V1.
keywords:
fledgling; wing development; life history; adaptive significance; post-fledging; songbirds
published:
2021-10-15
Perez, Sierra; Dalling, James; Fraterrigo, Jennifer
(2021)
Information on the location, dimensions, time of treefall or death, decay state, wood nutrient, wood pH and wood density data, and soil moisture, slope, distance from forest edge and soil nutrient data associated with the publication "Interspecific wood trait variation predicts decreased carbon residence time in changing forests" authored by Sierra Perez, Jennifer Fraterrigo, and James Dalling.
<b>Note:</b> Blank cells indicate that no data were collected.
keywords:
wood decay; carbon residence time; coarse woody debris; decomposition; temperate forests
published:
2022-12-05
Ng, Yee Man Margaret; Taneja, Harsh
(2022)
These are similarity matrices of countries based on different modalities of web use: Alexa website traffic, trending videos on YouTube, and Twitter trends. Each matrix covers one month of aggregated data.
keywords:
Global Internet Use
published:
2017-03-02
This data was collected between 2004 and 2010 at White River National Wildlife Refuge (WRNWR) and Saint Francis National Forest (SF). It was collected as part of two master’s and one PhD project at Arkansas State University USA studying Swainson’s Warbler habitat use, survival, and body condition.
keywords:
Swainson’s Warbler; Limnothlypis swainsonii; flooding; natural disturbance; apparent survival; body condition
published:
2023-05-08
Dataset for Food availability influences angling vulnerability in muskellunge
published:
2024-05-13
Gopalakrishnappa, Chandana; Li, Zeqian; Kuehn, Seppe
(2024)
Supplemental data for the paper titled 'Environmental modulators of algae-bacteria interactions at scale'. Each of the Excel workbooks corresponding to datasets 1, 2, and 3 contains a README sheet explaining the reported data. Dataset 4, comprising microscopy data, contains a README text file describing the image files.
keywords:
Algae-bacteria interactions; high-throughput; microfluidic-droplet platform
published:
2024-07-29
Caetano Machado Lopes, Lorran; Chacko, George
(2024)
This dataset consists of a citation graph. It was constructed by downloading and parsing the Works section of the Open Alex catalog of the global research system. Open Alex (see citation below) contains detailed information about scholarly research, including articles, authors, journals, institutions, and their relationships. The data were downloaded on 2024-07-15.
The dataset comprises two compressed (.xz) files.
1) filename: openalexID_integer_id_hasDOI.parquet.xz. The tabular data within contains three columns: openalex_id, integer_id, and hasDOI. Each row represents a record with the following data types:
• openalex_id: A unique identifier from the Open Alex catalog.
• integer_id: An integer representing the new identifier (assigned by the authors)
• hasDOI: An integer (0 or 1) indicating whether the record has a DOI (0 for no, 1 for yes).
2) filename: citation_table.tsv.xz
This edgelist of citations has two columns (no header) of integer values that represent citing and cited integer_id, respectively.
Summary Features
• Total Nodes (Documents): 256,997,006
• Total Edges (citations): 2,148,871,058
• Documents with DOIs: 163,495,446
• Edges between documents with DOIs: 1,936,722,541 [corrected to 2,148,788,148 edges Nov 13, 2025]
• Count of unique nodes in edgelist: 111,453,719 [updated Nov 13, 2025]
Note: Nov 13, 2025. An improved curation process will be applied to a future version of this dataset.
Note: Nov 13, 2025. The code used to generate these files can be found here: https://github.com/illinois-or-research-analytics/lorran_openalex/
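The two-column, header-less edgelist described above can be streamed directly from the .xz archive. The following is a minimal sketch using only the Python standard library, not the authors' own pipeline; the function names and the sample values are illustrative assumptions.

```python
import csv
import io
import lzma


def summarize_edgelist(lines):
    """Count edges and unique node IDs in a two-column, header-less TSV edgelist."""
    nodes = set()
    n_edges = 0
    for citing, cited in csv.reader(lines, delimiter="\t"):
        # Column 1 is the citing integer_id, column 2 the cited integer_id.
        nodes.add(int(citing))
        nodes.add(int(cited))
        n_edges += 1
    return n_edges, len(nodes)


def summarize_xz(path):
    """Stream a .tsv.xz file in text mode without decompressing it to disk."""
    with lzma.open(path, mode="rt", newline="") as handle:
        return summarize_edgelist(handle)


# Tiny in-memory stand-in for citation_table.tsv (values are made up).
sample = io.StringIO("1\t2\n1\t3\n4\t2\n")
edges, unique_nodes = summarize_edgelist(sample)
print(edges, unique_nodes)
```

On the full file, holding every node ID in a Python set is likely to need several gigabytes of memory, so a disk-backed or array-based approach may be preferable at that scale.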
keywords:
citation networks; Open Alex
published:
2019-09-17
Mishra, Shubhanshu
(2019)
Trained models for multi-task multi-dataset learning for sequence tagging in tweets.
Sequence tagging tasks include POS, NER, Chunking, and SuperSenseTagging.
Models were trained using: https://github.com/socialmediaie/SocialMediaIE/blob/master/SocialMediaIE/scripts/multitask_multidataset_experiment.py
See https://github.com/socialmediaie/SocialMediaIE and https://socialmediaie.github.io for details.
If you are using this data, please also cite the related article:
Shubhanshu Mishra. 2019. Multi-dataset-multi-task Neural Sequence Tagging for Information Extraction from Tweets. In Proceedings of the 30th ACM Conference on Hypertext and Social Media (HT '19). ACM, New York, NY, USA, 283-284. DOI: https://doi.org/10.1145/3342220.3344929
keywords:
twitter; deep learning; machine learning; trained models; multi-task learning; multi-dataset learning
published:
2022-11-07
Jones, Todd; Di Giovanni, Alexander; Hauber, Mark; Ward, Michael
(2022)
Dataset associated with Jones et al. ECY22-0118.R3 submission: Ontogenetic effects of brood parasitism by the Brown-headed Cowbird on host offspring. Excel CSV files with all of the data used in analyses and file with descriptions of each column.
keywords:
brood parasitism; cowbirds; host-parasite systems; ontogeny; post-fledging; songbirds
published:
2022-12-31
Maffeo, Christopher; Wilson, Jim; Quednau, Lauren; Aksimentiev, Aleksei
(2022)
Trajectory data for Nature Nanotechnology manuscript "DNA double helix, a tiny electromotor" that demonstrates how an electric field applied along the helical axis of a DNA or RNA molecule will generate an electroosmotic flow that causes the duplex to spin about that axis, much like a turbine.
keywords:
All-atom MD simulation; DNA; nanotechnology; motors and rotors
published:
2024-02-26
Harsh, Vipul; Zhou, Wenxuan; Ashok, Sachin; Mysore, Radhika Niranjan; Godfrey, Brighten; Banerjee, Sujata
(2024)
Traces created using DeathStarBench (https://github.com/delimitrou/DeathStarBench) benchmark of microservice applications with injected failures on containers. Failures consist of disk/CPU/memory failures.
keywords:
Murphy; Performance Diagnosis; Microservice; Failures
published:
2024-12-11
MMAudio pretrained models. These models can be used in the open-sourced codebase https://github.com/hkchengrex/MMAudio
<b>Note:</b> mmaudio_large_44k_v2.pth and Readme.txt are added to this V2. Other 4 files stay the same.
published:
2021-05-07
Prepared by Vetle Torvik 2021-05-07
The dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters).
• How was the dataset created?
The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018). Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only. However, MapAffil 2018 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g., 5833220). Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and HTML) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records. All affiliation strings were processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:
Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p
• Look for Fig. 4 in the following article for coverage statistics over time:
Palmblad, M., Torvik, V.I. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Trop Med Health 45, 33 (2017). https://doi.org/10.1186/s41182-017-0073-6
Expect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.
• The code and back-end data are periodically updated and made available for query by PMID at http://abel.ischool.illinois.edu/cgi-bin/mapaffil/search.py
• What is the format of the dataset?
The dataset contains 52,931,957 rows (plus a header row). Each row (line) in the file has a unique PMID and author order, and contains the following eighteen columns, tab-delimited. All columns are ASCII, except city which contains Latin-1.
1. PMID: positive non-zero integer; int(10) unsigned
2. au_order: positive non-zero integer; smallint(4)
3. lastname: varchar(80)
4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed
5. initial_2: middle name initial
6. orcid: From 2019 ORCID Public Data File https://orcid.org/ and from PubMed XML
7. year: year of the publication
8. journal: name of the journal in which the publication appeared
9. affiliation: author's affiliation
10. disciplines: extracted from departments, divisions, schools, laboratories, centers, etc. that occur on at least 100 unique affiliations across the dataset, some with standardization (e.g., 1770799), English translations (e.g., 2314876), or spelling corrections (e.g., 1291843)
11. grid: inferred using a high-recall technique focused on educational institutions (but, for experimental purposes, includes a few select hospitals, national institutes/centers, international companies, governmental agencies, and 200+ other IDs [RINGGOLD, Wikidata, ISNI, VIAF, http] for institutions not in GRID). Based on 2019 GRID version https://www.grid.ac/
12. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK
13. city: varchar(200); typically 'city, state, country' but could include further subdivisions; unresolved ambiguities are concatenated by '|'
14. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)
15. country
16. lat: at most 3 decimals (only available when city is not a country or state)
17. lon: at most 3 decimals (only available when city is not a country or state)
18. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find
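The column layout above can be read with the standard csv module. This is a minimal sketch, assuming the header row uses exactly the column names listed above (not confirmed by the description); the function names and the two sample rows are made up for illustration.

```python
import csv
import io

COLUMNS = [
    "PMID", "au_order", "lastname", "firstname", "initial_2", "orcid",
    "year", "journal", "affiliation", "disciplines", "grid", "type",
    "city", "state", "country", "lat", "lon", "fips",
]


def iter_rows(handle):
    """Yield one dict per (PMID, au_order) row of a tab-delimited file with a header."""
    # QUOTE_NONE keeps quotation marks inside affiliation strings literal.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Unresolved place-name ambiguities are concatenated by '|'.
        row["city_candidates"] = row["city"].split("|") if row["city"] else []
        # lat/lon are empty when the city is only a country or state.
        row["lat"] = float(row["lat"]) if row["lat"] else None
        row["lon"] = float(row["lon"]) if row["lon"] else None
        yield row


def open_mapaffil(path):
    """Open the real file: Latin-1 encoded, tab-delimited."""
    return open(path, encoding="latin-1", newline="")


# Made-up rows standing in for the real file.
fake_rows = [
    ["1", "1", "Doe", "Jane", "", "", "2018", "Example J",
     "Example University, Urbana, IL, USA", "", "", "EDU",
     "Urbana, IL, USA", "IL", "USA", "40.110", "-88.207", "17019"],
    ["2", "3", "Roe", "Sam", "", "", "2017", "Example J",
     "Springfield Clinic", "", "", "UNK",
     "Springfield, IL, USA|Springfield, MA, USA", "", "USA", "", "", ""],
]
text = "\n".join("\t".join(r) for r in [COLUMNS] + fake_rows) + "\n"
rows = list(iter_rows(io.StringIO(text)))
print(rows[0]["lat"], rows[1]["city_candidates"])
```

For the real file, `iter_rows(open_mapaffil(path))` would iterate lazily, which matters with roughly 53 million rows.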
keywords:
PubMed; MEDLINE; Digital Libraries; Bibliographic Databases; Author Affiliations; Geographic Indexing; Place Name Ambiguity; Geoparsing; Geocoding; Toponym Extraction; Toponym Resolution; institution name disambiguation
published:
2021-05-14
This is the complete dataset for the "Anomalous density fluctuations in a strange metal" Proceedings of the National Academy of Sciences publication (https://doi.org/10.1073/pnas.1721495115). This is an integration of the Zenodo dataset which includes raw M-EELS data.
<b>METHODOLOGICAL INFORMATION</b>
1. Description of methods used for collection/generation of data: Data were collected with an M-EELS instrument according to the data acquisition protocol described in the original PNAS publication and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026)
2. Methods for processing the data: Raw data were collected with a channeltron-based M-EELS apparatus described in the reference PNAS publication and analyzed according to the procedure outlined both in the PNAS paper and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026). The raw M-EELS spectra at each momentum have been subject to minor data processing involving:
(a) averaging of different acquisitions at the same conditions,
(b) energy binning,
(c) division by an effective Coulomb matrix element (which yields a structure factor S(q,\omega)),
(d) antisymmetrization (which yields the imaginary chi)
All these procedures are described in the PNAS paper.
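As an illustration of step (d), the sketch below antisymmetrizes a structure factor sampled on an energy grid that is symmetric about zero, returning S(q,\omega) - S(q,-\omega). The overall sign and prefactor that turn this difference into the imaginary chi are fixed in the PNAS paper and are not reproduced here; the function name, the symmetric-grid requirement, and the sample numbers are assumptions made for this example.

```python
def antisymmetrize(omega, s, tol=1e-9):
    """Return S(w) - S(-w) for a sorted energy grid that is symmetric about w = 0."""
    if len(omega) != len(s):
        raise ValueError("omega and s must have the same length")
    # On a symmetric grid, the reversed grid is the negated grid.
    if any(abs(w + v) > tol for w, v in zip(omega, reversed(omega))):
        raise ValueError("energy grid must be symmetric about zero")
    # Reversing s therefore gives S(-w) at each grid point.
    return [a - b for a, b in zip(s, reversed(s))]


# Made-up values, for illustration only.
omega = [-2.0, -1.0, 0.0, 1.0, 2.0]
s_qw = [0.1, 0.2, 0.5, 0.6, 0.3]
odd_part = antisymmetrize(omega, s_qw)
print(odd_part)
```

Measured spectra rarely sit on an exactly symmetric grid, so in practice the negative-energy side would typically be interpolated first.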
3. Instrument- or software-specific information needed to interpret the data: These data are simple .txt or .dat files which can be read with any standard data analysis software, notably Python notebooks, MATLAB, Origin, IgorPro, and others. We do not include scripts in order to provide maximum flexibility.
4. Relationship between files, if important: Raw data, structure factors, and imaginary chi are stored in separate folders.
<b>DATA-SPECIFIC INFORMATION</b>
There are 8 folders within the Data_public_deposition_v1.zip. Each folder contains the data needed to create the corresponding figure in the publication.
<b>1. Fig1:</b> This folder contains 21 DAT files needed to plot the theory data in panels C and D, following this naming convention:
[chiA]or[chiB]or[Pi]_q_number.dat
where chiA is the imaginary RPA charge susceptibility with a Coulomb interaction of electronically weakly coupled layers
chiB is the imaginary RPA charge susceptibility with the usual 4\pi e^2/q^2 Coulomb interaction.
Pi is the imaginary Lindhard polarizability.
q is momentum in reciprocal lattice units
Number is the numerical momentum value in reciprocal lattice units
<b>2. Fig2:</b> Files needed to plot Fig. 2 of the PNAS paper. Contains 3 folders as listed below. The files in this folder are named following this convention: Bi2212_295K_(1,-1)_50eV_161107_q_number_2.16_avg.dat,
295K is the sample temperature
(1,-1) is the momentum direction in reciprocal lattice units
50 eV is the incident e beam energy
161107 is the start date of the experiment in yymmdd format
Q is the momentum
Number is the momentum in reciprocal lattice units
2.16 is the energy range covered by the data in eV
Avg identifies averaged data
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
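The Fig2 naming convention can be unpacked with a regular expression. This is a minimal sketch: the pattern, the field names, and the momentum value 0.10 in the example file name are assumptions for illustration (the description only gives the placeholder "number").

```python
import re
from datetime import datetime

# Pattern inferred from the convention described above; an assumption, not an official spec.
PATTERN = re.compile(
    r"^(?P<sample>[^_]+)_"
    r"(?P<temperature_K>\d+)K_"
    r"\((?P<h>-?\d+),(?P<k>-?\d+)\)_"
    r"(?P<beam_energy_eV>\d+)eV_"
    r"(?P<start_date>\d{6})_"
    r"q_(?P<q_rlu>-?\d+(?:\.\d+)?)_"
    r"(?P<energy_range_eV>\d+(?:\.\d+)?)_avg\.dat$"
)


def parse_name(filename):
    """Split an averaged M-EELS file name into its labelled parts."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename}")
    f = match.groupdict()
    return {
        "sample": f["sample"],
        "temperature_K": int(f["temperature_K"]),
        "direction": (int(f["h"]), int(f["k"])),
        "beam_energy_eV": int(f["beam_energy_eV"]),
        # Start date is given in yymmdd format.
        "start_date": datetime.strptime(f["start_date"], "%y%m%d").date(),
        "q_rlu": float(f["q_rlu"]),
        "energy_range_eV": float(f["energy_range_eV"]),
    }


# Hypothetical file name: the momentum value 0.10 is invented.
info = parse_name("Bi2212_295K_(1,-1)_50eV_161107_q_0.10_2.16_avg.dat")
print(info)
```

Files in the other figure folders may carry extra prefixes (for example OP/OD), which this pattern does not attempt to handle.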
<b>3. Fig3:</b> Files needed to plot Fig. 3 of the PNAS paper. OP/OD prefix identifies optimally doped or overdoped sample data, respectively.
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
<b>4. Fig4:</b> Files needed to plot Fig. 4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted according to the fit procedure described in the manuscript and at all momenta.
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
<b>5. FigS1:</b> Files needed to plot Fig. S1 of the PNAS paper. There are 5 files in this folder. DAT files are M-EELS data following the prior naming convention, while the two .txt files are digitized data from N. Nücker, U. Eckern, J. Fink, and P. Müller, Long-Wavelength Collective Excitations of Charge Carriers in High-Tc Superconductors, Phys. Rev. B 44, 7155(R) (1991), and K. H. G. Schulte, The interplay of Spectroscopy and Correlated Materials, Ph.D. thesis, University of Groningen (2002).
<b>6. FigS2:</b> Files needed to plot Fig. S2 of the PNAS paper.
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
<b>7. FigS3:</b> Files needed to plot Fig. S3 of the PNAS paper. There are 2 files in this folder:
20K_phi_0_q_0.dat: M-EELS raw intensity at zero momentum transfer on Bi2212 at 20 K
295K_phi_0_q_0.dat: M-EELS raw intensity at zero momentum transfer on Bi2212 at 295 K
<b>8. FigS4:</b> Files needed to plot Fig. S4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted according to the fit procedure described in the manuscript and at all momenta.
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
keywords:
Momentum resolved electron energy loss spectroscopy (M-EELS); cuprates; plasmons; strange metal
published:
2022-09-29
Merrill, Loren; Jones, Todd; Brawn, Jeffrey; Ward, Michael
(2022)
Dataset associated with Merrill et al. ECE-2021-05-00793.R1 submission: Early life patterns of growth are linked to levels of phenotypic trait covariance and post-fledging mortality across avian species. Excel CSV files with all of the data used in analyses and file with descriptions of each column.
keywords:
canalization; developmental flexibility; early-life stress; nest predation; phenotypic correlation; trait covariance
published:
2025-02-14
Sinaiko, Guy; Dietrich, Christopher
(2025)
This dataset includes the original data (including photographs as .jpg files and sound recordings as .wav files) and detailed descriptions of workflows for analyses of acoustic and morphometric data for the Neoaliturus tenellus (beet leafhopper) species complex. Files needed for different parts of the two analytical workflows are included in the "Acoustics.zip" and "PCA.zip" archives. The "Folder Structure.png" file contains a diagram of the folder structure of the two archives. Each archive contains a "ReadMe" file with instructions for repeating the analyses. File and folder names including the two-letter abbreviations TB, TD, TN and TP refer to four different putative species (operational taxonomic units, or OTUs) of the Neoaliturus tenellus complex.
keywords:
Hemiptera; Cicadellidae; integrative taxonomy; courtship; morphology
published:
2025-10-14
Jagtap, Sujit Sadashiv; Deewan, Anshu; Liu, Jing-Jing; Walukiewicz, Hanna E.; Yun, Eun Ju; Jin, Yong-Su; Rao, Christopher V.
(2025)
Rhodosporidium toruloides is an oleaginous yeast capable of producing a variety of biofuels and bioproducts from diverse carbon sources. Despite numerous studies showing its promise as a platform microorganism, little is known about its metabolism and physiology. In this work, we investigated the central carbon metabolism in R. toruloides IFO0880 using transcriptomics and metabolomics during growth on glucose, xylose, acetate, or soybean oil. These substrates were chosen because they can be derived from plants. Significant changes in gene expression and metabolite concentrations were observed during growth on these four substrates. We mapped these changes onto the governing metabolic pathways to better understand how R. toruloides reprograms its metabolism to enable growth on these substrates. One notable finding concerns xylose metabolism, where poor expression of xylulokinase induces a bypass leading to arabitol production. Collectively, these results further our understanding of central carbon metabolism in R. toruloides during growth on different substrates. They may also help guide the metabolic engineering and development of better models of metabolism for R. toruloides.
keywords:
Conversion; Metabolomics; Transcriptomics
published:
2020-12-02
Yang, Pan; Cai, Ximing; Khanna, Madhu
(2020)
The dataset includes the survey results about farmers’ perceptions of marginal land availability and the likelihood of a land pixel being marginal based on a machine learning model trained from the survey.
The two spreadsheet files contain the farmer and farm characteristics (marginal_land_survey_data_shared.xlsx) and the existing land use of marginal lands (land_use_info_sharing.xlsx).
<b>Note:</b> the blank cells in these two spreadsheets mean missing values in the survey response.
The GeoTiff file includes two bands: one gives the marginal land likelihood in the Midwestern states (0-1), and the other gives the dominant reason for land marginality (0-5; 0 for farm size, 1 for growing season precipitation, 2 for root zone soil water capacity, 3 for average slope, 4 for growing season mean temperature, and 5 for growing season diurnal range of temperature). To read the data, please use GIS software such as ArcGIS or QGIS.
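Once pixel values have been exported from a GIS tool, the reason codes of the second band can be decoded with a small lookup table. The sketch below assumes the code-to-reason mapping given in the description; the function names and the sample pixel values are hypothetical, and reading the raster itself is not shown.

```python
from collections import Counter

# Mapping taken from the band description above.
MARGINALITY_REASONS = {
    0: "farm size",
    1: "growing season precipitation",
    2: "root zone soil water capacity",
    3: "average slope",
    4: "growing season mean temperature",
    5: "growing season diurnal range of temperature",
}


def describe_pixel(likelihood, reason_code):
    """Pair a marginal-land likelihood (0-1) with its dominant reason label."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must lie between 0 and 1")
    return likelihood, MARGINALITY_REASONS[reason_code]


def tally_reasons(reason_codes):
    """Count how often each dominant reason occurs in a sequence of codes."""
    return Counter(MARGINALITY_REASONS[code] for code in reason_codes)


# Made-up pixel values, for illustration only.
print(describe_pixel(0.82, 3))
counts = tally_reasons([3, 3, 1, 0, 5])
print(counts.most_common(1))
```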
keywords:
marginal land; survey
published:
2022-10-14
Zhou, Shan; Li, Jiahui; Lu, Jun; Liu, Haihua; Kim, Ji-Young; Kim, Ahyoung; Yao, Lehan; Liu, Chang; Qian, Chang; Hood, Zachary D.; Lin, Xiaoying; Chen, Wenxiang; Gage, Thomas E.; Arslan, Ilke; Travesset, Alex; Sun, Kai; Kotov, Nicholas A.; Chen, Qian
(2022)
This dataset is the raw data including SEM, TEM, PINEM images and FDTD simulation as well as pairwise interaction calculation results.
published:
2022-08-23
Seyfried, Georgia; Corrales, Adriana; Kent, Angela; Dalling, James; Yang, Wendy
(2022)
This dataset contains soil chemical properties used to explain variation in soil fungal communities beneath Oreomunnea mexicana trees in the manuscript "Watershed-scale variation in potential fungal community contributions to ectomycorrhizal biogeochemical syndromes".
keywords:
Acid-base chemistry; Ectomycorrhizal fungi; Exploration type; Nitrogen cycling; Nitrogen isotopes; Plant-soil (below-ground) interactions; Saprotrophic fungi; Tropical forest
published:
2022-08-25
Souza-Cole, Ian; Ward, Michael; Mau, Rebecca; Foster, Jeffrey; Benson, Thomas
(2022)
Data in this publication were used to analyze the factors that influence the abundance of eastern whip-poor-wills in the Midwest and to describe the diet of this species. These data were collected in Illinois in 2019 and 2020. Procedures were approved by the Illinois Institutional Animal Care and Use Committee (IACUC), protocol no. 19006.
keywords:
eastern whip-poor-will; Antrostomus vociferus; abundance; moths; nightjars; Lepidoptera; metabarcoding