Illinois Data Bank Dataset Search Results
Results
published:
2024-02-15
Hoggatt, Meredith; Starbuck, Clarissa; O'Keefe, Joy
(2024)
This dataset includes the data for estimating bat density from acoustic data, along with the R code. The data support a publication by Meredith L. Hoggatt, Clarissa A. Starbuck, and Joy M. O'Keefe entitled "Acoustic monitoring yields informative bat population density estimates."
keywords:
acoustics; bats; monitoring; population density; random encounter model
published:
2024-08-11
Curtis, Jeffrey H.; Riemer, Nicole; West, Matthew
(2024)
This dataset contains all material required to produce the figures found within the manuscript submitted to Geoscientific Model Development entitled “Explicit stochastic advection algorithms for the regional scale particle-resolved atmospheric aerosol model WRF-PartMC (v1.0)”. The dataset consists of Python Jupyter notebooks and any applicable WRF-PartMC output. This dataset covers the three numerical examples of the manuscript, 1D advection by a uniform constant wind, a 2D rotational flow and a 3D time-evolving WRF simulated flow.
keywords:
Atmospheric chemistry; Atmospheric Science; Particle-resolved modeling; Numerical modeling; Advection
published:
2018-06-05
Soliman, Aiman; Mackay, Andrew; Schmidt, Arthur; Allan, Brian; Wang, Shaowen
(2018)
A complete building coverage area dataset (i.e., the area occupied by building structures, excluding other built surfaces such as roads, parking lots, and public parks) at the level of census block groups for the contiguous United States (CONUS). The dataset was assembled based on an ensemble prediction of nonlinear hierarchical models to account for spatial heterogeneities in the distribution of built surfaces across different urban communities. Percentage of impervious land and housing density were used as predictors of the estimated area of buildings, and cross-validation results showed that the product estimated the area represented by buildings with a mean error of 0.049%.
keywords:
Building Coverage Area; Urban Geography; Regional; Sustainability; US Census Block Groups; CONUS Data
published:
2019-02-22
Fernández, Roberto; Parker, Gary; Stark, Colin
(2019)
This dataset includes measurements taken during the experiments on patterns of alluvial cover over bedrock. The dataset includes an hour's worth of timelapse images taken every 10 s for eight different experimental conditions. It also includes the instantaneous water surface elevations measured with eTapes at a frequency of 10 Hz for each experiment. The 'Read me Data.txt' file explains the contents of the dataset in more detail.
keywords:
bedrock; erosion; alluvial; meandering; alluvial cover; sinuosity; flume; experiments; abrasion
published:
2020-11-18
Gardner, Allison; Allan, Brian
(2020)
These data obtained from the peer-reviewed literature and a public database depict the geographic expansion of the black-legged tick (Ixodes scapularis) and human cases of Lyme disease in the midwestern U.S.
<b><i>Note</i></b>: There was an omission from the first version (V1) of the data set that required us to update the data. Specifically, we failed to include the data from the article "Caporale DA, Johnson CM, Millard BJ. 2005 Presence of Borrelia burgdorferi (Spirochaetales: Spirochaetaceae) in Southern Kettle Moraine State Forest, Wisconsin, and characterization of strain W97F51. J. Med. Entomol. 42, 457–472". In the second version (V2) of the data, this omission is corrected.
keywords:
Lyme disease; Borrelia burgdorferi; Ixodes scapularis; black-legged tick
published:
2022-11-28
Avrin, Alexandra; Pekins, Charles; Wilmers, Christopher; Sperry, Jinelle; Allen, Maximilian
(2022)
Detection data of carnivores and their prey species from camera traps in Fort Hood, Texas and Santa Cruz, California, USA. Non-carnivore and non-prey species (humans, domestic species, avian species, etc.) were excluded from this dataset. All detections of each species at a camera within 30 minutes have been combined to 1 detection (only first detection within that 30 minutes kept) to avoid pseudoreplication.
Variable Description:
Site= Study area where data were collected
MonitoringPeriod= year in which data was collected (data were collected at each location over multiple monitoring periods)
CameraName= Unique name for each camera location
Date= calendar date of detection
Time= time of detection
-Fort Hood= Central Time USA
-Santa Cruz= Pacific Time USA
Species= Common name of species detected
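The 30-minute combination rule described above can be sketched in Python as follows. This is only an illustration, not the authors' code: the function name `thin_detections` and the tuple layout are hypothetical, and it assumes that each 30-minute window starts at the most recently kept detection and that a detection exactly 30 minutes later counts as new.

```python
from datetime import datetime, timedelta

def thin_detections(records, window_minutes=30):
    """Keep only the first detection of each species at each camera per window.

    records: iterable of (camera_name, species, timestamp) tuples
             (hypothetical layout, not the dataset's actual file format).
    Assumption: a new window starts at each kept detection.
    """
    window = timedelta(minutes=window_minutes)
    last_kept = {}  # (camera, species) -> timestamp of the last kept detection
    kept = []
    # Process detections in chronological order.
    for camera, species, ts in sorted(records, key=lambda r: r[2]):
        key = (camera, species)
        if key not in last_kept or ts - last_kept[key] >= window:
            kept.append((camera, species, ts))
            last_kept[key] = ts
    return kept

# Made-up example records for illustration only.
example = [
    ("Cam01", "Coyote", datetime(2020, 1, 1, 10, 0)),
    ("Cam01", "Coyote", datetime(2020, 1, 1, 10, 10)),  # within 30 min: dropped
    ("Cam01", "Coyote", datetime(2020, 1, 1, 10, 45)),  # new window: kept
    ("Cam01", "Bobcat", datetime(2020, 1, 1, 10, 5)),   # different species: kept
    ("Cam02", "Coyote", datetime(2020, 1, 1, 10, 5)),   # different camera: kept
]
thinned = thin_detections(example)
```

Keying the window on the (camera, species) pair means a detection of one species never suppresses a detection of another species at the same camera.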
keywords:
carnivore; community ecology; competition; interspecific interactions; keystone species; mesopredator; predation; trophic cascade
published:
2023-04-02
Lee, Yuanyao; Khanna, Madhu; Chen, Luoye
(2023)
The use of cellulosic biofuels from non-feedstocks is modeled using BEPAM (the Biofuel and Environmental Policy Analysis Model) to quantify the uncertainties about induced land use change effects, net greenhouse gas saving potential, and economic costs. The code is written in GAMS, a general algebraic modeling language.
NOTE: Column 3 is titled "BAU" in "merged_BAU.gdx", "merged_RFS.gdx", and "merged_CEM.gdx", but contains "RFS" data in "merged_RFS.gdx" and "CEM" data in "merged_CEM.gdx".
keywords:
cellulosic biomass; BEPAM; economic modeling
published:
2024-08-02
Morrow Plots Data Curation Working Group
(2024)
The Morrow Plots at the University of Illinois at Urbana-Champaign are the longest-running continuous experimental plots in the Americas. In continuous operation since 1876, the plots were established to explore the impact of crop rotation and soil treatment on corn crop yields. In 2018, The Morrow Plots Data Curation Working Group began to identify, collect and curate the various data records created over the history of the experiment. The resulting data table published here includes planting, treatment and yield data for the Morrow Plots since 1888. Please see the included codebook for a detailed explanation of the data sources and their content. This dataset will be updated as new yield data becomes available.
*NOTE: While digitized and accessed through IDEALS, the physical copy of the field notebook: <a href="https://archon.library.illinois.edu/archives/index.php?p=collections/controlcard&id=11846">Morrow Plots Notebook, 1876-1913, 1967</a> is also held at the University of Illinois Archives.
keywords:
Corn; Crop Science; Experimental Fields; Crop Yields; Agriculture; Illinois; Morrow Plots
published:
2025-11-20
Ahmed, Md Wadud; Esquerre, Carlos A.; Eilts, Kristen; Allen, Dylan P.; McCoy, Scott M.; Varela, Sebastian; Singh, Vijay; Leakey, Andrew; Kamruzzaman, Mohammad
(2025)
NIR spectroscopy is a rapid and accurate green technology for high-throughput biomass characterization, including sorghum (Sorghum bicolor), a promising energy crop for the biofuel industry. This study assessed the influence of particle size on NIR spectroscopic analysis (wavelength range: 867–2535 nm) of sorghum biomass composition. Grown under field conditions, a total of 113 types of genetically diverse sorghum accessions were dried, ground, and sieved (<250, 250–600, 600–850, and > 850 µm particle size) for developing partial least square regression (PLSR) prediction models for moisture, ash, extractive, glucan, xylan, acid-soluble lignin (ASL), acid-insoluble lignin (AIL), and total lignin (ASL + AIL). Overall, smaller particle sizes provided better model performance, while no single particle size provided the best performance for all the selected components. With only 9 selected bands and 4 latent variables (LVs), the best PLSR model was obtained for moisture with particle size of 600–850 µm with the square root of the coefficient of determination (R) of 0.85, the ratio of prediction to deviation (RPD) of 2.2, and the root mean square error (RMSE) of 0.46 % in external validation. Similar model performances were also obtained for ash, extractive, glucan, and xylan. This study showed that size reduction could effectively improve NIR spectroscopic analysis for lipid-producing sorghum biomass for the biofuel industry.
keywords:
Conversion; Feedstock Production; Biomass Analytics; Modeling; Sorghum
published:
2016-12-20
Wickes, Elizabeth; Nakamura, Katia
(2016)
Scripts and example data for AIDData (aiddata.org) processing in support of forthcoming Nakamura dissertation.
This dataset includes two sets of scripts and example data files from an aiddata.org data dump. Fuller documentation about the functionality for these scripts is within the readme file. Additional background information and description of usage will be in the forthcoming Nakamura dissertation (link will be added when available). Data originally supplied by Nakamura. Python code and this readme file created by Wickes. Data included within this deposit are examples to demonstrate execution.
Roughly, there are two python scripts in here: keyword_search.py, designed to assist in finding records matching specific keywords, and matching_tool.ipynb, designed to assist in detection of which records are and are not contained within a keyword results file and an aiddata project data file.
keywords:
aiddata; natural resources
published:
2020-07-16
Mishra, Shubhanshu
(2020)
Dataset to be used for the SocialMediaIE tutorial
keywords:
social media; deep learning; natural language processing
published:
2020-08-22
Qiu, Haoran; Banerjee, Subho S.; Jha, Saurabh; Kalbarczyk, Zbigniew T.; Iyer, Ravishankar K.
(2020)
We are releasing the tracing dataset of four microservice benchmarks deployed on our dedicated Kubernetes cluster consisting of 15 heterogeneous nodes. The dataset is not sampled and is from selected types of requests in each benchmark, i.e., compose-posts in the social network application, compose-reviews in the media service application, book-rooms in the hotel reservation application, and reserve-tickets in the train ticket booking application.
The four microservice applications come from [DeathStarBench](https://github.com/delimitrou/DeathStarBench) and [Train-Ticket](https://github.com/FudanSELab/train-ticket). The performance anomaly injector is from [FIRM](https://gitlab.engr.illinois.edu/DEPEND/firm.git).
The dataset was preprocessed from the raw data generated in FIRM's tracing system. The dataset is separated according to the microservice component in which the performance anomaly is located (as the file names suggest). Each dataset is in CSV format and fields are separated by commas. Each line consists of the tracing ID and the duration (in 10^(-3) ms) of each component. Execution paths are specified in `execution_paths.txt` in each directory.
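As a rough illustration of the line format described above, the following Python sketch reads comma-separated rows and converts the durations from units of 10^(-3) ms to milliseconds. The function name `read_trace_durations` and the sample values are hypothetical. The sketch also assumes there is no header row and that every field after the tracing ID is numeric, which the description does not confirm.

```python
import csv
import io

def read_trace_durations(csv_text):
    """Parse rows of 'trace_id,dur_1,dur_2,...' and convert durations to ms.

    Assumptions (not confirmed by the dataset description): no header row,
    the first field is the tracing ID, and all remaining fields are numeric.
    """
    traces = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        trace_id, raw = row[0], row[1:]
        # Durations are stored in units of 10^-3 ms, so divide by 1000.
        traces[trace_id] = [float(v) / 1000.0 for v in raw]
    return traces

# Made-up sample rows for illustration only.
sample = "abc123,1500,250\ndef456,3000,125\n"
durations_ms = read_trace_durations(sample)
```

Mapping each duration column to a component name would additionally require the ordering given in `execution_paths.txt`, which this sketch does not attempt to interpret.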
keywords:
Microservices; Tracing; Performance
published:
2021-07-22
Hsiao, Tzu-Kun; Schneider, Jodi
(2021)
This dataset includes five files. Descriptions of the files are given as follows:
<b>FILENAME: PubMed_retracted_publication_full_v3.tsv</b>
- Bibliographic data of retracted papers indexed in PubMed (retrieved on August 20, 2020, searched with the query "retracted publication" [PT] ).
- Except for the information in the "cited_by" column, all the data is from PubMed.
- PMIDs in the "cited_by" column that meet either of the two conditions below have been excluded from analyses:
[1] PMIDs of the citing papers are from retraction notices (i.e., those in the “retraction_notice_PMID.csv” file).
[2] Citing paper and the cited retracted paper have the same PMID.
ROW EXPLANATIONS
- Each row is a retracted paper. There are 7,813 retracted papers.
COLUMN HEADER EXPLANATIONS
1) PMID - PubMed ID
2) Title - Paper title
3) Authors - Author names
4) Citation - Bibliographic information of the paper
5) First Author - First author's name
6) Journal/Book - Publication name
7) Publication Year
8) Create Date - The date the record was added to the PubMed database
9) PMCID - PubMed Central ID (if applicable, otherwise blank)
10) NIHMS ID - NIH Manuscript Submission ID (if applicable, otherwise blank)
11) DOI - Digital object identifier (if applicable, otherwise blank)
12) retracted_in - Information of retraction notice (given by PubMed)
13) retracted_yr - Retraction year identified from "retracted_in" (if applicable, otherwise blank)
14) cited_by - PMIDs of the citing papers. (if applicable, otherwise blank) Data collected from iCite.
15) retraction_notice_pmid - PMID of the retraction notice (if applicable, otherwise blank)
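The two exclusion conditions listed for the "cited_by" column can be sketched in Python as follows. This is an illustrative reading of the rule, not the authors' analysis code. The function name `filter_citing_pmids` is hypothetical, and the citing PMIDs are passed in as a list because the separator used inside the "cited_by" column is not stated here.

```python
def filter_citing_pmids(retracted_pmid, cited_by, retraction_notice_pmids):
    """Return citing PMIDs after applying the two exclusion conditions.

    [1] drop citing PMIDs that belong to retraction notices
    [2] drop citing PMIDs identical to the cited retracted paper's PMID
    """
    notices = set(retraction_notice_pmids)
    return [
        pmid for pmid in cited_by
        if pmid not in notices and pmid != retracted_pmid
    ]

# Made-up PMIDs for illustration only.
kept_citations = filter_citing_pmids(
    retracted_pmid="100",
    cited_by=["100", "201", "202", "900"],
    retraction_notice_pmids=["900", "901"],
)
```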
<b>FILENAME: PubMed_retracted_publication_CitCntxt_withYR_v3.tsv</b>
- This file contains citation contexts (i.e., citing sentences) where the retracted papers were cited. The citation contexts were identified from the XML version of PubMed Central open access (PMCOA) articles.
- This is part of the data from: Hsiao, T.-K., & Torvik, V. I. (manuscript in preparation). Citation contexts identified from PubMed Central open access articles: A resource for text mining and citation analysis.
- Citation contexts that meet either of the two conditions below have been excluded from analyses:
[1] PMIDs of the citing papers are from retraction notices (i.e., those in the “retraction_notice_PMID.csv” file).
[2] Citing paper and the cited retracted paper have the same PMID.
ROW EXPLANATIONS
- Each row is a citation context associated with one retracted paper that's cited.
- In the manuscript, we count each citation context once, even if it cites multiple retracted papers.
COLUMN HEADER EXPLANATIONS
1) pmcid - PubMed Central ID of the citing paper
2) pmid - PubMed ID of the citing paper
3) year - Publication year of the citing paper
4) location - Location of the citation context (abstract = abstract, body = main text, back = supporting material, tbl_fig_caption = tables and table/figure captions)
5) IMRaD - IMRaD section of the citation context (I = Introduction, M = Methods, R = Results, D = Discussions/Conclusion, NoIMRaD = not identified)
6) sentence_id - The ID of the citation context in a given location. For location information, please see column 4. The first sentence in the location gets the ID 1, and subsequent sentences are numbered consecutively.
7) total_sentences - Total number of sentences in a given location
8) intxt_id - Identifier of a cited paper. Here, a cited paper is the retracted paper.
9) intxt_pmid - PubMed ID of a cited paper. Here, a cited paper is the retracted paper.
10) citation - The citation context
11) progression - Position of a citation context by centile within the citing paper.
12) retracted_yr - Retraction year of the retracted paper
13) post_retraction - 0 = not post-retraction citation; 1 = post-retraction citation. A post-retraction citation is a citation made after the calendar year of retraction.
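The definition of "post_retraction" in column 13 can be expressed as a small Python function. This is a sketch of the stated rule only. The function name is hypothetical, and returning `None` when the retraction year is missing is an assumption rather than documented behavior.

```python
def post_retraction_flag(citing_year, retracted_yr):
    """Return 1 if the citation was made after the calendar year of retraction, else 0.

    Assumption: when the retraction year is unknown (None), no flag is assigned.
    """
    if retracted_yr is None:
        return None
    return 1 if citing_year > retracted_yr else 0

# A citation published in the same calendar year as the retraction is not
# counted as post-retraction under this definition.
same_year = post_retraction_flag(2015, 2015)
later_year = post_retraction_flag(2016, 2015)
```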
<b>FILENAME: 724_knowingly_post_retraction_cit.csv</b> (updated)
- The 724 post-retraction citation contexts that we determined knowingly cited the 7,813 retracted papers in "PubMed_retracted_publication_full_v3.tsv".
- Two citation contexts from retraction notices have been excluded from analyses.
ROW EXPLANATIONS
- Each row is a citation context.
COLUMN HEADER EXPLANATIONS
1) pmcid - PubMed Central ID of the citing paper
2) pmid - PubMed ID of the citing paper
3) pub_type - Publication type collected from the metadata in the PMCOA XML files.
4) pub_type2 - Specific article types. Please see the manuscript for explanations.
5) year - Publication year of the citing paper
6) location - Location of the citation context (abstract = abstract, body = main text, back = supporting material, table_or_figure_caption = tables and table/figure captions)
7) intxt_id - Identifier of a cited paper. Here, a cited paper is the retracted paper.
8) intxt_pmid - PubMed ID of a cited paper. Here, a cited paper is the retracted paper.
9) citation - The citation context
10) retracted_yr - Retraction year of the retracted paper
11) cit_purpose - Purpose of citing the retracted paper. This is from human annotations. Please see the manuscript for further information about annotation.
12) longer_context - An extended version of the citation context. (if applicable, otherwise blank) Manually pulled from the full-texts in the process of annotation.
<b>FILENAME: Annotation manual.pdf</b>
- The manual for annotating the citation purposes in column 11) of 724_knowingly_post_retraction_cit.csv.
<b>FILENAME: retraction_notice_PMID.csv</b> (new file added for this version)
- A list of 8,346 PMIDs of retraction notices indexed in PubMed (retrieved on August 20, 2020, searched with the query "retraction of publication" [PT] ).
keywords:
citation context; in-text citation; citation to retracted papers; retraction
published:
2021-07-15
Castro, Daniel; Sweedler, Jonathan
(2021)
The dataset contains the high-throughput matrix-assisted laser desorption/ionization mass spectrometry XML files for the atrial gland and red hemiduct of Aplysia californica.
keywords:
Dense-core vesicle; High-throughput; Mass Spectrometry; MALDI; Organelle; Image-Guided; Atrial gland; red hemiduct; Lucent Vesicle
published:
2024-02-16
Zhang, Mingxiao; Sutton, Bradley
(2024)
Sample data from one typical phantom test and one deidentified shunt patient test (shown in Fig. 8 of the MRM paper), with the corresponding analysis code for the Shunt-FENSI technique.
For the MRM paper “Measuring CSF Shunt Flow with MRI Using Flow Enhancement of Signal Intensity (FENSI)”
keywords:
Shunt-FENSI; MRM; Hydrocephalus; VP Shunt; Flow Quantification; Pediatric Neurosurgery; Pulse Sequence; Signal Simulation
published:
2022-04-20
This is the core data for Zinnen et al., "Functional traits and responses to nutrient and mycorrhizal addition are inconsistently related to wetland plant species’ coefficients of conservatism." This is submitted to Wetlands Ecology and Management.
Two datasets are submitted here. The first is greenhouse-collected data of 9 plant traits and concurrent treatment responses of Illinois wetland plant species. The second is field-collected leaf trait data of Illinois wetland plant species. These data are analyzed in the paper. Please refer to the main manuscript to see how these data were produced and for the specific analyses.
keywords:
ecological indicators; Floristic Quality Assessment; Floristic Quality Index; wetland degradation
published:
2022-09-08
Hartman, Jordan; Larson, Eric
(2022)
Data associated with the manuscript "Overlooked invaders? Ecological impacts of non-game, native transplant fishes in the United States" by Jordan H. Hartman and Eric R. Larson
keywords:
freshwater; non-game; native transplant; impacts; invasive species
published:
2022-10-27
Holiman, Haley; Kitaif, J. Carson; Fournier, Auriel M.V.; Iglay, Ray; Woodrey, Mark S.
(2022)
keywords:
marsh birds; automated recording units
published:
2023-10-22
Davidson, Ruth; Vachaspati, Pranjal; Mirarab, Siavash; Warnow, Tandy
(2023)
HGT+ILS datasets from Davidson, R., Vachaspati, P., Mirarab, S., & Warnow, T. (2015). Phylogenomic species tree estimation in the presence of incomplete lineage sorting and horizontal gene transfer. BMC genomics, 16(10), 1-12. Contains model species trees, true and estimated gene trees, and simulated alignments.
keywords:
evolution; computational biology; bioinformatics; phylogenetics
published:
2021-11-05
Keralis, Spencer D. C.; Yakin, Syamil
(2021)
This data set contains survey results from a 2021 survey of University of Illinois University Library employees conducted as part of the Becoming A Trans Inclusive Library Project to evaluate the awareness of University of Illinois faculty, staff, and student employees regarding transgender identities, and to assess the professional development needs of library employees to better serve trans and gender non-conforming patrons. The survey instrument is available in the IDEALS repository: http://hdl.handle.net/2142/110080.
keywords:
transgender awareness; academic library; gender identity awareness; professional development opportunities
published:
2021-09-17
Stern, Jessica; Herman, Brook D.; Matthews, Jeffrey
(2021)
We studied vegetation metric robustness to environmental (season, interannual, and regional) and methodological (observer) variables, as well as adequate sample size for vegetation metrics across four regions of the United States.
keywords:
coefficients of conservatism; floristic quality assessment; restoration; vegetation metric
published:
2022-03-31
Crawford, Reed D.; Dodd, Luke E.; Tillman, Frank E.; O'Keefe, Joy M.
(2022)
This dataset contains our bi-hourly temperature recordings from 40 rocket box style artificial roosts of 5 designs deployed in Indiana and Kentucky, USA from April through September 2019. This dataset also includes our endothermic and facultatively heterothermic daily energy expenditure datasets used in our bioenergetic analysis, which were calculated from the bi-hourly rocket box temperature data. Lastly, we include our overheating counts dataset, which summarizes daily overheating events (i.e., temperatures > 40 Celsius) in each rocket box style bat box over the course of the study period; these daily summaries were also calculated from the bi-hourly rocket box temperature recordings.
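One way daily overheating counts could be derived from temperature recordings is sketched below in Python. This is not the authors' procedure: the function name `daily_overheating_counts` and the record layout are hypothetical, and the sketch assumes that each individual reading above 40 Celsius counts as one overheating event.

```python
from collections import Counter
from datetime import datetime

def daily_overheating_counts(readings, threshold_c=40.0):
    """Count readings above the threshold per (box, calendar day).

    readings: iterable of (box_id, timestamp, temperature_celsius) tuples
              (hypothetical layout).
    Assumption: every single reading strictly above the threshold is one event.
    """
    counts = Counter()
    for box_id, ts, temp_c in readings:
        if temp_c > threshold_c:
            counts[(box_id, ts.date())] += 1
    return counts

# Made-up readings for illustration only.
example_readings = [
    ("box1", datetime(2019, 7, 1, 12), 41.2),
    ("box1", datetime(2019, 7, 1, 14), 42.0),
    ("box1", datetime(2019, 7, 2, 12), 39.9),
    ("box2", datetime(2019, 7, 1, 12), 40.0),  # exactly 40 is not above 40
]
overheating = daily_overheating_counts(example_readings)
```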
keywords:
artificial roost; bat box; microclimate; temperature
published:
2024-01-01
Christensen, Jacob; Bettler, Simon; Qu, Kejian; Huang, Jeffrey; Kim, Soyeun; Lu, Yinchuan; Zhao, Chengxi; Chen, Jin; Krogstad, Matthew; Woods, Toby; Mahmood, Fahad; Huang, Pinshane; Abbamonte, Peter; Shoemaker, Daniel
(2024)
Contains scattering data obtained for (TaSe4)2I at the Advanced Photon Source at Argonne National Laboratory. Beamline 6ID-D was used with a beam energy of 64.8 keV in a transmission geometry. Data was obtained at temperatures between 28 and 300 K. See the readme.txt file for more information.
keywords:
X-ray diffraction
published:
2025-11-06
Salmonella HilD 3'UTR GRIL-seq sequencing data
keywords:
Salmonella; SPI1; hilD
published:
2023-04-12
Han, Edmund; Nahid, Shahriar Muhammad; Rakib, Tawfiqur; Nolan, Gillian; F. Ferrari, Paolo; Hossain, M. Abir; Schleife, André; Nam, SungWoo; Ertekin, Elif; van der Zande, Arend; Huang, Pinshane
(2023)
STEM images of kinks in α-In2Se3, DFT calculations of the bending of α-In2Se3, and PFM on as-exfoliated and controllably bent α-In2Se3