Illinois Data Bank
Displaying 326 - 350 of 739 in total
Illinois Data Bank Dataset Search Results
published: 2017-02-28
Freedman, Ryan (2017): Smartphone recorded driving sensor data: Leesburg, VA to Indianapolis, IN. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5975383_V1
Leesburg, VA to Indianapolis, Indiana:
Sampling Rate: 0.1 Hz
Total Travel Time: 31100007 ms or 518 minutes or 8.6 hours
Distance Traveled: 570 miles via I-70
Number of Data Points: 3112
Device used: Samsung Galaxy S4
Date Recorded: 2017-01-15
Parameters Recorded:
* ACCELEROMETER X (m/s²)
* ACCELEROMETER Y (m/s²)
* ACCELEROMETER Z (m/s²)
* GRAVITY X (m/s²)
* GRAVITY Y (m/s²)
* GRAVITY Z (m/s²)
* LINEAR ACCELERATION X (m/s²)
* LINEAR ACCELERATION Y (m/s²)
* LINEAR ACCELERATION Z (m/s²)
* GYROSCOPE X (rad/s)
* GYROSCOPE Y (rad/s)
* GYROSCOPE Z (rad/s)
* LIGHT (lux)
* MAGNETIC FIELD X (microT)
* MAGNETIC FIELD Y (microT)
* MAGNETIC FIELD Z (microT)
* ORIENTATION Z (azimuth °)
* ORIENTATION X (pitch °)
* ORIENTATION Y (roll °)
* PROXIMITY (i)
* ATMOSPHERIC PRESSURE (hPa)
* Relative Humidity (%)
* Temperature (F)
* SOUND LEVEL (dB)
* LOCATION Latitude
* LOCATION Longitude
* LOCATION Altitude (m)
* LOCATION Altitude-google (m)
* LOCATION Altitude-atmospheric pressure (m)
* LOCATION Speed (kph)
* LOCATION Accuracy (m)
* LOCATION ORIENTATION (°)
* Satellites in range
* GPS NMEA
* Time since start in ms
* Current time in YYYY-MO-DD HH-MI-SS_SSS format
Quality Notes: There are some things to note about the quality of this dataset that you may want to consider during preprocessing. The data was recorded continuously, including during multiple stops to refuel (the recording did not cease). These stops can be removed by filtering out all rows that have a speed of 0. The mount for this dataset was fairly stable (as can be seen from the consistent orientation angle throughout the dataset). The device was mounted tightly between two seats in the back of the vehicle. Unfortunately, the sampling frequency for this dataset was set fairly low, at one sample per ten seconds.
keywords:
smartphone; sensor; driving; accelerometer; gyroscope; magnetometer; gps; nmea; barometer; satellite; temperature; humidity
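The refueling stops mentioned in the quality notes above can be dropped with a short filter. The following is a minimal sketch, assuming the recording is available as a CSV file and that the speed column is headed `LOCATION Speed (kph)`; both the file layout and the exact header text are assumptions and may differ in the deposited files.

```python
import csv
import io

# Hypothetical column header; check the header row of the actual file.
SPEED_COLUMN = "LOCATION Speed (kph)"


def drop_stationary_rows(rows, speed_column=SPEED_COLUMN):
    """Keep only rows whose recorded speed is a number greater than zero."""
    moving = []
    for row in rows:
        try:
            speed = float(row[speed_column])
        except (KeyError, TypeError, ValueError):
            # Missing or non-numeric speed: skip the row rather than guess.
            continue
        if speed > 0:
            moving.append(row)
    return moving


# Tiny made-up sample standing in for the real recording.
sample = io.StringIO(
    "Time since start in ms,LOCATION Speed (kph)\n"
    "0,0\n"
    "10000,87.5\n"
    "20000,0.0\n"
    "30000,102.3\n"
)
moving_rows = drop_stationary_rows(csv.DictReader(sample))
print(len(moving_rows))
```

Rows with a missing speed are dropped here as well; whether that is appropriate depends on how the logging app records GPS dropouts.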
published: 2024-12-17
Nesbitt, Stephen; Niescier, Robert (2024): Parsivel Data from University of Illinois System for Characterizing and Measuring Precipitation for Hutson et al. (2025). University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-0704763_V1
This repository contains precipitation spectra from a Parsivel-2 disdrometer deployed at Lancaster High School, Lancaster, NY, as well as an MRR-2 radar deployed at the same site. The site was located at 42.9299° N, 78.6708° W. Parsivel data were converted to netCDF using the pyDSD Python package. MRR-2 spectra are raw from the manufacturer's software. The Parsivel and MRR-2 data include periods collected during November 2022 as described in the paper.
keywords:
snowfall; disdrometer; spectra; micro rain radar; Doppler
published: 2024-12-20
Stuchiner, Emily; Xu, Jiacheng; Eddy, William C.; DeLucia, Evan H.; Yang, Wendy (2024): Data for Hot or not? An evaluation of methods for identifying hot moments of nitrous oxide emissions from soils. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-8414089_V1
All data presented in the manuscript published in the Journal of Geophysical Research-Biogeosciences by Stuchiner et al. 2025, "Hot or not? An evaluation of methods for identifying hot moments of nitrous oxide emissions from soils." This includes hourly N2O flux measurements from 20 autochambers from May 2022 to April 2023 in a maize field in central Illinois, and various metrics used to assess hot moments that are evaluated in the manuscript. Note that chamber 5 for each sampling node is sampled from a deep soil collar (50 cm depth) that excludes roots for the purpose of measuring heterotrophic respiration rates.
keywords:
nitrous oxide; maize; hot moments; outlier detection; soil emissions
published: 2024-09-26
Kamara, Shasta; Hay, Allison; Oller, Reagan; Suski, Cory (2024): Examining the consequences of angling tournament culling practices on Largemouth Bass Micropterus nigricans. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-9108676_V1
This dataset is from a study of simulated angling tournament livewell holding, conducted in June of 2023 on Largemouth Bass (Micropterus nigricans) at Clinton Lake, Illinois. Fish were collected via electrofishing, weighed, measured, and assessed for physical injury prior to receiving a commercially available cull tag and being placed in a simulated livewell. After a six-hour livewell holding period, fish were removed from the livewell, assessed for physical injury, and then assessed for reflex action mortality predictors prior to being placed in a net pen for 3 days of observation. This dataset includes weights, total lengths, physical injury scores, and reflex action mortality predictor scores for Largemouth Bass, as well as water quality parameters of the livewells and of the lake in the net pens.
keywords:
sport fish conservation; fisheries management; high-grading; stringer
published: 2024-12-12
Varela, Sebastian; Leakey, Andrew (2024): Dataset: Breaking the barrier of human-annotated training data for machine-learning-aided plant research using aerial imagery. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-8462244_V2
This dataset supports the implementation described in the manuscript "Breaking the Barrier of Human-Annotated Training Data for Machine-Learning-Aided Biological Research Using Aerial Imagery." It comprises UAV aerial imagery used to execute the code available at https://github.com/pixelvar79/GAN-Flowering-Detection-paper. For detailed information on dataset usage and instructions for implementing the code to reproduce the study, please refer to the GitHub repository.
keywords:
Plant phenotyping; generative and adversarial learning; phenotyping; UAV; UAS; drone
published: 2024-12-11
Cheng, Ho Kei (2024): Pretrained models for MMAudio. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-1292191_V2
MMAudio pretrained models. These models can be used in the open-sourced codebase https://github.com/hkchengrex/MMAudio
Note: mmaudio_large_44k_v2.pth and Readme.txt are added in this V2. The other 4 files stay the same.
published: 2024-12-05
Salami, Malik Oyewale; McCumber, Corinne (2024): Redacted Dataset for Analyzing the Consistency of Retraction Indexing. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-8114408_V1
This project investigates retraction indexing agreement among data sources: BCI, BIOABS, CCC, Compendex, Crossref, GEOBASE, MEDLINE, PubMed, Retraction Watch, Scopus, and Web of Science Core. Post-retraction citation may be partly due to authors’ and publishers' challenges in systematically identifying retracted publications. To investigate retraction indexing quality, we investigate the agreement in indexing retracted publications between 11 database sources, restricting to their coverage, resulting in a union list of 85,392 unique items. We also discuss common errors in indexing retracted publications. Our results reveal low retraction indexing agreement scores, indicating that databases widely disagree on indexing retracted publications they cover, leading to a lack of consistency in what publications are identified as retracted. Our findings highlight the need for clear and standard practices in the curation and management of retracted publications.
Pipeline code to get the result files can be found in the GitHub repository https://github.com/infoqualitylab/retraction-indexing-agreement in the ‘src’ file containing iPython notebooks. The ‘unionlist_completed-ria_2024-07-09.csv’ file has been redacted to remove proprietary data, as noted below in README.txt. Among our sources, data is openly available only for Crossref, PubMed, and Retraction Watch.
FILE FORMATS:
1) unionlist_completed-ria_2024-07-09.csv - UTF-8 CSV file
2) README.txt - text file
keywords:
retraction status; data quality; indexing; retraction indexing; metadata; meta-science; RISRS
published: 2024-12-01
Bishop, Rebecca C.; Kemper, Ann M.; Clark, Lindsay V.; Wilkins, Pamela A.; McCoy, Annette M. (2024): Stability of gastric fluid and fecal microbial populations in healthy horses under pasture and stable conditions. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7053728_V1
Healthy mares were kept at pasture for 3 weeks, stabled for 5 weeks, and then returned to pasture, with a final sample collected 6 weeks later. Samples were collected weekly: gastric fluid by double-tube nasogastric intubation and aspiration, feces by rectal palpation. Microbial DNA was isolated using the QIAamp PowerFecal Pro DNA kit. Full length 16S, ITS and partial 23S rRNA gene libraries were created using the Shoreline Complete ID kit.
published: 2024-11-27
Han, Hee-Sun; Schrader, Alex; Lee, JuYeon; Yeo, Seokjin; Traniello, Ian (2024): Honey bee MERFISH data for SpaceExpress paper. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-5536668_V1
Honey bee (Apis mellifera) MERFISH dataset prepared by the Han lab, from brains collected by the Robinson lab at UIUC. The dataset comprises ~22 thousand cells and 130 genes, with x,y locations for each cell. A Jupyter notebook file is included as an example of how to load the data using Scanpy.
keywords:
smFISH; single transcript spatial transcriptomics; Honey bee brain; Apis mellifera; MERFISH
published: 2020-09-02
Schneider, Jodi; Ye, Di; Hill, Alison (2020): Second-generation citation context analysis (2010-2019) to retracted paper Matsuyama 2005. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-3331845_V2
Citation context annotation. This dataset is a second version (V2) and part of the supplemental data for Jodi Schneider, Di Ye, Alison Hill, and Ashley Whitehorn. (2020) "Continued post-retraction citation of a fraudulent clinical trial report, eleven years after it was retracted for falsifying data". Scientometrics. In press, DOI: 10.1007/s11192-020-03631-1
Publications were selected by examining all citations to the retracted paper Matsuyama 2005, and selecting the 35 citing papers, published 2010 to 2019, which do not mention the retraction, but which mention the methods or results of the retracted paper (called "specific" in Ye, Di; Hill, Alison; Whitehorn (Fulton), Ashley; Schneider, Jodi (2020): Citation context annotation for new and newly found citations (2006-2019) to retracted paper Matsuyama 2005. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-8150563_V1 ). The annotated citations are second-generation citations to the retracted paper Matsuyama 2005 (RETRACTED: Matsuyama W, Mitsuyama H, Watanabe M, Oonakahara KI, Higashimoto I, Osame M, Arimura K. Effects of omega-3 polyunsaturated fatty acids on inflammatory markers in COPD. Chest. 2005 Dec 1;128(6):3817-27.), retracted in 2008 (Retraction in: Chest (2008) 134:4 (893) https://doi.org/10.1016/S0012-3692(08)60339-6).
OVERALL DATA for VERSION 2 (V2)
FILES/FILE FORMATS
Same data in two formats:
2010-2019 SG to specific not mentioned FG.csv - Unicode CSV (preservation format only) - same as in V1
2010-2019 SG to specific not mentioned FG.xlsx - Excel workbook (preferred format) - same as in V1
Additional files in V2:
2G-possible-misinformation-analyzed.csv - Unicode CSV (preservation format only)
2G-possible-misinformation-analyzed.xlsx - Excel workbook (preferred format)
ABBREVIATIONS:
2G - Refers to the second-generation of Matsuyama
FG - Refers to the direct citation of Matsuyama (the one the second-generation item cites)
COLUMN HEADER EXPLANATIONS
File name: 2G-possible-misinformation-analyzed. Other column headers in this file have the same meaning as explained in V1. The following are additional header explanations:
Quote Number - The order of the quote (citation context citing the first generation article given in "FG in bibliography") in the second generation article (given in "2G article")
Quote - The text of the quote (citation context citing the first generation article given in "FG in bibliography") in the second generation article (given in "2G article")
Translated Quote - English translation of "Quote", automatic translation from Google Scholar
Seriousness/Risk - Our assessment of the risk of misinformation and its seriousness
2G topic - Our assessment of the topic of the cited article (the second generation article given in "2G article")
2G section - The section of the citing article (the second generation article given in "2G article") in which the cited article (the first generation article given in "FG in bibliography") was found
FG in bib type - The type of article (e.g., review article), referring to the cited article (the first generation article given in "FG in bibliography")
FG in bib topic - Our assessment of the topic of the cited article (the first generation article given in "FG in bibliography")
FG in bib section - The section of the cited article (the first generation article given in "FG in bibliography") in which the Matsuyama retracted paper was cited
keywords:
citation context annotation; retraction; diffusion of retraction; second-generation citation context analysis
published: 2024-10-18
Jog, Suneeti (2024): Data for The "full species list" fallacy in Floristic Quality Assessment. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-2781793_V1
Exhaustive species inventory of suburban wetland complex in northeast Ohio (Cuyahoga County).
keywords:
floristic survey; wetland complex; comprehensive species list
published: 2024-10-16
Smith, Rebecca; Huang, Conghui (2024): Data for A modeling study on SARS-CoV-2 transmission in primary and middle schools in Illinois. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-3705306_V1
School testing data were provided by Shield Illinois (ShieldIL), which conducted weekly in-school testing on behalf of the Illinois Department of Public Health (IDPH) for all participating schools in the state excluding Chicago Public Schools. The populations and proportions of students and employees in the studied school districts are reported by Elementary/Secondary Information System (ElSi) database.
keywords:
COVID-19; school testing
published: 2023-07-05
Fu, Yuanxi; Hsiao, Tzu-Kun; Joshi, Manasi Ballal; Lischwe Mueller, Natalie (2023): The Salt Controversy Systematic Review Reports and Primary Study Reports Network Dataset. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-6128763_V3
The salt controversy is the public health debate about whether a population-level salt reduction is beneficial. This dataset covers 82 publications--14 systematic review reports (SRRs) and 68 primary study reports (PSRs)--addressing the effect of sodium intake on cerebrocardiovascular disease or mortality. These present a snapshot of the status of the salt controversy as of September 2014 according to previous work by epidemiologists: The reports and their opinion classification (for, against, and inconclusive) were from Trinquart et al. (2016) (Trinquart, L., Johns, D. M., & Galea, S. (2016). Why do we think we know what we know? A metaknowledge analysis of the salt controversy. International Journal of Epidemiology, 45(1), 251–260. https://doi.org/10.1093/ije/dyv184 ), which collected 68 PSRs, 14 SRRs, 11 clinical guideline reports, and 176 comments, letters, or narrative reviews. Note that our dataset covers only the 68 PSRs and 14 SRRs from Trinquart et al. 2016, not the other types of publications, and it adds additional information noted below. This dataset can be used to construct the inclusion network and the co-author network of the 14 SRRs and 68 PSRs. A PSR is "included" in an SRR if it is considered in the SRR's evidence synthesis. Each included PSR is cited in the SRR, but not all references cited in an SRR are included in the evidence synthesis or PSRs. Based on which PSRs are included in which SRRs, we can construct the inclusion network. The inclusion network is a bipartite network with two types of nodes: one type represents SRRs, and the other represents PSRs. In an inclusion network, if an SRR includes a PSR, there is a directed edge from the SRR to the PSR. The attribute file (report_list.csv) includes attributes of the 82 reports, and the edge list file (inclusion_net_edges.csv) contains the edge list of the inclusion network. Notably, 11 PSRs have never been included in any SRR in the dataset. They are unused PSRs. 
If visualized with the inclusion network, they will appear as isolated nodes. We used a custom-made workflow (Fu, Y. (2022). Scopus author info tool (1.0.1) [Python]. https://github.com/infoqualitylab/Scopus_author_info_collection ) that uses the Scopus API and manual work to extract and disambiguate authorship information for the 82 reports. The author information file (salt_cont_author.csv) is the product of this workflow and can be used to compute the co-author network of the 82 reports. We also provide several other files in this dataset. We collected inclusion criteria (the criteria that make a PSR eligible to be included in an SRR) and recorded them in the file systematic_review_inclusion_criteria.csv. We provide a file (potential_inclusion_link.csv) recording whether a given PSR had been published as of the search date of a given SRR, which makes the PSR potentially eligible for inclusion in the SRR. We also provide a bibliography of the 82 publications (supplementary_reference_list.pdf). Lastly, we discovered minor discrepancies between the inclusion relationships identified by Trinquart et al. (2016) and by us. Therefore, we prepared an additional edge list (inclusion_net_edges_trinquart.csv) to preserve the inclusion relationships identified by Trinquart et al. (2016).
UPDATES IN THIS VERSION COMPARED TO V2 (Fu, Yuanxi; Hsiao, Tzu-Kun; Joshi, Manasi Ballal (2022): The Salt Controversy Systematic Review Reports and Primary Study Reports Network Dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6128763_V2)
- We added a new column "pub_date" to report_list.csv
- We corrected mistakes in supplementary_reference_list.pdf for report #28 and report #80. The author of report #28 is not Salisbury D but Khaw, K.-T., & Barrett-Connor, E. Report #80 was mistakenly mixed up with report #81.
keywords:
systematic reviews; evidence synthesis; network analysis; public health; salt controversy
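The inclusion network described above, including the unused PSRs that appear as isolated nodes, can be sketched in a few lines. This is a minimal illustration only: the column names `srr_id` and `psr_id` and the sample identifiers are hypothetical, and the real headers in inclusion_net_edges.csv and report_list.csv should be checked before adapting it.

```python
import csv
import io
from collections import defaultdict


def load_inclusion_edges(handle, srr_col="srr_id", psr_col="psr_id"):
    """Read directed SRR -> PSR edges into a dict mapping each SRR to a set of PSRs."""
    included = defaultdict(set)
    for row in csv.DictReader(handle):
        included[row[srr_col]].add(row[psr_col])
    return dict(included)


def unused_psrs(all_psrs, included):
    """Return PSRs that no SRR includes, i.e. isolated nodes in the network."""
    used = set().union(*included.values())
    return sorted(set(all_psrs) - used)


# Made-up edge list standing in for inclusion_net_edges.csv.
edges_csv = io.StringIO("srr_id,psr_id\nS1,P1\nS1,P2\nS2,P2\n")
network = load_inclusion_edges(edges_csv)

# Made-up PSR identifiers standing in for the PSR rows of report_list.csv.
isolated = unused_psrs(["P1", "P2", "P3"], network)
print(isolated)
```

A dict of sets keeps the bipartite structure explicit: keys are always SRRs and values are always PSRs, so an edge can never connect two reports of the same type.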
published: 2023-09-21
Clarke, Caitlin; Lischwe Mueller, Natalie; Joshi, Manasi Ballal; Fu, Yuanxi; Schneider, Jodi (2023): The Inclusion Network of 27 Review Articles Published between 2013-2018 Investigating the Relationship Between Physical Activity and Depressive Symptoms. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-4614455_V4
The relationship between physical activity and mental health, especially depression, is one of the most studied topics in the field of exercise science and kinesiology. Although there is strong consensus that regular physical activity improves mental health and reduces depressive symptoms, some debate the mechanisms involved in this relationship as well as the limitations and definitions used in such studies. Meta-analyses and systematic reviews continue to examine the strength of the association between physical activity and depressive symptoms for the purpose of improving exercise prescription as treatment or combined treatment for depression. This dataset covers 27 review articles (either systematic review, meta-analysis, or both) and 365 primary study articles addressing the relationship between physical activity and depressive symptoms. Primary study articles are manually extracted from the review articles. We used a custom-made workflow (Fu, Yuanxi. (2022). Scopus author info tool (1.0.1) [Python]. https://github.com/infoqualitylab/Scopus_author_info_collection ) that uses the Scopus API and manual work to extract and disambiguate authorship information for the 392 reports. The author information file (author_list.csv) is the product of this workflow and can be used to compute the co-author network of the 392 articles. This dataset can be used to construct the inclusion network and the co-author network of the 27 review articles and 365 primary study articles. A primary study article is "included" in a review article if it is considered in the review article's evidence synthesis. Each included primary study article is cited in the review article, but not all references cited in a review article are included in the evidence synthesis or primary study articles.
The inclusion network is a bipartite network with two types of nodes: one type represents review articles, and the other represents primary study articles. In an inclusion network, if a review article includes a primary study article, there is a directed edge from the review article node to the primary study article node. The attribute file (article_list.csv) includes attributes of the 392 articles, and the edge list file (inclusion_net_edges.csv) contains the edge list of the inclusion network. Collectively, this dataset reflects the evidence production and use patterns within the exercise science and kinesiology scientific community, investigating the relationship between physical activity and depressive symptoms.
FILE FORMATS
1. article_list.csv - Unicode CSV
2. author_list.csv - Unicode CSV
3. Chinese_author_name_reference.csv - Unicode CSV
4. inclusion_net_edges.csv - Unicode CSV
5. review_article_details.csv - Unicode CSV
6. supplementary_reference_list.pdf - PDF
7. README.txt - text file
8. systematic_review_inclusion_criteria.csv - Unicode CSV
UPDATES IN THIS VERSION COMPARED TO V3 (Clarke, Caitlin; Lischwe Mueller, Natalie; Joshi, Manasi Ballal; Fu, Yuanxi; Schneider, Jodi (2023): The Inclusion Network of 27 Review Articles Published between 2013-2018 Investigating the Relationship Between Physical Activity and Depressive Symptoms. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4614455_V3)
- We added a new file systematic_review_inclusion_criteria.csv.
keywords:
systematic reviews; meta-analyses; evidence synthesis; network visualization; tertiary studies; physical activity; depressive symptoms; exercise; review articles
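The description above says author_list.csv can be used to compute the co-author network of the articles, without spelling out the construction. The sketch below shows one possible reading, in which articles are nodes and two articles are linked when they share at least one author, weighted by the number of shared authors. The (article, author) pair layout and all identifiers are assumptions for illustration; the real file may be organized differently.

```python
from collections import Counter, defaultdict
from itertools import combinations


def shared_author_edges(authorships):
    """Count shared authors for every pair of articles.

    `authorships` is an iterable of (article_id, author_id) pairs.
    Returns a Counter mapping (article_a, article_b) to the number of shared authors.
    """
    articles_by_author = defaultdict(set)
    for article_id, author_id in authorships:
        articles_by_author[author_id].add(article_id)

    weights = Counter()
    for articles in articles_by_author.values():
        # Every pair of articles written by this author gains one shared author.
        for a, b in combinations(sorted(articles), 2):
            weights[(a, b)] += 1
    return weights


# Made-up authorship pairs; R4 shares no author with any other article.
pairs = [
    ("R1", "x"), ("R2", "x"),
    ("R1", "y"), ("R2", "y"), ("R3", "y"),
    ("R4", "z"),
]
edges = shared_author_edges(pairs)
print(dict(edges))
```

Sorting each author's articles before pairing gives every edge a single canonical key, so (R1, R2) and (R2, R1) are never counted separately. If the intended network has authors as nodes instead, the same function works with the two fields of each pair swapped.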
published: 2024-11-01
Zhang, Ziliang; Eddy, William C.; Stuchiner, Emily R.; DeLucia, Evan H.; Yang, Wendy (2024): Data for A conceptual model explaining spatial variation in soil nitrous oxide emissions in agricultural fields. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-3526278_V1
This dataset includes data on soil nitrous oxide fluxes, soil properties, and climate presented in the manuscript, "A conceptual model explaining spatial variation in soil nitrous oxide emissions in agricultural fields," published in Communications Earth & Environment. Please refer to that publication for details about methodologies used to generate these data and for the experimental design.
keywords:
soil nitrous oxide emissions; gross nitrous oxide production; gross nitrous oxide consumption; N2O; denitrification; maize; cannon model
published: 2024-11-07
Zheng, Heng; Fu, Yuanxi; Vandel, Ellie; Schneider, Jodi (2024): Dataset of 286 publications citing the 2014 Willoughby-Jansma-Hoye protocol. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-4610831_V3
This dataset consists of the 286 publications retrieved from Web of Science and Scopus on July 6, 2023 as citations for Willoughby et al., 2014: Patrick H. Willoughby, Matthew J. Jansma, and Thomas R. Hoye (2014). A guide to small-molecule structure assignment through computation of (¹H and ¹³C) NMR chemical shifts. Nature Protocols, 9(3), Article 3. https://doi.org/10.1038/nprot.2014.042
We added the DOIs of the citing publications into a Zotero collection. Then we exported all 286 DOIs in two formats: a .csv file (data export) and an .rtf file (bibliography). Willoughby2014_286citing_publications.csv is a Zotero data export of the citing publications. Willoughby2014_286citing_publications.rtf is a bibliography of the citing publications, using a variation of the American Psychological Association style (7th edition) with full names instead of initials.
To create Willoughby2014_citation_contexts.csv, HZ manually extracted the paragraphs that contain a citation marker of Willoughby et al., 2014. We refer to these paragraphs as the citation contexts of Willoughby et al., 2014. Manual extraction started with 286 citing publications but excluded 2 publications that are not in English, those with DOIs 10.13220/j.cnki.jipr.2015.06.004 and 10.19540/j.cnki.cjcmm.20200604.201
The silver standard aimed to triage the citing publications of Willoughby et al., 2014 that are at risk of propagating unreliability due to a code glitch in a computational chemistry protocol introduced in Willoughby et al., 2014. The silver standard was created stepwise: First one chemistry expert (YF) manually annotated the corpus of 284 citing publications in English, using their full text and citation contexts. She manually categorized publications as either at risk of propagating unreliability or not at risk of propagating unreliability, with a rationale justifying each category. Then we selected a representative sample of citation contexts to be double annotated. To do this, MJS turned the full dataset of citation contexts (Willoughby2014_citation_contexts.csv) into word embeddings, clustered them using similarity measures using BERTopic's HDBS, and selected representative citation contexts based on the centroids of the clusters. Next the second chemistry expert (EV) annotated the 77 publications associated with the citation contexts, considering the full text as well as the citation contexts. double_annotated_subset_77_before_reconciliation.csv provides EV and YF's annotation before reconciliation.
To create the silver standard YF, EV, and JS discussed differences and reconciled most differences. YF and EV had principled reasons for disagreeing on 9 publications; to handle these, YF updated the annotations, to create the silver standard we use for evaluation in the remainder of our JCDL 2024 paper (silver_standard.csv). Inter_Annotator_Agreement.xlsx indicates publications where the two annotators made opposite decisions and calculates the inter-annotator agreement before and after reconciliation together. double_annotated_subset_77_before_reconciliation.csv provides EV and YF's annotation after reconciliation, including applying the reconciliation policy.
keywords:
unreliable cited sources; knowledge maintenance; citations; scientific digital libraries; scholarly publications; reproducibility; unreliability propagation; citation contexts
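The description above mentions an inter-annotator agreement calculation but does not say which statistic Inter_Annotator_Agreement.xlsx uses. As an illustration only, the sketch below computes Cohen's kappa, one common choice for two annotators assigning categorical labels; the label strings are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four publications from two annotators.
annotator_1 = ["at_risk", "at_risk", "not_at_risk", "not_at_risk"]
annotator_2 = ["at_risk", "not_at_risk", "not_at_risk", "not_at_risk"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In this toy case the annotators agree on 3 of 4 items (0.75) while chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.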
published: 2024-07-08
Chong, Jer Pin; Minnaert-Grote, Jamie; Zaya, David N.; Ashley, Mary V.; Coons, Janice; Ramp Neal, Jennifer M.; Molano-Flores, Brenda (2024): Microsatellite genotypes and locations for three Physaria taxa on and near the Kaibab Plateau, Arizona, USA. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-2540221_V1
A population genetics study was conducted on three plant taxa in the genus Physaria that are found on the Kaibab Plateau (Arizona, USA). Physaria kingii subsp. kaibabensis is endemic to the Kaibab Plateau, and is of conservation concern because of its rarity, limited range, and potential threats to its long-term persistence. Additionally, the taxon is a candidate for federal protection under the Endangered Species Act. It was not clear how genetically isolated P. k. subsp. kaibabensis was from Physaria kingii subsp. latifolia, which is a widespread subspecies found throughout the southwestern USA, including on the Kaibab Plateau. Additionally, other authors have suggested that P. k. subsp. kaibabensis may hybridize with Physaria arizonica, a different species that is also widespread and found on and off the Kaibab Plateau. We conducted a population genetics study of all three groups to better determine the conservation status of P. k. subsp. kaibabensis. Genetic data are in the form of nuclear DNA microsatellites for 13 loci (all apparently diploid). Additionally, we have included location information for the collection sites. We collected tissue samples from on and off the Kaibab Plateau. The overall findings are shared in a manuscript being submitted for peer-review.
keywords:
Physaria kingii; Kaibab Plateau; endemism; conservation genetics; rare species biology
published: 2024-11-15
Cheng, Ho Kei (2024): BL30K. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-1702934_V1
BL30K is a synthetic dataset rendered using Blender with ShapeNet's data. We break the dataset into six segments, each with approximately 5K videos. The videos are organized in a similar format as DAVIS and YouTubeVOS, so dataloaders for those datasets can be used directly. Each video is 160 frames long, and each frame has a resolution of 768*512. There are 3-5 objects per video, and each object has a random smooth trajectory -- we tried to optimize the trajectories in a greedy fashion to minimize object intersection (not guaranteed), with occlusions still possible (happen a lot in reality). See [Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion (MiVOS), CVPR 2022] for details.
published: 2024-11-14
Matthews, Jeffrey W.; Huang, Annie H. (2024): Data for The invasion of Japanese hop (Humulus japonicus) in a restored floodplain forest. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-6760644_V1
These data represent the raw data from the paper “The invasion of Japanese hop (Humulus japonicus) in a restored floodplain forest” published in Invasive Plant Science and Management by Annie H. Huang and Jeffrey W. Matthews.
keywords:
invasive plants; restored wetlands
published: 2023-07-06
Schneider, Amy; Suski, Cory; Esbaugh, Andrew (2023): Dataset for Silver carp experience metabolic and behavioral changes when exposed to water from the Chicago Area Waterway; implications for upstream movement. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0037727_V1
published: 2021-08-28
Southey, Bruce; Rodriguez-Zas, Sandra (2021): Metabolics of weaning and maternal immune activation in 22 day old pigs. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9043394_V1
Metabolite identifications and profiles of liver samples from 22 day old male and female pigs from gilts that were exposed to porcine reproductive and respiratory syndrome virus (P) or not (C), and that were weaned at 21 days of age (W) or not (N). Profiles were obtained by the University of Illinois Carver Metabolomics Center. The spectrum for each sample was acquired using a gas chromatography mass spectrometry system consisting of an Agilent 7890 gas chromatograph, an Agilent 5975 MSD, and an HP 7683B auto sampler.
keywords:
gas chromatography; mass spectrometry; maternal immune activation; weaning; liver
published: 2024-08-02
Morrow Plots Data Curation Working Group (2024): Morrow Plots Treatment and Yield Data. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7865141_V2
The Morrow Plots at the University of Illinois at Urbana-Champaign are the longest-running continuous experimental plots in the Americas. In continuous operation since 1876, the plots were established to explore the impact of crop rotation and soil treatment on corn crop yields. In 2018, the Morrow Plots Data Curation Working Group began to identify, collect and curate the various data records created over the history of the experiment. The resulting data table published here includes planting, treatment and yield data for the Morrow Plots since 1888. Please see the included codebook for a detailed explanation of the data sources and their content. This dataset will be updated as new yield data become available. NOTE: Although it has been digitized and can be accessed through IDEALS, the physical copy of the field notebook, Morrow Plots Notebook, 1876-1913, 1967 (https://archon.library.illinois.edu/archives/index.php?p=collections/controlcard&id=11846), is also held at the University of Illinois Archives.
keywords:
Corn; Crop Science; Experimental Fields; Crop Yields; Agriculture; Illinois; Morrow Plots
published: 2024-10-11
Zinnen, Jack; Barak, Rebecca; Matthews, Jeffrey (2024): Data for Influence of ecological characteristics and phylogeny on native plant species’ commercial availability. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1143125_V1
This is the core data for Influence of ecological characteristics and phylogeny on native plant species’ commercial availability, a manuscript pending publication in Ecological Applications. The data regard ecological characteristics, phenology, and phylogeny of plant species native to the Midwestern United States and how those factors relate to commercial availability.
keywords:
biodiversity; native plant nursery; plant trade; plant vendors; restoration
published: 2024-10-08
Mersich, Ina; Bishop, Rebecca; Diaz Yucupicio, Sandra; Nobrega, Ana D.; Austin, Scott; Barger, Anne; Fick, Megan E.; Wilkins, Pamela (2024): Data for Decreased Circulating Red Cell Mass (Packed Cell Volume) Alters Viscoelastic and Traditional Plasma Coagulation Testing Results in Healthy Horses. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9153919_V1
Acepromazine was administered to healthy adult horses to induce transient anemia secondary to splenic sequestration. Data was collected at baseline (T0), 1 hour (T1) and 12 hours (T2) post acepromazine administration. Data collection included PCV, TP, CBC, fibrinogen, PT, PTT and viscoelastic coagulation profiles (VCM Vet) as well as ultrasonographic measurements of the spleen at all 3 time points.
keywords:
horse; coagulation; viscoelastic testing; anemia; acepromazine
published: 2024-10-07
Kole Aspray, Elise; Ainsworth, Elizabeth; McGrath, Jesse; McGrath, Justin; Montes, Christopher; Whetten, Andrew; Ort, Donald; Long, Stephen; Puthuval, Kannan; Mies, Timothy; Bernacchi, Carl; DeLucia, Evan; Dalsing, Bradley; Leakey, Andrew; Li, Shuai; Herriott, Jelena; Miglietta, Franco (2024): SoyFACE Fumigation Data Files. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3496460_V4
This data set is related to the SoyFACE experiments, which are open-air agricultural climate change experiments that have been conducted since 2001. The fumigation experiments take place at the SoyFACE farm and facility in Champaign County, Illinois during the growing season of each year, typically between June and October. This V4 contains new experimental data files, hourly fumigation files, and weather/ambient files for 2022 and 2023, since the original dataset only included files for 2001-2021. The MATLAB code has also been updated for efficiency, and explanatory files have been updated accordingly. Below are the new changes in V4:
- The "SoyFACE Plot Information 2001 to 2021" file is renamed to “SoyFACE ring information 2001 to 2023.xlsx”. Data for 2022 and 2023 were added. The file contains information about each year of the SoyFACE experiments, including the fumigation treatment type (CO2, O3, or a combination treatment), the crop species, the plots (also referred to as 'rings' and labeled with numbers between 2 and 31) used in each experiment, important experiment dates, and the target concentration levels or 'setpoints' for CO2 and O3 in each experiment.
- The "SoyFACE 1-Minute Fumigation Data Files" were updated to contain sub-folders for each year of the experiments (2001-2023), each of which contains sub-folders for each ring used in that year's experiments. This data set also includes hourly data files for the fumigation experiments ("SoyFACE Hourly Fumigation Data Files" folder) created from the 1-minute files, and hourly ambient/weather data files for each year of the experiments ("Hourly Weather and Ambient Data Files" folder, which has also been updated to include 2022 and 2023 data). The ambient CO2 and O3 data are collected at SoyFACE, and the weather data are collected from the SURFRAD and WARM weather stations located near the SoyFACE farm.
- “Rings.xlsx” is new in this version. This file lists the rings and treatments used in each year of the SoyFACE experiments between 2001 and 2023 and is used in several of the MATLAB codes.
- “CMI Weather Data Explanation.docx” is newly added. This file contains specific information about the processing of raw weather data, which is used in the hourly weather and ambient data files.
- Files that were in RAR format in V3 are now updated and saved in ZIP format, including: Hourly Weather and Ambient Data Files.zip, SoyFACE 1-Minute Fumigation Data Files.zip, SoyFACE Hourly Fumigation Data Files.zip, and Matlab Files.zip.
- The "Fumigation Target Percentages" file was updated to add data for 2022 and 2023. This file shows how much of the time the CO2 and O3 fumigation levels are within a 10 or 20 percent margin of the target levels when the fumigation system is turned on.
- The "Matlab Files" folder contains custom code (Aspray, E.K.) that was used to clean the "SoyFACE 1-Minute Fumigation Data" files and to generate the "SoyFACE Hourly Fumigation Data" and "Fumigation Target Percentages" files. Code information can be found in the various "Explanation" files. The MATLAB code changes are as follows:
   1. “Data_Issues_Finder.m” code was changed to use the “Rings.xlsx” file to gather ring and treatment information based on the contents of the file rather than being hardcoded in the MATLAB code itself.
   2. “Data_Issues_Finder_all.m” code is new. This code is the same as the “Data_Issues_Finder.m” code except that it identifies all CO2 and O3 repeats. In contrast, the “Data_Issues_Finder.m” code only identifies CO2 and O3 repeats that occur when the fumigation system is turned on.
   3. “Target_Yearly.m” code was changed to use the “Rings.xlsx” file to gather ring and treatment information based on the contents of the file rather than being hardcoded in the MATLAB code itself.
   4. “HourlyFumCode.m” code is new. This code uses the “Rings.xlsx” file to gather ring and treatment information based on the contents of the file instead of the user needing to define these values explicitly. This code also defines a list of all ring folders for the year selected and runs the hourly code for each ring, instead of the user having to run the hourly code for each ring individually. Finally, the code generates two dialog boxes for the user: one allows the user to specify whether the hourly code should be run for 1-minute fumigation files or 1-minute ambient files, and the other allows the user to specify whether the hourly fumigation averages should be replaced with hourly ambient averages when the fumigation system is turned off.
   5. “HourlyDataFun.m” code was changed to run either “HourlyData.m” code or “HourlyDataAmb.m” code, depending on user input in the first dialog box.
   6. “HourlyData.m” code was changed to replace hourly fumigation averages with hourly ambient averages when the fumigation system is turned off, depending on user input in the second dialog box.
   7. “HourlyDataAmb.m” code is new. This code is similar to “HourlyData.m” code but is used to calculate hourly averages for 1-minute ambient files instead of 1-minute fumigation files.
   8. “batch.m” code was changed to account for new function input variables in “HourlyDataFun.m” code, along with adding header columns for the “FumOutput.xlsx” and “AmbOutput.xlsx” output files generated by “HourlyData.m” and “HourlyDataAmb.m” code.
- Finally, the "* Explanation" files contain information about the column names, units of measurement, steps needed to use the MATLAB code, and other pertinent information for each data file. Some of them have been updated to reflect the current changes to the data.
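The two derived products described above, hourly averages built from 1-minute readings and the share of time the fumigation level stays within a 10 or 20 percent margin of the setpoint while the system is on, can be illustrated with a minimal Python sketch. This is not the dataset's MATLAB code: the function names, the tuple-based input format, and the interpretation of the margin as a fraction of the setpoint are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_means(records):
    """Average 1-minute (timestamp, value) readings into hourly means.

    Returns a dict mapping the start of each hour to the mean of its readings.
    """
    buckets = defaultdict(list)
    for ts, value in records:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: mean(vals) for hour, vals in sorted(buckets.items())}


def fraction_within_target(readings, setpoint, margin):
    """Fraction of readings within +/- margin (e.g. 0.10) of the setpoint.

    `readings` holds (value, system_on) pairs; only readings taken while
    the fumigation system is on are counted. Returns None if there are none.
    """
    on_values = [v for v, on in readings if on]
    if not on_values:
        return None
    hits = sum(1 for v in on_values if abs(v - setpoint) <= margin * setpoint)
    return hits / len(on_values)
```

For example, with readings of 600, 550, and 700 taken while the system is on and a setpoint of 600, two of the three readings fall within the 10 percent margin and all three fall within the 20 percent margin.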
keywords:
SoyFACE; agriculture; agricultural; climate; climate change; atmosphere; atmospheric change; CO2; carbon dioxide; O3; ozone; soybean; fumigation; treatment