published: 2021-08-15
This data set contains mass spectrometry data used for the publication "mspack: efficient lossless and lossy mass spectrometry data compression".
keywords: mass-spectrometry data; compression; proteomics
published: 2021-08-14
1. Rice H2 - Destructive Harvest - These data are for the destructive harvest (above-ground biomass) of 30 diverse indica rice genotypes that were grown to evaluate natural variation as well as the heritability of photosynthesis-related traits. Traits measured include: plant height, leaf area, plant fresh and dry weights, and tiller number. 2. Rice H2 - ACi Response Summary - These data characterize the response of CO2 uptake to change in intercellular CO2 concentration in 30 diverse indica rice genotypes. These measurements were taken to evaluate natural variation and the heritability of photosynthesis-related traits in rice. 3. Rice H2 - Survey Style Gas Exchange Measurements - These data document steady-state survey style gas exchange measurements in 30 diverse indica rice genotypes. These measurements were taken to evaluate natural variation and the heritability of photosynthesis-related traits in rice.
keywords: photosynthesis, photosynthetic capacity, natural variation, heritability, food security, rice
published: 2021-08-12
This dataset contains the images of a photoperiod sensitive sorghum accession population used for a GWAS/TWAS study of leaf traits related to water use efficiency in 2016 and 2017. <b>Note:</b> New in this second version: JPG images output from the nms files were added. <b>Accessions_2016.zip</b> and <b>Accessions_2017.zip</b>: contain raw images produced by Optical Topometer (nms files) for all sorghum accessions. Images can be opened with Nanofocus μsurf analysis extended software (Oberhausen, Germany). <b>Accessions_2016_jpg.zip</b> and <b>Accessions_2017_jpg.zip</b>: contain JPG images output from the nms files and used in the machine learning phenotyping.
keywords: stomata; segmentation; water use efficiency
published: 2021-08-05
This geodatabase serves two purposes: 1) to provide State of Illinois agencies with a fast resource for the preparation of maps and figures that require the use of shape or line files from federal agencies, the State of Illinois, or the City of Chicago, and 2) as a start for social scientists interested in exploring how geographic information systems (whether this is data visualization or geographically weighted regression) can bring new meaning to the interpretation of their data. All layer files included are relevant to the State of Illinois. Sources for this geodatabase include the U.S. Census Bureau, U.S. Geological Survey, City of Chicago, Chicago Public Schools, Chicago Transit Authority, Regional Transportation Authority, and Bureau of Transportation Statistics.
keywords: State of Illinois; City of Chicago; Chicago Public Schools; GIS; Statistical tabulation areas; hydrography
published: 2021-08-04
This dataset contains data derived from large-scale particle velocimetry measurements obtained at the confluence of the Saline Branch and an unnamed tributary in Illinois. The data were collected using two cameras positioned about the confluence, one mounted on a cable and the other mounted on a tripod. A description of the content of the files can be found in Description of Files.rtf.
keywords: confluence; hydrodynamics; LSPIV; flow structure; stagnation
published: 2021-07-30
This data comes from a scoping review associated with the project called Reducing the Inadvertent Spread of Retracted Science. The data summarizes the fields that have been explored by existing research on retraction, a list of studies comparing retraction in different fields, and a list of studies focused on retraction of COVID-19 articles.
keywords: retraction; fields; disciplines; research integrity
published: 2021-05-14
Please cite as: Jim Miller, Sergiusz Czesny, Qihong Dai, James Ellis, Louis Iverson, Jeff Matthews, Charles Roswell, Cory Suski, John Taft, and Mike Ward. 2021. “Climate Change Impacts on Ecosystems: Scientific and Common Species Names”.
keywords: Scientific names; Common names; Illinois species
published: 2021-05-09
Raw data and its analysis collected from a trial designed to test the impact of providing a Bacillus-based direct-fed microbial (DFM) on the syndrome resulting from orally infecting pigs with either Salmonella enterica serotype Choleraesuis (S. Choleraesuis) alone, or in combination with an intranasal challenge, three days later, with porcine reproductive and respiratory syndrome virus (PRRSV).
keywords: excel file
published: 2021-06-28
This dataset contains 1) the cleaned version of 11 CRW datasets, 2) RNASim10k dataset in high fragmentation and 3) three CRW datasets (16S.3, 16S.T, 16S.B.ALL) in high fragmentation.
keywords: MAGUS;UPP;Multiple Sequence Alignment;PASTA;eHMMs
published: 2021-05-14
This document contains the Supplemental Materials for Chapter 4: Climate Change Impacts on Agriculture from the report "An Assessment of the Impacts of Climate Change in Illinois" published in 2021.
keywords: Illinois; climate change; agriculture; impacts; adaptation; crop yield; ISAM; econometrics; days suitable for fieldwork
published: 2021-05-14
Supplemental Forest Data for Chapter 6: Climate Change Impacts on Ecosystems in "An Assessment of the Impacts of Climate Change in Illinois"
published: 2021-07-22
This dataset includes five files. Descriptions of the files are given as follows: <b>FILENAME: PubMed_retracted_publication_full_v3.tsv</b> - Bibliographic data of retracted papers indexed in PubMed (retrieved on August 20, 2020, searched with the query "retracted publication" [PT] ). - Except for the information in the "cited_by" column, all the data is from PubMed. - PMIDs in the "cited_by" column that meet either of the two conditions below have been excluded from analyses: [1] PMIDs of the citing papers are from retraction notices (i.e., those in the “retraction_notice_PMID.csv” file). [2] Citing paper and the cited retracted paper have the same PMID. ROW EXPLANATIONS - Each row is a retracted paper. There are 7,813 retracted papers. COLUMN HEADER EXPLANATIONS 1) PMID - PubMed ID 2) Title - Paper title 3) Authors - Author names 4) Citation - Bibliographic information of the paper 5) First Author - First author's name 6) Journal/Book - Publication name 7) Publication Year 8) Create Date - The date the record was added to the PubMed database 9) PMCID - PubMed Central ID (if applicable, otherwise blank) 10) NIHMS ID - NIH Manuscript Submission ID (if applicable, otherwise blank) 11) DOI - Digital object identifier (if applicable, otherwise blank) 12) retracted_in - Information of retraction notice (given by PubMed) 13) retracted_yr - Retraction year identified from "retracted_in" (if applicable, otherwise blank) 14) cited_by - PMIDs of the citing papers. (if applicable, otherwise blank) Data collected from iCite. 15) retraction_notice_pmid - PMID of the retraction notice (if applicable, otherwise blank) <b>FILENAME: PubMed_retracted_publication_CitCntxt_withYR_v3.tsv</b> - This file contains citation contexts (i.e., citing sentences) where the retracted papers were cited. The citation contexts were identified from the XML version of PubMed Central open access (PMCOA) articles. - This is part of the data from: Hsiao, T.-K., & Torvik, V. I. 
(manuscript in preparation). Citation contexts identified from PubMed Central open access articles: A resource for text mining and citation analysis. - Citation contexts that meet either of the two conditions below have been excluded from analyses: [1] PMIDs of the citing papers are from retraction notices (i.e., those in the “retraction_notice_PMID.csv” file). [2] Citing paper and the cited retracted paper have the same PMID. ROW EXPLANATIONS - Each row is a citation context associated with one retracted paper that's cited. - In the manuscript, we count each citation context once, even if it cites multiple retracted papers. COLUMN HEADER EXPLANATIONS 1) pmcid - PubMed Central ID of the citing paper 2) pmid - PubMed ID of the citing paper 3) year - Publication year of the citing paper 4) location - Location of the citation context (abstract = abstract, body = main text, back = supporting material, tbl_fig_caption = tables and table/figure captions) 5) IMRaD - IMRaD section of the citation context (I = Introduction, M = Methods, R = Results, D = Discussions/Conclusion, NoIMRaD = not identified) 6) sentence_id - The ID of the citation context in a given location. For location information, please see column 4. The first sentence in the location gets the ID 1, and subsequent sentences are numbered consecutively. 7) total_sentences - Total number of sentences in a given location 8) intxt_id - Identifier of a cited paper. Here, a cited paper is the retracted paper. 9) intxt_pmid - PubMed ID of a cited paper. Here, a cited paper is the retracted paper. 10) citation - The citation context 11) progression - Position of a citation context by centile within the citing paper. 12) retracted_yr - Retraction year of the retracted paper 13) post_retraction - 0 = not post-retraction citation; 1 = post-retraction citation. A post-retraction citation is a citation made after the calendar year of retraction. 
<b>FILENAME: 724_knowingly_post_retraction_cit.csv</b> (updated) - The 724 post-retraction citation contexts that we determined knowingly cited the 7,813 retracted papers in "PubMed_retracted_publication_full_v3.tsv". - Two citation contexts from retraction notices have been excluded from analyses. ROW EXPLANATIONS - Each row is a citation context. COLUMN HEADER EXPLANATIONS 1) pmcid - PubMed Central ID of the citing paper 2) pmid - PubMed ID of the citing paper 3) pub_type - Publication type collected from the metadata in the PMCOA XML files. 4) pub_type2 - Specific article types. Please see the manuscript for explanations. 5) year - Publication year of the citing paper 6) location - Location of the citation context (abstract = abstract, body = main text, back = supporting material, table_or_figure_caption = tables and table/figure captions) 7) intxt_id - Identifier of a cited paper. Here, a cited paper is the retracted paper. 8) intxt_pmid - PubMed ID of a cited paper. Here, a cited paper is the retracted paper. 9) citation - The citation context 10) retracted_yr - Retraction year of the retracted paper 11) cit_purpose - Purpose of citing the retracted paper. This is from human annotations. Please see the manuscript for further information about annotation. 12) longer_context - An extended version of the citation context (if applicable, otherwise blank). Manually pulled from the full texts in the process of annotation. <b>FILENAME: Annotation manual.pdf</b> - The manual for annotating the citation purposes in column 11) of the 724_knowingly_post_retraction_cit.csv. <b>FILENAME: retraction_notice_PMID.csv</b> (new file added for this version) - A list of 8,346 PMIDs of retraction notices indexed in PubMed (retrieved on August 20, 2020, searched with the query "retraction of publication" [PT] ).
keywords: citation context; in-text citation; citation to retracted papers; retraction
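The two exclusion rules and the post-retraction flag described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes the "cited_by" values have already been split into a list of PMID strings (the delimiter used in the file is not stated in the description).

```python
def filter_citing_pmids(paper_pmid, cited_by, notice_pmids):
    """Apply the two exclusion rules to the citing PMIDs of one retracted paper.

    paper_pmid   -- PMID of the retracted paper (string)
    cited_by     -- list of citing PMIDs (assumed already split into strings)
    notice_pmids -- set of PMIDs listed in retraction_notice_PMID.csv
    """
    return [
        pmid
        for pmid in cited_by
        # Rule [1]: drop citing papers that are retraction notices.
        # Rule [2]: drop rows where citing and cited PMID are the same.
        if pmid not in notice_pmids and pmid != paper_pmid
    ]


def is_post_retraction(citing_year, retracted_year):
    """Return 1 if the citation was made after the calendar year of retraction, else 0."""
    return 1 if citing_year > retracted_year else 0
```

Under this definition, a citation published in the same calendar year as the retraction is not counted as post-retraction.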
published: 2021-07-21
This dataset contains 1 CSV file: RozanskyLarsonTaylorMsat.csv which contains microsatellite fragment lengths for Virile and Spothanded Crayfish from the Current River watershed of Missouri, U.S., and complementary data, including assignments to species by phenotype and COI sequence data, GenBank accession numbers for COI sequence data, study sites with dates of collection and geographic coordinates, and Illinois Natural History Survey (INHS) Crustacean Collection lots where specimens are stored.
keywords: invasive species; hybridization; crayfishes; streams; freshwater; Cambaridae; virile crayfish; spothanded crayfish; Missouri; Current River; Ozark National Scenic Riverways
published: 2021-06-25
Data associated with the manuscript "Do rusty crayfish invasions affect water clarity in north temperate lakes?" by Daniel K. Szydlowski, Melissa K. Daniels, and Eric R. Larson
keywords: chlorophyll a; crayfish; Faxonius rusticus; invasive species; lakes; LandSat; remote sensing; rusty crayfish; Secchi disc; water clarity
published: 2021-07-20
This dataset contains data from the extreme-disagreement analysis described in the paper “Aaron M. Cohen, Jodi Schneider, Yuanxi Fu, Marian S. McDonagh, Prerna Das, Arthur W. Holt, Neil R. Smalheiser, 2021, Fifty Ways to Tag your Pubtypes: Multi-Tagger, a Set of Probabilistic Publication Type and Study Design Taggers to Support Biomedical Indexing and Evidence-Based Medicine.” In this analysis, our team experts carried out an independent formal review and consensus process for extreme disagreements between MEDLINE indexing and model predictive scores. “Extreme disagreements” included two situations: (1) an abstract was MEDLINE indexed as a publication type but received low scores for this publication type, and (2) an abstract received high scores for a publication type but lacked the corresponding MEDLINE index term. “High predictive score” is defined as the top 100 high-scoring, and “low predictive score” is defined as the bottom 100 low-scoring. Three publication types were analyzed: CASE_CONTROL_STUDY, COHORT_STUDY, and CROSS_SECTIONAL_STUDY. Results were recorded in three Excel workbooks, named after the publication types: case_control_study.xlsx, cohort_study.xlsx, and cross_sectional_study.xlsx. The analysis shows that, when the tagger gave a high predictive score (>0.9) on articles that lacked a corresponding MEDLINE indexing term, independent review suggested that the model assignment was correct in almost all cases (CROSS_SECTIONAL_STUDY (99%), CASE_CONTROL_STUDY (94.9%), and COHORT_STUDY (92.2%)). Conversely, when articles received MEDLINE indexing but model predictive scores were very low (<0.1), independent review suggested that the model assignment was correct in the majority of cases: CASE_CONTROL_STUDY (85.4%), COHORT_STUDY (76.3%), and CROSS_SECTIONAL_STUDY (53.6%). Based on the extreme disagreement analysis, we identified a number of false-positives (FPs) and false-negatives (FNs). For case control study, there were 5 FPs and 14 FNs. 
For cohort study, there were 7 FPs and 22 FNs. For cross-sectional study, there were 1 FP and 45 FNs. We reviewed and grouped them based on patterns noticed, providing clues for further improving the models. This dataset reports the instances of FPs and FNs along with their categorizations.
keywords: biomedical informatics; machine learning; evidence based medicine; text mining
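The selection of "extreme disagreements" described above can be illustrated with a short Python sketch. It is not the authors' procedure: the record layout (dictionaries with hypothetical "indexed" and "score" keys) and the choice to rank scores separately within the indexed and non-indexed groups are assumptions made for illustration only.

```python
def extreme_disagreements(records, n=100):
    """Pick the two kinds of extreme disagreement for one publication type.

    records -- list of dicts with hypothetical keys:
               "indexed" (bool, MEDLINE-indexed as this publication type)
               "score"   (float, model predictive score)
    n       -- how many records to take from each extreme (100 in the description)
    """
    indexed = [r for r in records if r["indexed"]]
    unindexed = [r for r in records if not r["indexed"]]

    # Situation (1): MEDLINE-indexed but lowest-scoring.
    indexed_low = sorted(indexed, key=lambda r: r["score"])[:n]
    # Situation (2): highest-scoring but lacking the MEDLINE index term.
    unindexed_high = sorted(unindexed, key=lambda r: r["score"], reverse=True)[:n]
    return indexed_low, unindexed_high
```

Each returned list would then go to independent expert review, which is where the FP and FN counts reported in this dataset come from.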
published: 2021-07-15
The dataset contains the high-throughput matrix-assisted laser desorption/ionization mass spectrometry XML files for the atrial gland and red hemiduct of Aplysia californica.
keywords: Dense-core vesicle; High-throughput; Mass Spectrometry; MALDI; Organelle; Image-Guided; Atrial gland; red hemiduct; Lucent Vesicle
published: 2021-07-10
This dataset contains the images of the B73xMS71 RIL population used in QTL linkage mapping for maize epidermal traits in 2016 and 2017. 2016RIL_all_mns.rar and 2017RIL_all_mns.rar: contain raw images produced by Nanofocus μsurf Explorer Optical Topometer (Oberhausen, Germany) at 20X magnification with 0.6 numerical aperture. Files were processed in Nanofocus μsurf analysis extended software (Oberhausen, Germany). 2016RIL_all_TIF.rar and 2017RIL_all_TIF.rar: contain images processed from the Topology layer in each nms file to strengthen the edges of cell outlines, and used in downstream cell detection. 2016RIL_all_detection_result.rar and 2017RIL_all_detection_result.rar: contain images with epidermal cells predicted using the Mask R-CNN model. training data.rar: contains images used for Mask R-CNN model training and validation.
keywords: stomata; Mask R-CNN; cell segmentation; water use efficiency
published: 2021-06-24
This dataset contains EEG and temperature data acquired from inside the bore of an MRI scanner during scanning with two different types of fMRI sequences: single-band and multi-band. The EEG data were acquired from the heads of adult humans undergoing scanning, and can be used to assess differences in EEG data quality due to sequence type. The temperature data were acquired from a watermelon phantom and can be used to assess heating differences due to sequence type.
keywords: Simultaneous EEG-fMRI, Multi-band fMRI, Safety, Heating
published: 2021-06-24
This dataset consists of the secondary ion mass spectrometry (SIMS) depth profiling data that were collected with a Cameca NanoSIMS 50 instrument from a 10 micron by 10 micron region on a Madin-Darby canine kidney (MDCK) cell that had been metabolically labeled so most of its sphingolipids and cholesterol contained the rare nitrogen-15 and oxygen-18 isotopes, respectively.
keywords: secondary ion mass spectrometry; NanoSIMS; depth profiling; MDCK cell; sphingolipids; cholesterol
published: 2021-06-16
Thank you for using these datasets. These RNAsim aligned fragmentary sequences were generated from the query sequences selected by Balaban et al. (2019) in their variable-size datasets (https://doi.org/10.5061/dryad.78nf7dq). They were created for use for phylogenetic placement with the multiple sequence alignments and backbone trees provided by Balaban et al. (2019). The file structures included here also correspond with the data Balaban et al. (2020) provided. This includes: Directories for five varying backbone tree sizes, shown as 5000, 10000, 50000, 100000, and 200000. These directory names are also used by Balaban et al. (2019), and indicate the size of the backbone tree included in their data. Subdirectories for each replicate from the backbone tree size labelled 0 through 4. For the smaller four backbone tree sizes there are five replicates, and for the largest there is one replicate. Each replicate contains 200 text files with one aligned query sequence fragment in fasta format.
keywords: Fragmentary Sequences; RNAsim
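The directory layout described above can be enumerated with a short Python sketch, which may help when checking a download for completeness. The root path and function name are hypothetical, and it is an assumption that the single replicate of the largest backbone size is labelled 0.

```python
from pathlib import PurePosixPath

# Backbone tree sizes named in the dataset description.
BACKBONE_SIZES = [5000, 10000, 50000, 100000, 200000]
FILES_PER_REPLICATE = 200  # one aligned query fragment (FASTA) per text file


def replicate_dirs(root="."):
    """List the expected <size>/<replicate> directories under a root folder."""
    dirs = []
    for size in BACKBONE_SIZES:
        # Five replicates (0-4) for the four smaller sizes, one for the largest.
        # Assumption: that single replicate is labelled 0.
        n_replicates = 1 if size == 200000 else 5
        for rep in range(n_replicates):
            dirs.append(PurePosixPath(root) / str(size) / str(rep))
    return dirs


def expected_file_count():
    """Total number of query files implied by the description."""
    return len(replicate_dirs()) * FILES_PER_REPLICATE
```

With four sizes at five replicates and one size at a single replicate, the description implies 21 replicate directories in total.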
published: 2021-06-17
Model output dataset (6-hourly) from the Weather Research and Forecasting (WRF) model simulations over South America with the added capability of water vapor tracers to track the moisture that originates over the Amazon and the La Plata river basins. The simulations were performed for the period 2003-2013 at 20-km horizontal resolution fully coupled with the Noah-MP land surface model. A limited number of original output variables, sufficient for reproducing the analyses in papers that cite this dataset, is included here. The attached wrfout_southamerica_readme.txt contains detailed information about the file format and variables. For the complete model dataset, contact francina@illinois.edu.
keywords: WRF; Amazon; La Plata; South America; Numerical tracers
published: 2021-06-14
This repository contains the weights for two StyleGAN2 networks trained on two composite T1 and T2 weighted open-source brain MR image datasets, and one StyleGAN2 network trained on the Flickr Face HQ image dataset. Example images sampled from the respective StyleGANs are also included. The datasets themselves are not included in this repository. The weights are stored as `.pkl` files. The code and instructions to load and use the weights can be found at https://github.com/comp-imaging-sci/pic-recon . Additional details and citations can be found in the file "README.md".
keywords: StyleGAN2; Generative adversarial network (GAN); MRI; Medical imaging
published: 2021-06-14
Chronic contact exposure to realistic soil concentrations (0, 7.5, 15, and 100 ppb) of the neonicotinoid pesticide imidacloprid had species- and sex-specific effects on adult bee movement characteristics, but not on adult female bee brain development. This dataset contains two data files. The first contains information about adult bee movement characteristics for female Osmia lignaria and female and male Megachile rotundata over a 10-minute trial (total distance traveled and average movement speed). The second contains information about female Osmia lignaria and Megachile rotundata adult brain morphology. Detected effects included: female Osmia lignaria adults moved faster as they aged in the 0 and 7.5 ppb, but not in the 15 or 100 ppb, groups; young male Megachile rotundata adults moved more quickly (7.5 and 100 ppb) and farther (100 ppb) when treated with imidacloprid compared to the control group (0 ppb); and, while there was no impact of imidacloprid on adult female neuropil:Kenyon cell volume (N:K), N:K decreased with Osmia lignaria adult age and increased with Megachile rotundata adult age.
keywords: neonicotinoid; imidacloprid; bee; movement
published: 2021-05-26
Steady-state and dynamic gas exchange data for maize (B73), sugarcane (CP88-1762) and sorghum (Tx430)
keywords: C4 plants; gas exchange
published: 2021-05-21
Data sets from "Inferring Species Trees from Gene-Family with Duplication and Loss using Multi-Copy Gene-Family Tree Decomposition." It contains trees and sequences simulated with gene duplication and loss under a variety of different conditions. <b>Note:</b> - trees.tar.gz contains the simulated gene-family trees used in our experiments (both true trees from SimPhy as well as trees estimated from alignments). - sequences.tar.gz contains simulated sequence data used for estimating the gene-family trees as well as the concatenation analysis. - biological.tar.gz contains the gene trees used as inputs for the experiments we ran on empirical data sets as well as species trees output by the methods we tested on those data sets. - stats.txt lists statistics (such as AD, MGTE, and average size) for our simulated model conditions.
keywords: gene duplication and loss; species-tree inference; simulated data