published: 2021-02-26
These data were used in the survival and cause-specific mortality analyses of translocated nuisance American black bears in Wisconsin published in Animal Conservation (Bauder, J.M., N.M. Roberts, D. Ruid, B. Kohn, and M.L. Allen. Accepted. Lower survival of nuisance American black bears (Ursus americanus) is not due to translocation. Animal Conservation). Included are CSV files containing each bear's capture history and associated covariates, along with metadata for each CSV file. Also included is an example R script showing how to conduct the analyses (this R script is also included as supporting information with the published paper).
keywords: black bear; survival; translocation; nuisance wildlife management
published: 2021-03-08
These are abundance dynamics data and simulations for the paper "Higher-order interaction between species inhibits bacterial invasion of a phototroph-predator microbial community". In this V2, data were converted in Python in addition to MATLAB, and more information on how to work with the data was added to the Readme.
keywords: Microbial community; Higher order interaction; Invasion; Algae; Bacteria; Ciliate
published: 2021-03-10
The PhytoplasmasRef_Trivellone_etal.fas fasta file contains the original final sequence alignment used in the phylogenetic analyses of Trivellone et al. (Ecology and Evolution, in review). The 27 sequences (21 phytoplasma reference strains and 6 phytoplasma strains from the present study) were aligned using the Muscle algorithm as implemented in MEGA 7.0 with default settings. The final dataset contains 952 positions of the F2n/R2 fragment of the 16S rRNA gene. The data analyses are further described in the cited original paper.
keywords: Hemiptera; Cicadellidae; Mollicutes; Phytoplasma; biorepository
published: 2022-02-10
The compiled datasets include plot-level observations of energy crops (miscanthus and switchgrass) from recent experimental field trials in the US, including dry biomass yield, location, state, region, harvest year, growing season degree days (GDD), winter season heating degree days (HDD), growing season cumulative precipitation, annual nitrogen application rate, age of the plant when harvested, National Commodity Crop Productivity Index (NCCPI) values, and cultivar type (switchgrass) from various published and unpublished sources. The Stata code includes estimation procedures for four different specifications: Model A includes deterministic effect without interaction terms; Model B includes deterministic effect with interaction terms (N², age², N × age, GDD², precip², N × NCCPI); Model C includes deterministic effect with interaction terms, study, and location random effect; Model D includes deterministic effect with interaction terms, harvest year augmented study, and location random effect.
keywords: Age; Miscanthus; Nitrogen; Switchgrass; Yield; Center for Advanced Bioenergy and Bioproducts Innovation
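The higher-order terms listed for Model B can be illustrated with a minimal Python sketch. This is not the dataset's Stata code; the function name, argument names, and dictionary keys are hypothetical, and only the six terms named in the description are computed.

```python
def model_b_terms(n_rate, age, gdd, precip, nccpi):
    """Return the six squared/interaction terms named for Model B.

    Argument names are illustrative assumptions; the dataset's own
    column names may differ.
    """
    return {
        "N^2": n_rate ** 2,          # squared nitrogen application rate
        "age^2": age ** 2,           # squared stand age
        "N x age": n_rate * age,     # nitrogen-by-age interaction
        "GDD^2": gdd ** 2,           # squared growing degree days
        "precip^2": precip ** 2,     # squared cumulative precipitation
        "N x NCCPI": n_rate * nccpi, # nitrogen-by-soil-productivity interaction
    }


terms = model_b_terms(n_rate=100, age=3, gdd=2000, precip=500, nccpi=0.5)
```

Models C and D would add random effects on top of these terms, which requires a mixed-model routine and is not sketched here.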
published: 2021-02-10
This dataset consists of microclimatic temperature and vegetation structure maps at a 3-meter spatial resolution across the Great Smoky Mountains National Park. Included are raster models for sub-canopy, near-surface, minimum and maximum temperature averaged across the study period, season, and month during the growing season months of March through November from 2006-2010. Also available are the topographic and vegetation inputs developed for the microclimate models, including LiDAR-derived vegetation height, LiDAR-derived vegetation structure within four height strata, solar insolation, distance-to-stream, and topographic convergence index (TCI).
keywords: microclimate buffering; forest vegetation structure; temperature; Appalachian Mountains; climate downscaling; understory; LiDAR
published: 2021-03-17
This dataset was developed as part of a study that assessed data reuse. Through bibliometric analysis, corresponding authors of highly cited papers published in 2015 at the University of Illinois at Urbana-Champaign in nine STEM disciplines were identified and then surveyed to determine if data were generated for their article and their knowledge of reuse by other researchers. Second, the corresponding authors who cited those 2015 articles were identified and surveyed to ascertain whether they reused data from the original article and how that data was obtained. The project goal was to better understand data reuse in practice and to explore if research data from an initial publication was reused in subsequent publications.
keywords: data reuse; data sharing; data management; data services; Scopus API
published: 2023-05-02
This dataset includes structural MRI head scans of 32 piglets, at 28 days of age, scanned at the University of Illinois. The dataset also includes manually drawn brain masks of each of the piglets. The dataset also includes brain masks that were generated automatically using Region-Based Convolutional Neural Networks (Mask R-CNN), trained on the manually drawn brain masks.
keywords: Brain extraction; Machine learning; MRI; Piglet; neural networks
published: 2021-10-11
This dataset contains the ClonalKinetic dataset that was used in SimiC and its intermediate results for comparison. A detailed description can be found in the text files 'clonalKinetics_Example_data_description.txt' and 'ClonalKinetics_filtered.DF_data_description.txt'.
The required input data for SimiC contain:
1. ClonalKinetics_filtered.clustAssign.txt => cluster assignment for each cell.
2. ClonalKinetics_filtered.DF.pickle => filtered scRNAseq matrix.
3. ClonalKinetics_filtered.TFs.pickle => list of driver genes.
The results after running SimiC contain:
1. ClonalKinetics_filtered_L10.01_L20.01_Ws.pickle => inferred GRNs for each cluster.
2. ClonalKinetics_filtered_L10.01_L20.01_AUCs.pickle => regulon activity scores for each cell and each driver gene.
NOTE: The “ClonalKinetics_filtered.rds” file mentioned in “ClonalKinetics_filtered.DF_data_description.txt” is an intermediate file; the authors have put all the processed data in the pickle/txt files as described in the filtered data text.
keywords: GRNs;SimiC;RDS;ClonalKinetic
published: 2021-10-10
This data set describes temperature, dissolved oxygen, and Secchi depth in 1-m interval profiles at the deepest point in 10 Illinois reservoirs between the years 1995 and 2016.
keywords: Water temperature; dissolved oxygen; secchi depth; climate change
published: 2022-09-01
These data and code are associated with a study on differences in the rate of hatching failure of eggs across 14 free-living grassland and shrubland birds. We used a device to measure the embryonic heart rate of eggs and found there was variation across species related to factors such as nest type and nest safety. This work is to be published in Ornithology.
keywords: embryonic death; grassland birds; egg mortality; heart rate
published: 2021-08-12
This dataset contains the images of a photoperiod-sensitive sorghum accession population used for a GWAS/TWAS study of leaf traits related to water use efficiency in 2016 and 2017.
Note: new in this second version is that JPG images output from the nms files were added.
Accessions_2016.zip and Accessions_2017.zip: contain raw images produced by Optical Topometer (nms files) for all sorghum accessions. Images can be opened with Nanofocus μsurf analysis extended software (Oberhausen, Germany).
Accessions_2016_jpg.zip and Accessions_2017_jpg.zip: contain jpg images output from the nms files and used in the machine learning phenotyping.
keywords: stomata; segmentation; water use efficiency
published: 2021-08-20
In 2020, early-season extreme precipitation events that caused ponding occurred in central Illinois following the planting of Sorghum bicolor (L.) Moench and Zea mays L. Following the first rainfall event, 50 m transects were established to assess the waterlogging effects on seedling emergence and crop yields. Soil moisture, emergence, stem and tiller count, LAI, and yield were measured at various points in the season along these transects.
keywords: Sorghum; Maize; Emergence; Yield; LAI
published: 2021-05-14
- The aim of this research was to evaluate the novel dietary fiber source, miscanthus grass, in comparison to traditional fiber sources, and their effects on the microbiota of healthy adult cats. Four dietary treatments, cellulose (CO), miscanthus grass fiber (MF), a blend of miscanthus fiber and tomato pomace (MF+TP), or beet pulp (BP) were evaluated.
- The study was conducted using a completely randomized design with twenty-eight neutered adult, domesticated shorthair cats (19 females and 9 males, mean age 2.2 ± 0.03 yr; mean body weight 4.6 ± 0.7 kg, mean body condition score 5.6 ± 0.6). Total DNA from fresh fecal samples was extracted using Mo-Bio PowerSoil kits (MO BIO Laboratories, Inc., Carlsbad, CA). Amplification of the 292 bp-fragment of V4 region from the 16S rRNA gene was completed using a Fluidigm Access Array (Fluidigm Corporation, South San Francisco, CA). Paired-end Illumina sequencing was performed on a MiSeq using v3 reagents (Illumina Inc., San Diego, CA) at the Roy J. Carver Biotechnology Center at the University of Illinois.
- Filenames are composed of animal name identifier, diet (BP= beet pulp; CO= cellulose; MF= miscanthus grass fiber; TP= blend of miscanthus fiber and tomato pomace).
keywords: cats; dietary fiber; fecal microbiota; miscanthus grass; nutrient digestibility; postbiotics
published: 2021-04-22
Author-ity 2018 dataset
Prepared by Vetle Torvik, Apr. 22, 2021
The dataset is based on a snapshot of PubMed taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018). It contains a total of 29.1 million Article records and 114.2 million author name instances. Each instance of an author name is uniquely represented by the PMID and the position on the paper (e.g., 10786286_3 is the third author name on PMID 10786286). Thus, each cluster is represented by a collection of author name instances. The instances were first grouped into "blocks" by last name and first name initial (including some close variants), and then each block was separately subjected to clustering. The resulting clusters are provided in two different formats, the first in a file with only IDs and PMIDs, and the second in a file with cluster summaries:
#################### File 1: au2id2018.tsv ####################
Each line corresponds to an author name instance (PMID and Author name position) with an Author ID. It has the following tab-delimited fields:
1. Author ID
2. PMID
3. Author name position
######################## File 2: authority2018.tsv #########################
Each line corresponds to a predicted author-individual represented by a cluster of author name instances and a summary of all the corresponding papers and author name variants. Each cluster has a unique Author ID (the PMID of the earliest paper in the cluster and the author name position). The summary has the following tab-delimited fields:
1. Author ID (or cluster ID), e.g., 3797874_1 represents a cluster where 3797874_1 is the earliest author name instance.
2. cluster size (number of author name instances on papers)
3. name variants separated by '|' with counts in parentheses. Each variant has the format lastname_firstname middleinitial, suffix
4. last name variants separated by '|'
5. first name variants separated by '|'
6. middle initial variants separated by '|' ('-' if none)
7. suffix variants separated by '|' ('-' if none)
8. email addresses separated by '|' ('-' if none)
9. ORCIDs separated by '|' ('-' if none). From 2019 ORCID Public Data File https://orcid.org/ and from PubMed XML
10. range of years (e.g., 1997-2009)
11. Top 20 most frequent affiliation words (after stoplisting and tokenizing; some phrases are also made) with counts in parentheses; separated by '|'; ('-' if none)
12. Top 20 most frequent MeSH (after stoplisting) with counts in parentheses; separated by '|'; ('-' if none)
13. Journal names with counts in parentheses (separated by '|')
14. Top 20 most frequent title words (after stoplisting and tokenizing) with counts in parentheses; separated by '|'; ('-' if none)
15. Co-author names (lowercased lastname and first/middle initials) with counts in parentheses; separated by '|'; ('-' if none)
16. Author name instances (PMID_auno separated by '|')
17. Grant IDs (after normalization; '-' if none given; separated by '|')
18. Total number of times cited. (Citations are based on references harvested from open sources such as PMC.)
19. h-index
20. Citation counts (e.g., for h-index): PMIDs by the author that have been cited (with total citation counts in parentheses); separated by '|'
keywords: author name disambiguation; PubMed
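The Author-ity identifiers described above (a PMID and an author position joined by an underscore) can be parsed along these lines. This is a minimal Python sketch, not code shipped with the dataset; the names `AuthorInstance`, `parse_instance_id`, `parse_au2id_line`, and `parse_instance_list` are hypothetical, and the field order follows the description of au2id2018.tsv.

```python
from typing import List, NamedTuple, Tuple


class AuthorInstance(NamedTuple):
    pmid: int      # PubMed ID of the paper
    position: int  # 1-based position of the author name on the paper


def parse_instance_id(instance_id: str) -> AuthorInstance:
    """Split an ID such as '10786286_3' into PMID and author position."""
    pmid, position = instance_id.split("_")
    return AuthorInstance(int(pmid), int(position))


def parse_au2id_line(line: str) -> Tuple[str, AuthorInstance]:
    """Parse one tab-delimited au2id2018.tsv line: Author ID, PMID, position."""
    author_id, pmid, position = line.rstrip("\n").split("\t")
    return author_id, AuthorInstance(int(pmid), int(position))


def parse_instance_list(field: str) -> List[AuthorInstance]:
    """Parse field 16 of authority2018.tsv: instance IDs separated by '|'."""
    return [parse_instance_id(item) for item in field.split("|")]


third_author = parse_instance_id("10786286_3")
```

Author IDs are kept as strings because they double as cluster labels; only the PMID and position are converted to integers.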
published: 2021-05-07
The dataset is based on a snapshot of PubMed taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018), and for ORCIDs, primarily, the 2019 ORCID Public Data File https://orcid.org/. Matching an ORCID to an individual author name on a PMID is a non-trivial process. Anyone can create an ORCID and claim to have contributed to any published work. Many records claim too many articles and most claim too few. Even though ORCID records are (most?) often populated by author name searches in popular bibliographic databases, there is no confirmation that the person's name is listed on the article. This dataset is the product of mapping ORCIDs to individual author names on PMIDs, even when the ORCID name does not match any author name on the PMID, and when there are multiple (good) candidate author names. The algorithm avoids assigning the ORCID to an article when there are no good candidates and when there are multiple equally good matches. For some ORCIDs that clearly claim too much, it triggers a very strict matching procedure (for ORCIDs that claim too much but the majority appear correct, e.g., 0000-0002-2788-5457), and sometimes deletes ORCIDs altogether when all (or nearly all) of its claimed PMIDs appear incorrect. When an individual clearly has multiple ORCIDs it deletes the least complete of them (e.g., 0000-0002-1651-2428 vs 0000-0001-6258-4628). It should be noted that the ORCIDs that claim too much are not necessarily due to nefarious or trolling intentions, even though a few appear so. Certainly many are due to laziness, such as claiming everything with a particular last name. Some cases appear to be due to test engineers (e.g., 0000-0001-7243-8157; 0000-0002-1595-6203), or librarians assisting faculty (e.g., 0000-0003-3289-5681), or group/laboratory IDs (0000-0003-4234-1746), or having contributed to an article in capacities other than authorship such as an Investigator, an Editor, or part of a Collective (e.g., 0000-0003-2125-4256 as part of the FlyBase Consortium on PMID 22127867), or as a "Reply To" in which case the identity of the article and authors might be conflated. The NLM has, in the past, limited the total number of authors indexed too. The dataset certainly has errors but I have taken great care to fix some glaring ones (individuals who claim too much), while still capturing authors who have published under multiple names and not explicitly listed them in their ORCID profile. The final dataset provides a "matchscore" that could be used for further clean-up.
Four files:
person.tsv: 7,194,692 rows, including header
1. orcid
2. lastname
3. firstname
4. creditname
5. othernames
6. otherids
7. emails
employment.tsv: 2,884,981 rows, including header
1. orcid
2. putcode
3. role
4. start-date
5. end-date
6. id
7. source
8. dept
9. name
10. city
11. region
12. country
13. affiliation
education.tsv: 3,202,253 rows, including header
1. orcid
2. putcode
3. role
4. start-date
5. end-date
6. id
7. source
8. dept
9. name
10. city
11. region
12. country
13. affiliation
pubmed2orcid.tsv: 13,133,065 rows, including header
1. PMID
2. au_order (author name position on the article)
3. orcid
4. matchscore (see below)
5. source: orcid (2019 ORCID Public Data File https://orcid.org/), pubmed (NLM's distributed XML files), or patci (an earlier version of ORCID with citations processed through the Patci tool)
12,037,375 from orcid; 1,065,892 from PubMed XML; 29,797 from Patci
matchscore:
000: lastname, firstname and middle init match (e.g., Eric T MacKenzie vs …)
00: lastname, firstname match (e.g., Keith Ward)
0: lastname, firstname reversed match (e.g., Conde Santiago vs Santiago Conde)
1: lastname, first and middle init match (e.g., L. F. Panchenko)
11: lastname and partial firstname match (e.g., Mike Boland vs Michael Boland or Mel Ziman vs Melanie Ziman)
12: lastname and first init match
15: 3 part lastname and firstname match (David Grahame Hardie vs D Grahame Hardie)
2: lastname match and multipart firstname initial match (e.g., Maria Dolores Suarez Ortega vs M. D. Suarez)
22: partial lastname match and firstname match (e.g., Erika Friedmann vs Erika Friedman)
23: e.g., Antonio Garcia Garcia vs A G Garcia
25: Allan Downie vs J A Downie
26: Oliver Racz vs Oliver Bacz
27: Rita Ostrovskaya vs R U Ostrovskaia
29: Andrew Staehelin vs L A Staehlin
3: M Tronko vs N D Tron'ko
4: Sharon Dent (Also known as Sharon Y.R. Dent; Sharon Y Roth; Sharon Yoder) vs Sharon Yoder
45: Okulov Aleksei vs A B Okulov
48: Maria Del Rosario Garcia De Vicuna Pinedo vs R Garcia-Vicuna
49: Anatoliy Ivashchenko vs A Ivashenko
5: lastname match only (weak match but sometimes captures alternative first name for better subsequent matches); e.g., Bill Hieb vs W F Hieb
6: first name match only (weak match but sometimes captures alternative first name for better subsequent matches); e.g., Maria Borawska vs Maria Koscielak
7: last or first name match on "other names"; e.g., Hromokovska Tetiana (Also known as Gromokovskaia, T. S., Громоковська Тетяна) vs T Gromokovskaia
77: Siva Subramanian vs Kolinjavadi N. Sivasubramanian
88: no name in orcid but match caught by uniqueness of name across paper (at least 90% and 2 more than next most common name)
prefix:
C: ambiguity reduced (possibly eliminated) using city match (e.g., H Yang on PMID 24972200)
I: ambiguity eliminated by excluding investigators (i.e., one author and one or more investigators with that name)
T: ambiguity eliminated using PubMed pos (T for tie-breaker)
W: ambiguity resolved by authority2018
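Because matchscore codes such as 000, 00, and 0 are distinct, the column has to be handled as text rather than as a number. The Python sketch below shows one way to separate the letter prefix from the code and to flag the two codes the description calls weak (5 and 6). It assumes the prefix letters are written directly in front of the code (for example `C12`), which the description does not state explicitly; the function names are hypothetical.

```python
from typing import Tuple

# Prefix letters and their meanings, as listed in the dataset description.
PREFIX_MEANINGS = {
    "C": "ambiguity reduced using city match",
    "I": "ambiguity eliminated by excluding investigators",
    "T": "ambiguity eliminated using PubMed position (tie-breaker)",
    "W": "ambiguity resolved by authority2018",
}

# Codes described as weak matches: last name only (5), first name only (6).
WEAK_CODES = {"5", "6"}


def split_matchscore(raw: str) -> Tuple[str, str]:
    """Return (prefix, code), keeping the code as a string so '000' != '0'."""
    raw = raw.strip()
    i = 0
    while i < len(raw) and raw[i].isalpha():
        i += 1
    return raw[:i], raw[i:]


def is_weak_match(raw: str) -> bool:
    """True when the code part is one of the weak-match codes."""
    _, code = split_matchscore(raw)
    return code in WEAK_CODES


prefix, code = split_matchscore("C12")
```

When loading pubmed2orcid.tsv with a dataframe library, the same concern applies: the matchscore column should be read as a string type so leading zeros survive.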
published: 2021-05-09
Raw data and its analysis collected from a trial designed to test the impact of providing a Bacillus-based direct-fed microbial (DFM) on the syndrome resulting from orally infecting pigs with either Salmonella enterica serotype Choleraesuis (S. Choleraesuis) alone, or in combination with an intranasal challenge, three days later, with porcine reproductive and respiratory syndrome virus (PRRSV).
keywords: excel file
published: 2021-06-28
This dataset contains 1) the cleaned version of 11 CRW datasets, 2) RNASim10k dataset in high fragmentation and 3) three CRW datasets (16S.3, 16S.T, 16S.B.ALL) in high fragmentation.
keywords: MAGUS;UPP;Multiple Sequence Alignment;PASTA;eHMMs
published: 2021-08-15
This data set contains mass spectrometry data used for the publication "mspack: efficient lossless and lossy mass spectrometry data compression".
keywords: mass-spectrometry data; compression; proteomics
published: 2022-01-31
This dataset contains results from WRF simulations over northern South America. The Orinoco Low-Level Jet (OLLJ) and the Cross-Equatorial Moisture Transport are important circulation structures of the climate of tropical South America. We explore the sensitivity of the OLLJ and cross-equatorial transport to the representation of surface fluxes and turbulence by using two different Land Surface Model (LSM) schemes (Noah and CLM) and three Planetary Boundary Layer (PBL) schemes (YSU, QNSE and MYNN).
keywords: WRF; Orinoco LLJ; precipitation
published: 2021-05-07
- The objective of this study was to evaluate macronutrient apparent total tract digestibility (ATTD), gastrointestinal tolerance, and fermentative end-products in extruded, canine diets.
- Five diets were formulated to be isocaloric and isonitrogenous with either garbanzo beans (GBD), green lentils (GLD), peanut flour (PFD), dried yeast (DYD), or poultry by-product meal (CON) as the primary protein sources. Ten adult, intact, female beagles (mean age: 4.2 ± 1.1 yr, mean weight: 11.9 ± 1.3 kg) were used in a replicated, 5x5 Latin square design with 14 d periods. Total DNA from fresh fecal samples was extracted using Mo-Bio PowerSoil kits (MO BIO Laboratories, Inc., Carlsbad, CA). Amplification of the 292 bp-fragment of V4 region from the 16S rRNA gene was completed using a Fluidigm Access Array (Fluidigm Corporation, South San Francisco, CA). Paired-end Illumina sequencing was performed on a MiSeq using v3 reagents (Illumina Inc., San Diego, CA) at the Roy J. Carver Biotechnology Center at the University of Illinois.
- Filenames are composed of animal name identifier, diet (CON=control; DY= dried yeast; GB= garbanzo beans; GL= green lentils; PF= peanut flour) and period replicate number (P1, P2, P3, P4, and P5).
keywords: Dog; Digestibility; Legume; Microbiota; Pulse; Yeast
published: 2021-05-10
This dataset contains data used in the publication "Institutional Data Repository Development, a Moving Target" submitted to Code4Lib Journal. It is a tabular data file describing attributes of data files in datasets published in Illinois Data Bank from 2016-04-01 to 2021-04-01.
keywords: institutional repository
published: 2021-07-20
This dataset contains data from the extreme-disagreement analysis described in the paper “Aaron M. Cohen, Jodi Schneider, Yuanxi Fu, Marian S. McDonagh, Prerna Das, Arthur W. Holt, Neil R. Smalheiser, 2021, Fifty Ways to Tag your Pubtypes: Multi-Tagger, a Set of Probabilistic Publication Type and Study Design Taggers to Support Biomedical Indexing and Evidence-Based Medicine.” In this analysis, our team experts carried out an independent formal review and consensus process for extreme disagreements between MEDLINE indexing and model predictive scores. “Extreme disagreements” included two situations: (1) an abstract was MEDLINE indexed as a publication type but received low scores for this publication type, and (2) an abstract received high scores for a publication type but lacked the corresponding MEDLINE index term. “High predictive score” is defined as the top 100 high-scoring, and “low predictive score” is defined as the bottom 100 low-scoring. Three publication types were analyzed: CASE_CONTROL_STUDY, COHORT_STUDY, and CROSS_SECTIONAL_STUDY. Results were recorded in three Excel workbooks, named after the publication types: case_control_study.xlsx, cohort_study.xlsx, and cross_sectional_study.xlsx. The analysis shows that, when the tagger gave a high predictive score (>0.9) on articles that lacked a corresponding MEDLINE indexing term, independent review suggested that the model assignment was correct in almost all cases (CROSS_SECTIONAL_STUDY (99%), CASE_CONTROL_STUDY (94.9%), and COHORT_STUDY (92.2%)). Conversely, when articles received MEDLINE indexing but model predictive scores were very low (<0.1), independent review suggested that the model assignment was correct in the majority of cases: CASE_CONTROL_STUDY (85.4%), COHORT_STUDY (76.3%), and CROSS_SECTIONAL_STUDY (53.6%). Based on the extreme disagreement analysis, we identified a number of false-positives (FPs) and false-negatives (FNs). For case control study, there were 5 FPs and 14 FNs. For cohort study, there were 7 FPs and 22 FNs. For cross-sectional study, there were 1 FP and 45 FNs. We reviewed and grouped them based on patterns noticed, providing clues for further improving the models. This dataset reports the instances of FPs and FNs along with their categorizations.
keywords: biomedical informatics; machine learning; evidence based medicine; text mining
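The two kinds of extreme disagreement described above can be expressed as a small classification rule. The Python sketch below is a simplification: it uses the fixed score cut-offs mentioned in the description (>0.9 and <0.1) rather than the top-100 and bottom-100 ranking the study used to select abstracts, and the function name and return labels are hypothetical.

```python
from typing import Optional


def classify_disagreement(
    score: float,
    medline_indexed: bool,
    high: float = 0.9,  # cut-off for a "high" predictive score (from the description)
    low: float = 0.1,   # cut-off for a "low" predictive score (from the description)
) -> Optional[str]:
    """Label an abstract as one of the two extreme-disagreement situations.

    Returns None when the model score and MEDLINE indexing do not
    disagree in either of the two extreme ways.
    """
    if medline_indexed and score < low:
        # Situation (1): indexed in MEDLINE, but the model score is very low.
        return "indexed_but_low_score"
    if not medline_indexed and score > high:
        # Situation (2): high model score, but no MEDLINE index term.
        return "high_score_but_not_indexed"
    return None


label = classify_disagreement(0.95, medline_indexed=False)
```

Each flagged abstract would then go to the independent review step, which is a manual process and is not modeled here.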
published: 2022-01-27
Twenty-two genotypes of C4 species grown under ambient and elevated O3 concentration were studied at the SoyFACE (40°02’N, 88°14’W) in 2019. This dataset contains leaf morphology, photosynthesis and nutrient contents measured at three time points. The results of CO2 response curves are also included.
keywords: C4, O3, photosynthesis
published: 2023-02-07
This dataset includes supporting data for our article 'Assessing long-term impacts of cover crops on soil organic carbon in the central U.S. Midwestern agroecosystems'. The dataset contains carbon fluxes and SOC benefits from cover crops at six cover crop experiment sites in Illinois with three rotation systems: (1) without-cover-crop (maize-soybean rotations), (2) non-legume-preceding-maize (maize-annual ryegrass-soybean-annual ryegrass rotations), and (3) legume-preceding-maize (maize-cereal rye-soybean-hairy vetch rotations). NOTE: there should be 13 files + 1 readme file, instead of 15 files as mentioned in readme.
keywords: Soil organic carbon; cover crops
published: 2023-05-02
Tab-separated value (TSV) file. 14745 data rows. Each data row represents publication metadata as retrieved from Crossref (http://crossref.org) 2023-04-05 when searching for retracted publications. Each row has the following columns:
Index - Our index, starting with 0.
DOI - Digital Object Identifier (DOI) for the publication.
Year - Publication year associated with the DOI.
URL - Web location associated with the DOI.
Title - Title associated with the DOI. May be blank.
Author - Author(s) associated with the DOI.
Journal - Publication venue (journal, conference, ...) associated with the DOI.
RetractionYear - Retraction Year associated with the DOI. May be blank.
Category - One or more categories associated with the DOI. May be blank.
Our search was via the Crossref REST API and searched for: Update_type=( 'retraction', 'Retraction', 'retracion', 'retration', 'partial_retraction', 'withdrawal','removal')
keywords: retraction; metadata; Crossref; RISRS
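A file with the columns listed above could be read with Python's standard `csv` module, for instance to compute how many years passed between publication and retraction. This is a minimal sketch: it assumes the file's first row is a header using exactly the column names from the description, the function name is hypothetical, and the two sample rows use made-up placeholder DOIs rather than real records.

```python
import csv
import io
from typing import Dict


def retraction_lags(tsv_text: str) -> Dict[str, int]:
    """Map each DOI to (RetractionYear - Year), skipping rows with blanks."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    lags = {}
    for row in reader:
        year = (row.get("Year") or "").strip()
        retracted = (row.get("RetractionYear") or "").strip()
        # RetractionYear may be blank per the description; skip those rows.
        if year.isdigit() and retracted.isdigit():
            lags[row["DOI"]] = int(retracted) - int(year)
    return lags


# Placeholder rows for illustration only (not taken from the dataset).
SAMPLE = (
    "Index\tDOI\tYear\tURL\tTitle\tAuthor\tJournal\tRetractionYear\tCategory\n"
    "0\t10.0000/example.1\t2015\t\t\t\t\t2018\t\n"
    "1\t10.0000/example.2\t2020\t\t\t\t\t\t\n"
)

lags = retraction_lags(SAMPLE)
```

For the real file, the text would come from opening it with `encoding="utf-8"` and `newline=""`, as the `csv` documentation recommends.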