published: 2022-09-07
The availability of economically marginal land for energy crops is identified using the Cropland Data Layer and other soil, wind, and climate data resources. All data are provided at a 30 m spatial resolution across the continental United States.
keywords: marginal land; biofuel production; remote sensing; land use change; Cropland Data Layer
published: 2022-09-01
These data and code are associated with a study on differences in the rate of hatching failure of eggs across 14 free-living grassland and shrubland birds. We used a device to measure the embryonic heart rate of eggs and found there was variation across species related to factors such as nest type and nest safety. This work is to be published in Ornithology.
keywords: embryonic death; grassland birds; egg mortality; heart rate
published: 2022-08-31
This dataset includes data on soil properties, soil N pools, and soil N fluxes presented in the manuscript, "Refining the role of nitrogen mineralization in mycorrhizal nutrient syndromes". Please refer to that publication for details about methodologies used to generate these data and for the experimental design. For this version 2, we added specific gross nitrogen mineralization rates (ugN/gOM/d), microbial biomass carbon (ugC/gdw), microbial biomass nitrogen (ugN/gdw) and microbial biomass C:N ratios to the data set. Additionally, we updated values for gross nitrogen mineralization, microbial NO3 assimilation and microbial NH4 assimilation to reflect slight changes in data processing. Those changes are reflected in "220829_All data_repository.csv". "220829_nitrogen_mineralization_readme.txt" is the updated readme for the new file. The other two files, whose names begin with “220426_”, are the older versions and are the same as in V1.
keywords: Nitrogen cycling; Ectomycorrhizal fungi; Arbuscular mycorrhizal fungi; Nitrogen fertilization; Gross mineralization
published: 2022-08-31
These datasets are for the four-dimensional scanning transmission electron microscopy (4D-STEM) and electron energy loss spectroscopy (EELS) experiments for cathode nanoparticles at different cutoff voltages and in different electrolytes. The raw 4D-STEM experiment datasets were collected by TEM image & analysis software (FEI) and were saved as SER files. The raw 4D-STEM datasets of SER files can be opened and viewed in MATLAB using our imToolBox analysis software package, available at <a href="https://github.com/flysteven/imToolBox">https://github.com/flysteven/imToolBox</a>. The raw EELS datasets were collected by DigitalMicrograph software and were saved as DM4 files. The raw EELS datasets can be opened and viewed in DigitalMicrograph software or using our analysis codes available at <a href="https://github.com/chenlabUIUC/OrientedPhaseDomain">https://github.com/chenlabUIUC/OrientedPhaseDomain</a>. All the datasets are from the work "Formation and impact of nanoscopic oriented phase domains in electrochemical crystalline electrodes" (2022).
The 4D-STEM experiment data include four example datasets for cathode nanoparticles collected at different cutoff voltages and in different electrolytes as described below. Each dataset contains a stack of diffraction patterns collected at different probe positions scanned across the cathode nanoparticle.
1. Pristine cathode particle: "Pristine particle 4D-STEM.ser"
2. Cathode particle at the cutoff voltage of 0.09V during discharge at C/10 in the aqueous electrolyte: "Intermediate cutoff0_09V discharge (aqueous) 4D-STEM.ser"
3. Fully discharged cathode particle at C/10 in the aqueous electrolyte: "Fully discharged particle 4D-STEM.ser"
4. Fully discharged cathode particle at C/10 in the dry organic electrolyte: "Fully discharge particle (dry organic electrolyte).ser"
The EELS experiment data include three example datasets for cathode nanoparticles collected at different cutoff voltages during discharge in the aqueous electrolyte (in "EELS datasets.zip") as described below. Each EELS dataset contains the zero-loss and core-loss EELS spectra collected at different probe positions scanned across the cathode nanoparticle.
1. Pristine cathode particle: "Pristine particle EELS.zip"
2. Cathode particle at the cutoff voltage of 0.09V during discharge at C/10 in the aqueous electrolyte: "intermediate discharge (aqueous) EELS.zip"
3. Fully discharged cathode particle at C/10 in the aqueous electrolyte: "fully discharge (aqueous) EELS.zip"
The details of the software package and codes that can be used to analyze the 4D-STEM datasets and EELS datasets are available at: https://github.com/chenlabUIUC/OrientedPhaseDomain. Once our paper is formally published, we will update the relationship of these datasets with our paper.
keywords: 4D-STEM; microstructure; phase transformation; strain; cathode; nanoparticle; energy storage
published: 2022-08-29
Example scripts and configuration files needed to perform select simulations described in the manuscript "Percolation transition prescribes protein size-specific barrier to passive transport through the nuclear pore complex."
keywords: Nuclear Pore Complex; simulation setup
published: 2022-09-14
Datasets that accompany Beilke and O'Keefe 2022 publication (Title: Bats reduce insect density and defoliation in temperate forests: an exclusion experiment; Journal: Ecology).
keywords: bats; defoliation; ecosystem services; forests; insectivory; insects; trophic cascades
published: 2022-08-25
Data in this publication were used to analyze the factors that influence the abundance of eastern whip-poor-wills in the Midwest and to describe the diet of this species. These data were collected in Illinois in 2019 and 2020. Procedures were approved by the Illinois Institutional Animal Care and Use Committee (IACUC), protocol no. 19006.
keywords: eastern whip-poor-will; Antrostomus vociferus; abundance; moths; nightjars; Lepidoptera; metabarcoding
published: 2022-08-23
This dataset contains soil chemical properties used to analyze variation in soil fungal communities beneath Oreomunnea mexicana trees in the manuscript "Watershed-scale variation in potential fungal community contributions to ectomycorrhizal biogeochemical syndromes".
keywords: Acid-base chemistry; Ectomycorrhizal fungi; Exploration type; Nitrogen cycling; Nitrogen isotopes; Plant-soil (below-ground) interactions; Saprotrophic fungi; Tropical forest
published: 2021-02-23
Coups d'état are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup D'état Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e. realized or successful coups, unrealized coup attempts, or thwarted conspiracies), the type of actor(s) who initiated the coup (i.e. military, rebels, etc.), as well as the fate of the deposed leader. This is version 2.0.0 of this dataset. The first version, <a href="https://clinecenter.illinois.edu/project/research-themes/democracy-and-development/coup-detat-project">v.1.0.0</a>, was released in 2013. Since then, the Cline Center has taken several steps to improve on the previously-released data. These changes include: <ol> <li>Filling in missing event data values</li> <li>Removing events with no identifiable dates</li> <li>Reconciling event dates from sources that have conflicting information</li> <li>Removing events with insufficient sourcing (each event now has at least two sources)</li> <li>Removing events that were inaccurately coded and did not meet our definition of a coup event</li> <li>Extending the time period covered from 1945-2005 to 1945-2019</li> <li>Removing certain variables that fell below the threshold of inter-coder reliability required by the project</li> <li>The spreadsheet ‘CoupInventory.xls’ was removed because of inadequate attribution and citation in the event summaries</li></ol> <b>Items in this Dataset</b> 1. 
<i>CDP v.2.0.2 Codebook.pdf</i> <ul><li>This 14-page document provides a description of the Cline Center Coup D’état Project Dataset. The first section of this codebook provides a succinct definition of a coup d’état used by the project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The second section describes the methodology used to produce the data. <i>Created November 2020. Revised February 2021 to add some additional information about how the Cline Center edited some values in the COW country codes.</i> </li></ul> 2. <i>Coup_Data_v2.0.0.csv</i> <ul><li>This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup D’état Project. It contains 29 variables and 943 observations. <i>Created November 2020</i></li></ul> 3. <i>Source Document v2.0.0.pdf</i> <ul><li>This 305-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify each particular event. <i>Created November 2020</i> </li></ul> 4. <i>README.md</i> <ul><li>This file contains useful information for the user about the dataset. It is a text file written in Markdown. <i>Created November 2020</i> </li></ul> <br> <b> Citation Guidelines</b> 1) To cite this codebook please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, and Jonathan Bonaguro. 2021. “Cline Center Coup D’état Project Dataset Codebook”. Cline Center Coup D’état Project Dataset. Cline Center for Advanced Social Research. V.2.0.2. February 23. University of Illinois Urbana-Champaign. doi: <a href="https://doi.org/10.13012/B2IDB-9651987_V2">10.13012/B2IDB-9651987_V3</a> 2) To cite the data please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, and Jonathan Bonaguro. 2020. 
Cline Center Coup D’état Project Dataset. Cline Center for Advanced Social Research. V.2.0.0. November 16. University of Illinois Urbana-Champaign. doi: <a href="https://doi.org/10.13012/B2IDB-9651987_V2">10.13012/B2IDB-9651987_V3</a>
keywords: Coup d'état; event data; Cline Center; Cline Center for Advanced Social Research; political science
published: 2022-09-29
Dataset associated with Merrill et al. ECE-2021-05-00793.R1 submission: Early life patterns of growth are linked to levels of phenotypic trait covariance and post-fledging mortality across avian species. Excel CSV files with all of the data used in analyses and file with descriptions of each column.
keywords: canalization; developmental flexibility; early-life stress; nest predation; phenotypic correlation; trait covariance
published: 2022-07-22
Data in this publication were used to examine the effects of environmental and temporal covariates on detection probability, and the effects of habitat and landscape-level covariates on occupancy and within-season turnover of Black-billed Cuckoos and Yellow-billed Cuckoos. Data were collected in 2019 and 2020 in northern Illinois, USA. Procedures were approved by the Illinois Institutional Animal Care and Use Committee (IACUC), protocol no. 19086.
keywords: Black-billed Cuckoo; call broadcast; Coccyzus americanus; Coccyzus erythropthalmus; detection probability; occupancy dynamics; rare and secretive species; Yellow-billed Cuckoo
published: 2022-08-05
Simulated sequences provide a way to evaluate multiple sequence alignment (MSA) methods where the ground truth is exactly known. However, the realism of such simulated conditions often comes under question compared to empirical datasets. In particular, simulated data often does not display heterogeneity in the sequence lengths, a common feature in biological datasets. In order to imitate sequence length heterogeneity, we here present a set of data that are evolved under a mixture model of indel lengths, where indels have an occasional chance of being promoted to long indels (emulating large insertion/deletion events, e.g., domain-level gain/loss). This dataset is otherwise (e.g., in GTR parameters) analogous to the 1000M condition as presented in the SATe paper (doi: 10.1126/science.1171243) but with 5000 sequences and simulated with INDELible (http://abacus.gene.ucl.ac.uk/software/indelible/). For more information, see README.txt. For the INDELible control files, see https://github.com/ThisBioLife/5000M-234-het.
keywords: simulated data; sequence length heterogeneity; multiple sequence alignment
published: 2022-08-08
This upload contains all datasets used in Experiments 2 and 3 of the SALMA paper (pending submission): Shen, Chengze, Baqiao Liu, Kelly P. Williams, and Tandy Warnow. "SALMA: Scalable ALignment using MAFFT-Add". The zip file has the following structure (presented as an example):
salma_paper_datasets/
|_README.md
|_10aa/
|_crw/
|_homfam/
   |_aat/
   |  |_...
   |_...
|_het/
   |_5000M2-het/
   |  |_...
   |_5000M3-het/
   ...
|_rec_res/
Generally, the structure can be viewed as: [category]/[dataset]/[replicate]/[alignment files]
# Categories:
1. 10aa: There are 10 small biological protein datasets within the `10aa` directory, each with just one replicate.
2. crw: There are 5 selected CRW datasets, namely 5S.3, 5S.E, 5S.T, 16S.3, and 16S.T, each with one replicate. These are the cleaned versions from Shen et al. 2022 (MAGUS+eHMM).
3. homfam: There are the 10 largest Homfam datasets, each with one replicate.
4. het: There are three newly simulated nucleotide datasets from this study, 5000M2-het, 5000M3-het, and 5000M4-het, each with 10 replicates.
5. rec_res: It contains the Rec and Res datasets. Detailed dataset generation can be found in the supplementary materials of the paper.
# Alignment files
There are at most 6 `.fasta` files in each sub-directory:
1. `all.unaln.fasta`: All unaligned sequences.
2. `all.aln.fasta`: Reference alignments of all sequences. If not all sequences have reference alignments, only the sequences that have them will be included.
3. `all-queries.unaln.fasta`: All unaligned query sequences. Query sequences are sequences that do not have lengths within 25% of the median length (i.e., not full-length sequences).
4. `all-queries.aln.fasta`: Reference alignments of query sequences. If not all queries have reference alignments, only the sequences that have them will be included.
5. `backbone.unaln.fasta`: All unaligned backbone sequences. Backbone sequences are sequences that have lengths within 25% of the median length (i.e., full-length sequences).
6. `backbone.aln.fasta`: Reference alignments of backbone sequences. If not all backbone sequences have reference alignments, only the sequences that have them will be included.
>If all sequences are full-length sequences, then `all-queries.unaln.fasta` will be missing.
>If fewer than two query sequences have reference alignments, then `all-queries.aln.fasta` will be missing.
>If fewer than two backbone sequences have reference alignments, then `backbone.aln.fasta` will be missing.
# Additional file(s)
1. `350378genomes.txt`: the file contains all 350,378 bacterial and archaeal genome names that were used by Prodigal (Hyatt et al. 2010) to search for protein sequences.
keywords: SALMA;MAFFT;alignment;eHMM;sequence length heterogeneity
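The backbone/query rule in the description above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `split_backbone_queries`, the toy sequences, and the reading of "within 25% of the median length" as the inclusive range from 0.75 to 1.25 times the median are all assumptions.

```python
from statistics import median


def split_backbone_queries(seqs, tolerance=0.25):
    """Split {name: sequence} into (backbone, queries) by length.

    Assumption: "within 25% of the median length" means
    (1 - tolerance) * median <= length <= (1 + tolerance) * median.
    """
    med = median(len(s) for s in seqs.values())
    lo, hi = (1 - tolerance) * med, (1 + tolerance) * med
    backbone = {n: s for n, s in seqs.items() if lo <= len(s) <= hi}
    queries = {n: s for n, s in seqs.items() if n not in backbone}
    return backbone, queries


# Toy input (hypothetical names and lengths: 100, 104, 96, 20, 240).
toy = {
    "a": "ACGT" * 25,
    "b": "ACGT" * 26,
    "c": "ACGT" * 24,
    "d": "ACGT" * 5,
    "e": "ACGT" * 60,
}
backbone, queries = split_backbone_queries(toy)
print(sorted(backbone), sorted(queries))
```

With a median length of 100, sequences of length 96 to 104 land in the backbone, while the much shorter and much longer sequences are treated as queries.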
published: 2022-08-22
This dataset contains Raman spectra, each acquired from an individual, living, primary murine cell belonging to one of the six most immature hematopoietic cell populations found in the body: hematopoietic stem cell (HSC), multipotent progenitor 1 (MPP1), multipotent progenitor 2 (MPP2), multipotent progenitor 3 (MPP3), common lymphoid progenitor (CLP), and common myeloid progenitor (CMP). These spectra are useful for identifying spectral signatures that are characteristic of each hematopoietic stem or early progenitor cell population. *NOTE: The __MACOSX folder and the files whose names start with “._[file name]” found in "Raman spectra of single cells text files.zip" were created by the computer's operating system in an unreadable format; they are not part of the data and can be removed or ignored when using the data.
keywords: Raman spectroscopy; single-cell spectrum; hematopoietic cell; hematopoietic stem cell; multipotent progenitor cell; common myeloid progenitor; common lymphoid progenitor
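One way to ignore the operating-system files mentioned in the note above is to filter archive members by name before reading them. The sketch below uses Python's standard `zipfile` module; the helper name `data_members` and the member names in the in-memory demo archive are hypothetical and do not reflect the real contents of the zip file.

```python
import io
import zipfile


def data_members(zf):
    """Return member names, skipping macOS metadata and directory entries."""
    keep = []
    for name in zf.namelist():
        parts = name.split("/")
        if "__MACOSX" in parts:          # resource-fork folder added by macOS
            continue
        if parts[-1].startswith("._"):   # AppleDouble sidecar files
            continue
        if name.endswith("/"):           # directory entries carry no data
            continue
        keep.append(name)
    return keep


# Build a tiny in-memory archive (hypothetical names) to demonstrate.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("spectra/HSC_cell01.txt", "500 0.1\n")
    zf.writestr("__MACOSX/spectra/._HSC_cell01.txt", "junk")
    zf.writestr("spectra/._HSC_cell01.txt", "junk")

with zipfile.ZipFile(buf) as zf:
    kept = data_members(zf)
print(kept)
```

For the real archive, the same function could be applied to `zipfile.ZipFile("Raman spectra of single cells text files.zip")`.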
published: 2022-08-20
Dataset associated with Jones and Ward BEAS-D-21-00106R2 submission: Parasitic cowbird development up to fledging and subsequent post-fledging survival reflect life history variation found across host species. Excel CSV files and .inp file with data used in nest survival and Brown-headed Cowbird post-fledging analyses and file with descriptions of each column. The CSV file is set up for logistic exposure models in SAS or R and the .inp file is set up to be uploaded into program MARK for multi-state recaptures-only analysis. Species included in the analyses: American Robin, Blue Grosbeak, Brown Thrasher, Blue-winged Warbler, Carolina Chickadee, Chipping Sparrow, Common Yellowthroat, Dickcissel, Eastern Bluebird, Eastern Phoebe, Eastern Towhee, Field Sparrow, Gray Catbird, House Wren, Indigo Bunting, Northern Cardinal, Red-winged Blackbird, Tree Swallow, Yellow-breasted Chat, and Yellow Warbler.
keywords: brood parasitism; cowbird; carryover effects; phenotypic plasticity; post-fledging; songbirds
published: 2022-01-10
The Cline Center Global News Index is a searchable database of textual features extracted from millions of news stories, specifically designed to provide comprehensive coverage of events around the world. In addition to searching documents for keywords, users can query metadata and features such as named entities extracted using Natural Language Processing (NLP) methods and variables that measure sentiment and emotional valence. Archer is a web application purpose-built by the Cline Center to enable researchers to access data from the Global News Index. Archer provides a user-friendly interface for querying the Global News Index (with the back-end indexing still handled by Solr). By default, queries are built using icons and drop-down menus. More technically-savvy users can use Lucene/Solr query syntax via a ‘raw query’ option. Archer allows users to save and iterate on their queries, and to visualize faceted query results, which can be helpful for users as they refine their queries. <b>Additional Resources:</b> - Access to Archer and the Global News Index is limited to account-holders. If you are interested in signing up for an account, please fill out the <a href="https://docs.google.com/forms/d/e/1FAIpQLSf-J937V6I4sMSxQt7gR3SIbUASR26KXxqSurrkBvlF-CIQnQ/viewform?usp=pp_url"><b>Archer Access Request Form</b></a> so we can determine if you are eligible for access or not. - Current users who would like to provide feedback, such as reporting a bug or requesting a feature, can fill out the <a href="https://forms.gle/6eA2yJUGFMtj5swY7"><b>Archer User Feedback Form</b></a>. - The Cline Center sends out periodic email newsletters to the Archer Users Group. Please fill out this <a href="https://groups.webservices.illinois.edu/subscribe/123172"><b>form</b></a> to subscribe to it. 
<b>Citation Guidelines:</b> 1) To cite the GNI codebook (or any other documentation associated with the Global News Index and Archer) please use the following citation: Cline Center for Advanced Social Research. 2022. Global News Index and Extracted Features Repository [codebook], v1.1.0. Champaign, IL: University of Illinois. Dec. 16. doi:10.13012/B2IDB-5649852_V3 2) To cite data from the Global News Index (accessed via Archer or otherwise) please use the following citation (filling in the correct date of access): Cline Center for Advanced Social Research. 2022. Global News Index and Extracted Features Repository [database], v1.1.0. Champaign, IL: University of Illinois. Dec. 16. Accessed Month, DD, YYYY. doi:10.13012/B2IDB-5649852_V3
keywords: Cline Center; Cline Center for Advanced Social Research; political; social; political science; Global News Index; Archer; news; mass communication; journalism
published: 2022-03-25
This upload includes the 16S.B.ALL dataset in the 100-HF condition (referred to as 16S.B.ALL-100-HF) used in Experiment 3 of the WITCH paper (currently accepted in principle by the Journal of Computational Biology). The 100-HF condition refers to making sequences fragmentary with an average length of 100 bp and a standard deviation of 60 bp. Additionally, we required all fragmentary sequences to have lengths > 50 bp. Thus, the final average length of the fragments is slightly higher than 100 bp (~120 bp). In this case (i.e., 16S.B.ALL-100-HF), 1,000 sequences with lengths within 25% of the median length are retained as "backbone sequences", while the remaining sequences are considered "query sequences" and made fragmentary using the "100-HF" procedure. Backbone sequences are aligned using MAGUS (or we extract their reference alignment). Then, the fragmentary versions of the query sequences are added back to the backbone alignment using either MAGUS+UPP or WITCH. More details of the tar.gz file are described in README.txt.
keywords: MAGUS;UPP;Multiple Sequence Alignment;eHMMs
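The "~120 bp" figure in the description above can be sanity-checked with a short calculation. The description does not state how fragment lengths are drawn, so this sketch assumes they come from a normal distribution (mean 100, standard deviation 60) and are re-drawn whenever they fall at or below 50 bp; it also ignores the upper cap imposed by the full sequence length. Under those assumptions the expected length is the mean of a lower-truncated normal distribution.

```python
import math


def truncated_normal_mean(mu, sigma, lower):
    """Mean of a Normal(mu, sigma) variable conditioned on being > lower."""
    alpha = (lower - mu) / sigma
    # Standard normal density and cumulative distribution at alpha.
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(alpha / math.sqrt(2.0)))
    return mu + sigma * pdf / (1.0 - cdf)


# Assumed 100-HF parameters: mean 100 bp, sd 60 bp, lengths must exceed 50 bp.
expected_len = truncated_normal_mean(mu=100.0, sigma=60.0, lower=50.0)
print(round(expected_len, 1))
```

The result comes out at roughly 121 bp, which is in line with the stated average of about 120 bp.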
published: 2022-08-06
This dataset consists of all the files and codes that are part of the manuscript (main text and supplement) titled "Spin-selective tunneling from nanowires of the candidate topological Kondo insulator SmB6". For detailed information on the individual files refer to the specific readme files.
keywords: Topology; Kondo Insulator; Spin; Scanning tunneling microscopy; antiferromagnetism
published: 2022-08-06
An online knowledge, attitudes, and practices survey on ticks and tick-borne diseases (TBD) was distributed to medical professionals in Illinois from summer 2020 to fall 2021. These are the raw data associated with that survey and the survey questions used. Age, gender, and county of practice have been removed to protect respondent identity. We have added calculated values (columns 165 to end), including: the tick knowledge score, TBD knowledge score, and total knowledge score, each of which is the number of correct answers in that category; score percent, which is the proportion of correct answers in each category; region, which is determined from the county of practice; TBD relevant practice, which separates the practice variable into TBD primary, secondary, and non-responders; and several variables which group categories.
keywords: ticks; medicine; tick-borne disease; survey
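The calculated score columns described above follow simple arithmetic, sketched here in Python. The function name, the category labels, and the example answers are hypothetical; whether the dataset stores "score percent" as a fraction or as a percentage is not stated, so a fraction is assumed.

```python
def knowledge_scores(correct_by_category):
    """Compute score (count correct) and percent (fraction correct).

    correct_by_category maps a category name to a list of booleans,
    one per question, where True marks a correct answer.
    """
    out = {}
    for category, marks in correct_by_category.items():
        score = sum(marks)
        out[category] = {"score": score, "percent": score / len(marks)}
    # The total score pools the questions from every category.
    all_marks = [m for marks in correct_by_category.values() for m in marks]
    out["total"] = {
        "score": sum(all_marks),
        "percent": sum(all_marks) / len(all_marks),
    }
    return out


# Hypothetical respondent: 4 tick questions and 6 TBD questions.
example = knowledge_scores({
    "tick": [True, False, True, True],
    "TBD": [True, True, False, False, False, True],
})
print(example)
```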
published: 2022-08-05
This data set documents bat activity (counts per detector-night per phonic group) and bat diversity (number of bat species per detector-night) in relation to distance to the nearest forested corridor in a landscape dominated by row crop agriculture and in relation to relative crop pest abundance. This data set was used to assess whether bats were homogeneously distributed over a near-uninterrupted agricultural landscape and to assess the importance of forested corridors and the presence of pest species on their distribution across the landscape. Data were collected with 50 AudioMoth bat detectors along 10 transects, with each transect having 5 detectors. The transects started at a forest corridor and extended out for 4 km into uninterrupted row crop agriculture. Pest abundance was extrapolated from data collected in the same county during the same time as the study. Potentially important weather covariates were extracted from the nearest operational weather station.
keywords: bats; bat activity; biodiversity; agricultural pest
published: 2022-08-01
Datasets that accompany Shearer and Beilke 2022 publication (Title: Playing it by ear: gregarious sparrows recognize and respond to isolated wingbeat sounds and predator-based cues; Journal: Animal Cognition).
keywords: Vigilance; auditory detection; predator detection; predator-prey interaction; antipredator behavior
published: 2022-07-25
Related to the raw entity mentions, this dataset represents the effects of the data cleaning process and collates all of the entity mentions which were too ambiguous to successfully link to the NCBI's taxonomy identifier system.
keywords: synthetic biology; NERC data; species mentions; ambiguous entities
published: 2022-07-25
A set of species entity mentions derived from an NERC dataset analyzing 900 synthetic biology articles published by the ACS. This data is associated with the Synthetic Biology Knowledge System repository (https://web.synbioks.org/). The data in this dataset are raw mentions from the NERC data.
keywords: synthetic biology; NERC data; species mentions
published: 2022-07-25
This dataset represents the results of manual cleaning and annotation of the entity mentions contained in the raw dataset (https://doi.org/10.13012/B2IDB-4950847_V1). Each mention has been consolidated and linked to an identifier for a matching concept from the NCBI's taxonomy database.
keywords: synthetic biology; NERC data; species mentions; cleaned data; NCBI TaxonID
published: 2022-07-25
This dataset is derived from the raw dataset (https://doi.org/10.13012/B2IDB-4950847_V1) and collects entity mentions that were manually determined to be noisy, non-species entities.
keywords: synthetic biology; NERC data; species mentions; noisy entities