Displaying datasets 76 - 100 of 625 in total

published: 2016-12-12
 
This dataset includes data on the Wax Lake delta from four public agencies: NGDC, USGS, NDBC, and NOAA CO-OPS. In addition to the original data, the processed data behind the analyzed figures are also shared.
keywords: Wax Lake delta; NOAA CO-OPS; NGDC; USGS; NDBC
published: 2016-12-18
 
This dataset contains the numerical simulation data from a computational study of cold front-related hydrodynamics in the Wax Lake delta. The numerical model used is ECOM-si.
keywords: Wax Lake delta; Hydrodynamics; Cold front
published: 2019-08-13
 
Multiple sequence alignments from concatenated nuclear and mitochondrial genes and resulting phylogenetic tree files of fruit doves and their close relatives. Files include: BEAST input XML file (fruit_dove_beast_input.xml); a maximum clade credibility tree from a BEAST analysis (fruit_dove_beast_mcc.tre); concatenated multiple sequence alignment NEXUS files for the novel dataset (fruit_dove_concatenated_alignment.nex, 76 taxa, 4,277 characters) and the dataset with additional sequences (fruit_dove_plus_cibois_data_concatenated_alignment.nex, 204 taxa, 4,277 characters), both of which contain a MrBayes block including partition information; and 50% majority-rule consensus trees generated from MrBayes analyses, using the NEXUS alignment files as inputs (fruit_dove_mrbayes_consensus.tre, fruit_dove_plus_cibois_data_mrbayes_consensus.tre).
keywords: fruit doves; multiple sequence alignment; phylogeny; Aves: Columbidae
published: 2019-09-17
 
Trained models for multi-task multi-dataset learning for text classification in tweets. Classification tasks include sentiment prediction, abusive content, sarcasm, and veridicality. Models were trained using: https://github.com/socialmediaie/SocialMediaIE/blob/master/SocialMediaIE/scripts/multitask_multidataset_classification.py See https://github.com/socialmediaie/SocialMediaIE and https://socialmediaie.github.io for details. If you are using this data, please also cite the related article: Shubhanshu Mishra. 2019. Multi-dataset-multi-task Neural Sequence Tagging for Information Extraction from Tweets. In Proceedings of the 30th ACM Conference on Hypertext and Social Media (HT '19). ACM, New York, NY, USA, 283-284. DOI: https://doi.org/10.1145/3342220.3344929
keywords: twitter; deep learning; machine learning; trained models; multi-task learning; multi-dataset learning; sentiment; sarcasm; abusive content
published: 2019-09-17
 
Trained models for multi-task multi-dataset learning for sequence tagging in tweets. Sequence tagging tasks include POS, NER, Chunking, and SuperSenseTagging. Models were trained using: https://github.com/socialmediaie/SocialMediaIE/blob/master/SocialMediaIE/scripts/multitask_multidataset_experiment.py See https://github.com/socialmediaie/SocialMediaIE and https://socialmediaie.github.io for details. If you are using this data, please also cite the related article: Shubhanshu Mishra. 2019. Multi-dataset-multi-task Neural Sequence Tagging for Information Extraction from Tweets. In Proceedings of the 30th ACM Conference on Hypertext and Social Media (HT '19). ACM, New York, NY, USA, 283-284. DOI: https://doi.org/10.1145/3342220.3344929
keywords: twitter; deep learning; machine learning; trained models; multi-task learning; multi-dataset learning
published: 2019-09-17
 
BAM files for evolved strains from migration rate selection experiments conducted in low-viscosity (0.2% w/v) agar plates containing M63 minimal medium with 1 mM of mannose, melibiose, N-acetylglucosamine, or galactose.
published: 2019-09-01
 
Agriculture has substantial socioeconomic and environmental impacts that vary between crops. However, information on how the spatial distribution of specific crops has changed over time across the globe is relatively sparse. We introduce the Probabilistic Cropland Allocation Model (PCAM), a novel algorithm to estimate where specific crops have likely been grown over time. Specifically, PCAM downscales annual and national-scale data on the crop-specific area harvested of 17 major crops to a global 0.5-degree grid from 1961-2014. The resulting database presented here provides annual global gridded likelihood estimates of crop-specific areas. Both mean and standard deviations of grid cell fractions are available for each of the 17 crops. Each netCDF file contains an individual year of data with an additional variable ("crs") that defines the coordinate reference system used. Our results provide new insights into the likely changes in the spatial distribution of major crops over the past half-century. For additional information, please see the related paper by Jackson et al. (2019) in Environmental Research Letters (https://doi.org/10.1088/1748-9326/ab3b93).
keywords: global; gridded; probabilistic allocation; crop suitability; agricultural geography; time series
published: 2019-10-03
 
Dataset of F2F events of honeybees. F2F events are defined as face-to-face encounters of two honeybees that are close in distance and facing each other but not connected by the proboscis, and thus not engaging in trophallaxis. The first and second columns give the unique IDs of the honeybees participating in each F2F event. The third column gives the time at which the F2F event started, and the fourth column gives the time at which it ended. Each time is a Unix epoch timestamp in milliseconds.
keywords: honeybee; face-to-face interaction
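The four-column layout described above can be read with a few lines of Python. This is a minimal sketch, not the authors' code: it assumes a comma-delimited file with no header row, and the sample rows and the function name `read_f2f_events` are hypothetical, introduced only for illustration.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical sample rows (not from the dataset):
# bee ID, bee ID, start time (ms since Unix epoch), end time (ms since Unix epoch)
SAMPLE = (
    "12,87,1561939200000,1561939203500\n"
    "45,12,1561939210000,1561939211250\n"
)


def read_f2f_events(handle, delimiter=","):
    """Yield (bee_a, bee_b, start_utc, duration_seconds) for each row.

    Assumes four columns and no header row; the delimiter is an assumption.
    """
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        bee_a, bee_b = row[0], row[1]  # IDs kept as strings
        start_ms, end_ms = int(row[2]), int(row[3])
        # Timestamps are in milliseconds, so divide by 1000 for seconds.
        start_utc = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc)
        yield bee_a, bee_b, start_utc, (end_ms - start_ms) / 1000.0


events = list(read_f2f_events(io.StringIO(SAMPLE)))
for bee_a, bee_b, start_utc, duration in events:
    print(bee_a, bee_b, start_utc.isoformat(), duration)
```

To read a downloaded file instead of the sample string, pass an open file handle to `read_f2f_events`; adjust `delimiter` if the file turns out to be tab- or space-separated.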
published: 2019-10-15
 
Filtered trophallaxis interactions for two honeybee colonies, each containing 800 worker bees and one queen. Each colony consists of bees that were administered a juvenile hormone analog, a vehicle treatment, or a sham treatment to determine the effect of colony perturbation on the duration of trophallaxis interactions. Columns one and two give the unique identifiers of the two bees involved in a particular trophallaxis exchange, and columns three and four give the Unix timestamps (in milliseconds) of the beginning and end of the interaction, respectively. Note: the queen interactions were omitted from the uploaded dataset for reasons that are described in the submitted manuscript. Bees that performed poorly are also omitted from the final dataset.
keywords: honey bee; trophallaxis; social network
published: 2019-10-19
 
Large, distributed microphone arrays could offer dramatic advantages for audio source separation, spatial audio capture, and human and machine listening applications. This dataset contains acoustic measurements and speech recordings from 10 loudspeakers and 160 microphones spread throughout a large, reverberant conference room. The distributed microphone system contains two types of array: four wearable microphone arrays of 16 sensors each placed near the ears and across the upper body, and twelve tabletop arrays of 8 microphones each in enclosures designed to resemble voice-assistant speakers. The dataset includes recordings of chirps that can be used to measure impulse responses and of speech clips derived from the CSTR VCTK corpus. The speech clips are recorded both individually and as a mixture to support source separation experiments. The uncompressed files are about 13.4 GB.
keywords: microphone arrays; audio source separation; augmented listening; wireless sensor networks
published: 2019-10-23
 
Raw MD simulation trajectory, input and configuration files, SEM current data, and experimental raw data accompanying the publication, "Electrical recognition of the twenty proteinogenic amino acids using an aerolysin nanopore". README.md contains a description of all associated files.
keywords: molecular dynamics; protein sequencing; aerolysin; nanopore sequencing
published: 2019-10-05
 
This dataset contains network information collected and aggregated from January 1, 2017 to May 31, 2017 on NCSA’s Blue Waters system, which comprises 27,648 nodes connected via a Cray Gemini* 3D torus (dimension 24x24x24) interconnect. Network performance counters for links are exposed via Cray's gpcdr (https://github.com/ovis-hpc/ovis/wiki/gpcdr-kernel-module) kernel module. Lightweight Distributed Metric Service (LDMS, https://github.com/ovis-hpc/ovis) is used to sample the performance counters at 60-second intervals. Please read the "README.md" file. Acknowledgement: This dataset was collected as part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation and the state of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications.
keywords: HPC; Interconnect; Network; Congestion; Blue Waters; Dataset
published: 2019-12-20
 
This dynamic photosynthesis model of a soybean canopy was developed by Yu Wang (yuwangcn@illinois.edu), IGB, University of Illinois. For more details, please see the following publication: Yu Wang, Steven J. Burgess, Elsa de Becker, Stephen P. Long. Photosynthesis in the fleeting shadows: An overlooked opportunity for increasing crop productivity? The Plant Journal.
keywords: Matlab; Soybean canopy; photosynthesis model
published: 2019-08-29
 
This is part of the Cline Center’s ongoing Social, Political and Economic Event Database (SPEED) project. Each observation represents an event involving civil unrest, repression, or political violence in Sierra Leone, Liberia, and the Philippines (1979-2009). These data were produced in an effort to describe the relationship between exploitation of natural resources and civil conflict, and to identify policy interventions that might address resource-related grievances and mitigate civil strife. This work is the result of a collaboration between the US Army Corps of Engineers’ Construction Engineer Research Laboratory (ERDC-CERL), the Swedish Defence Research Agency (FOI) and the Cline Center for Advanced Social Research (CCASR). The project team selected case studies focused on nations with a long history of civil conflict, as well as lucrative natural resources. The Cline Center extracted these events from country-specific articles published in English by the British Broadcasting Corporation (BBC) Summary of World Broadcasts (SWB) from 1979 to 2008 and the CIA’s Foreign Broadcast Information Service (FBIS) from 1999 to 2004. Articles were selected if they mentioned a country of interest and were tagged as relevant by a machine learning-based classification algorithm built by the Cline Center. Trained analysts extracted nearly 10,000 events from nearly 5,000 documents. The codebook, available in PDF form below, describes the data and production process in greater detail.
keywords: Cline Center for Advanced Social Research; civil unrest; Social Political Economic Event Dataset (SPEED); political; event data; war; conflict; protest; violence; social; SPEED; Cline Center; Political Science
planned publication date: 2025-01-23
 
These are the responses to an open, convenience sample survey of residents of Illinois to understand their interactions with wild deer. The survey was available on REDCap between December 19, 2022 and December 19, 2023, and was publicized through listservs, Facebook groups, and media reporting. The file "COVID Deer Survey _ REDCap.pdf" contains the codebook for the survey, including the questions; all factor variables have ".factor" added to their name in the dataset. The file "DeerSurveyData.csv" contains the dataset. The file "Score_calculation_for_sharing.R" is the code to create the cleaned dataset used for analysis from the raw survey responses. Throughout, NA is used to represent null/not available/not applicable; this most likely indicates either a failure to answer the question or, in some cases, a question that was not presented because it was not relevant given the answers to previous questions.
keywords: deer; survey
published: 2024-01-18
 
The following files include specimen information, DNA sequence data, and additional information on the analyses used to reconstruct the phylogeny of the leafhopper genus Neoaliturus, as described in the Methods section of the original paper:
1. Taxon_sampling.csv: data on the individual specimens from which DNA was extracted, including sample code, taxon name, collection data (locality, date, and name of collector), and museum unique identifier.
2. Alignments.zip: a ZIP archive containing 432 separate FASTA files representing the aligned nucleotide sequences of individual gene loci used in the analysis.
3. Concatenated_Matrix.fa: a FASTA file containing the concatenated individual gene alignments used for the maximum likelihood analysis in IQ-TREE.
4. Genes_and_Loci.rtf: identifies the individual genes and loci used in the analysis. The partition name is the same as the name of the individual alignment file in the zipped Alignments folder.
5. Partitions_best_scheme.nex: a text file in the standard NEXUS format that indicates the names of the individual data partitions and their locations in the concatenated matrix, and also indicates the substitution model for each partition.
keywords: leafhopper; phylogeny; anchored-hybrid-enrichment; DNA sequence; insect
published: 2024-01-04
 
This dataset includes all of the data related to stretchable TFTs based on 2D heterostructures, including optical images of TFTs, Raman and photoluminescence characterization data, transport measurement data, and AFM topography data. Abstract: Two-dimensional (2D) materials are outstanding candidates for stretchable electronics, but a significant challenge is their heterogeneous integration into stretchable geometries on soft substrates. Here, we demonstrate a strategy for stretchable thin film transistors (2D S-TFT) based on wrinkled heterostructures on elastomer substrates where 2D materials formed the gate, source, drain, and channel, and characterized them with Raman spectroscopy and transport measurements.
keywords: 2D materials; 2D heterostructures; Stretchable electronics; transistors; buckling engineering
published: 2016-12-19
 
Files in this dataset represent an investigation into use of the Library mobile app Minrva during the months of May 2015 through December 2015. During this time interval, 45,975 API hits were recorded by the Minrva web server. The dataset included herein is an analysis of the following: 1) a delineation of API hits to the mobile app modules used in the Minrva app, by month; 2) a general analysis of Minrva app downloads relative to module use; and 3) the annotated data file providing associations from API hits to specific modules used, organized by month (May 2015 – December 2015).
keywords: API analysis; log analysis; Minrva Mobile App
published: 2024-01-04
 
This is a collection of 31 quasi-linear convective system (QLCS) mesovortices (MVs) that were manually identified and analyzed using the lowest elevation scan of the nearest relevant Weather Surveillance Radar–1988 Doppler (WSR-88D) during the two years (springs of 2022 and 2023) of the Propagation, Evolution, and Rotation in Linear Storms (PERiLS) field campaign. Throughout the two years of PERiLS, a total of nine intensive observing periods (IOPs) occurred (see https://catalog.eol.ucar.edu/perils_2022/missions and https://catalog.eol.ucar.edu/perils_2023/missions for exact IOP dates/times). However, only six of these IOPs (specifically, IOPs 2, 3, and 4 from both years) are included in this dataset. The inclusion criteria were based on the presence of strictly QLCS MVs within the C-band On Wheels (COW) domain, one of the research radars deployed in the field for the PERiLS project. Further details on how MVs were identified are provided below. This analysis was completed using the Gibson Ridge radar-viewing software (GR2Analyst). Each MV had to be produced by a QLCS, defined as a continuous area of 35 dBZ radar reflectivity over at least 100 km when viewed from the lowest elevation scan. The MVs analyzed also had to pass through/near the COW’s domain at some point during their lifetimes to allow for additional analysis using the COW data. Tornadic (TOR), wind-damaging (WD), and non-damaging (ND) MVs were analyzed. ND MVs were ones that usually had a tornado warning placed on them but did not produce any damage and persisted for five or more radar scans; this was done to target the strongest MVs that forecasters thought could be tornadic. The QLCS MVs were identified using objective criteria, which included the existence of a circulation with a maximum differential velocity (dV; i.e., the difference between the maximum outbound and minimum inbound velocities at a constant range) of at least 20 kt over a distance ≤ 7 km. 
The following radar-based characteristics were catalogued for each QLCS MV at the lowest elevation angle of the nearest WSR-88D: latitude and longitude locations of the MV, the genesis-to-decay time of the MV, the maximum dV across the MV, the maximum rotational velocity (Vrot; i.e., dV divided by two), the diameter of the MV, the range of the MV center from the radar, and the height of the MV center above radar level. The Excel workbook contains a total of 37 sheets. Thirty-two of the 37 sheets each correspond to one MV that was examined. One of those MVs (sheet titled 'EFU_tor_iop3') was not included in the final count of MVs (31). This MV produced an EFU tornado, and only tornadoes that were given ratings were used to calculate MV statistics. The 31 MV sheets that were used to calculate MV statistics are labeled following the convention 'mv#_iop#_qlcs'. 'mv#' is the unique number that was assigned to each MV for clear identification, 'iop#' is the IOP in which the MV occurred, 'qlcs' denotes that the MV was produced by a QLCS, and the 2023 IOPs are denoted by '_2023' after 'qlcs' in the sheet name. These sheets contain notes on what was visually seen in the radar data, the damage associated with each MV (using the National Centers for Environmental Information (NCEI) database), and the characteristics of the MV at each time step of its lifetime. The yellow rows in each of the sheets indicate the last row of data included in the pretornadic, predamaging (wind damage), and pre-nondamaging statistics. The orange boxes in the notes column indicate any reports that were in NCEI but not in GR2Analyst.
There are also sheets that examine pretornadic and predamaging diameter trends, box and whisker plot statistics of the overall characteristics of the different types of MVs, and the overall characteristics of each MV, with one Excel sheet (‘combined_qlcs_mvs’) examining the characteristics of each MV over its entire lifetime and one Excel sheet (‘combined_qlcs_mvs_before_report’) examining the characteristics of each MV before it first produced damage or had a tornado warning placed on it.
keywords: quasi-linear convective system; QLCS; tornado; radar; mesovortex; PERiLS; low-level rotation; tornadic; nontornadic; wind-damaging; Propagation, Evolution, and Rotation in Linear Storms; tornado warning; C-band On Wheels
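The identification thresholds and the sheet-naming convention described above can be expressed compactly in code. The following Python sketch is illustrative only and is not the authors' workflow: the function names, the sign convention (inbound velocities negative), and the sample values are assumptions, and the year 2022 is inferred for sheet names that lack the '_2023' suffix.

```python
import re


def differential_velocity(max_outbound_kt, min_inbound_kt):
    """dV: maximum outbound minus minimum inbound velocity (kt).

    Assumes inbound velocities are negative, so dV is a positive number.
    """
    return max_outbound_kt - min_inbound_kt


def rotational_velocity(dv_kt):
    """Vrot is dV divided by two."""
    return dv_kt / 2.0


def meets_mv_criteria(dv_kt, separation_km):
    """Circulation threshold from the description: dV >= 20 kt over <= 7 km."""
    return dv_kt >= 20.0 and separation_km <= 7.0


# Sheet names follow 'mv#_iop#_qlcs', with '_2023' appended for 2023 IOPs.
SHEET_NAME = re.compile(r"^mv(\d+)_iop(\d+)_qlcs(_2023)?$")


def parse_sheet_name(name):
    """Return MV number, IOP number, and year, or None for other sheets."""
    match = SHEET_NAME.match(name)
    if match is None:
        return None
    return {
        "mv": int(match.group(1)),
        "iop": int(match.group(2)),
        "year": 2023 if match.group(3) else 2022,
    }


dv = differential_velocity(15.0, -12.0)  # hypothetical velocities in kt
print(dv, rotational_velocity(dv), meets_mv_criteria(dv, 5.0))
print(parse_sheet_name("mv3_iop2_qlcs_2023"))
print(parse_sheet_name("EFU_tor_iop3"))
```

Sheets that do not follow the convention, such as 'EFU_tor_iop3' or the combined summary sheets, return None and can be filtered out before computing per-MV statistics.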
published: 2023-12-23
 
Supplemental document corresponding to a submission to Physiological Genomics (Data supplements and source materials must now be deposited in a community-recognized data repository or to a generalist public access repository if no community resource is available. See "Author/Production Requirements" for more information.) https://pg.msubmit.net/
keywords: Supplemental; Physiological Genomics
published: 2021-05-17
 
Please cite as: Wuebbles, D., J. Angel, K. Petersen, and A.M. Lemke, (Eds.), 2021: An Assessment of the Impacts of Climate Change in Illinois. The Nature Conservancy, Illinois, USA. https://doi.org/10.13012/B2IDB-1260194_V1 Climate change is a major environmental challenge that is likely to affect many aspects of life in Illinois, ranging from human and environmental health to the economy. Illinois is already experiencing impacts from the changing climate and, as climate change progresses and temperatures continue to rise, these impacts are expected to increase over time. This assessment takes an in-depth look at how the climate is changing now in Illinois, and how it is projected to change in the future, to provide greater clarity on how climate change could affect urban and rural communities in the state. Beyond providing an overview of anticipated climate changes, the report explores predicted effects on hydrology, agriculture, human health, and native ecosystems.
keywords: Climate change; Illinois; Public health; Agriculture; Environment; Water; Hydrology; Ecosystems
published: 2023-07-27
 
The text file contains the original aligned DNA nucleotide sequence data used in the phylogenetic analyses of Feng et al. (in review), comprising three protein-coding genes (histone H3, cytochrome oxidase I and II) and two ribosomal genes (28S D8 and 16S). The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The first six lines of the file identify the file as NEXUS, indicate that the file contains data for 257 taxa (species) and 2995 characters (nucleotide positions), indicate that the characters are DNA sequence, that gaps inserted into the DNA sequence alignment are indicated by a dash, and that missing data are indicated by a question mark. The remainder of the file contains the aligned nucleotide sequence data for the five genes. Data partitions, representing the individual genes and different codon positions of the protein-coding genes, are indicated by the lines beginning "charset" near the end of the file. Two supplementary tables in the provided PDF file give additional information on the species in the dataset, including the GenBank accession numbers for the sequence data (Table S1) and the DNA substitution models used for each of the data partitions in the phylogenetic analysis program IQ-Tree (version 1.6.8) (Table S3), as described in the Methods section of the paper. The supplemental tables will also be linked to the article upon publication at the journal website.
keywords: Insect; leafhopper; dispersal; vicariance; evolution
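The NEXUS header fields and "charset" lines described above can be inspected with a short script before loading the file into a phylogenetics package. This Python sketch is a rough illustration under stated assumptions: the embedded sample is a tiny hypothetical NEXUS file (not the dataset itself), the function name `nexus_summary` is invented here, and the regular expressions assume `ntax` precedes `nchar` and that each charset sits on its own line.

```python
import re

# Tiny hypothetical NEXUS file, used only to exercise the parser below.
SAMPLE = """#NEXUS
begin data;
dimensions ntax=3 nchar=12;
format datatype=dna gap=- missing=?;
matrix
taxon_a ACGTACGT--??
taxon_b ACGTACGTAC??
taxon_c ACG-ACGTACGT
;
end;
begin sets;
charset gene1 = 1-6;
charset gene2 = 7-12;
end;
"""


def nexus_summary(text):
    """Extract the taxon count, character count, and charset ranges."""
    dims = re.search(
        r"ntax\s*=\s*(\d+)\s+nchar\s*=\s*(\d+)", text, re.IGNORECASE
    )
    charsets = re.findall(
        r"^\s*charset\s+(\S+)\s*=\s*([^;]+);",
        text,
        re.IGNORECASE | re.MULTILINE,
    )
    return {
        "ntax": int(dims.group(1)) if dims else None,
        "nchar": int(dims.group(2)) if dims else None,
        "charsets": {name: ranges.strip() for name, ranges in charsets},
    }


summary = nexus_summary(SAMPLE)
print(summary)
```

For the file described above, the same function would be expected to report 257 taxa and 2995 characters if the header follows this common layout; a dedicated NEXUS parser is the safer choice for real analyses.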
published: 2023-12-08
 
A two-year field study was conducted to test the hypothesis that biochar application increases inorganic soil N availability during maize growth, leading to higher grain yields and N recovery efficiency while reducing the risk of N leaching following harvest. Four N fertilizer rates (0, 90, 179, and 269 kg ha⁻¹ as urea ammonium nitrate solution) were applied with or without biochar (10 Mg ha⁻¹) before maize planting each year. This dataset contains selected summary statistics (average and standard deviation) on soil and plant measurements. This file package also includes a readme.txt file that describes the data in detail, including attribute descriptions.
keywords: biochar; nitrogen fertilizer; nitrogen use efficiency; corn yield; soil inorganic nitrogen; nitrate leaching