published: 2020-06-12
This is a network of 14 systematic reviews on the salt controversy and their included studies. Each edge in the network represents an inclusion from one systematic review to an article. Systematic reviews were collected from Trinquart (Trinquart, L., Johns, D. M., & Galea, S. (2016). Why do we think we know what we know? A metaknowledge analysis of the salt controversy. International Journal of Epidemiology, 45(1), 251–260. https://doi.org/10.1093/ije/dyv184).
<b>FILE FORMATS</b>
1) Article_list.csv - Unicode CSV
2) Article_attr.csv - Unicode CSV
3) inclusion_net_edges.csv - Unicode CSV
4) potential_inclusion_link.csv - Unicode CSV
5) systematic_review_inclusion_criteria.csv - Unicode CSV
6) Supplementary Reference List.pdf - PDF
<b>ROW EXPLANATIONS</b>
1) Article_list.csv - Each row describes a systematic review or included article.
2) Article_attr.csv - Each row is the attributes of a systematic review/included article.
3) inclusion_net_edges.csv - Each row represents an inclusion from a systematic review to an article.
4) potential_inclusion_link.csv - Each row shows the available evidence base of a systematic review.
5) systematic_review_inclusion_criteria.csv - Each row is the inclusion criteria of a systematic review.
6) Supplementary Reference List.pdf - Each item is a bibliographic record of a systematic review/included paper.
<b>COLUMN HEADER EXPLANATIONS</b>
<b>1) Article_list.csv:</b>
ID - Numeric ID of a paper
paper assigned ID - ID of the paper from Trinquart et al. (2016)
Type - Systematic review / primary study report
Study Groupings - Groupings for related primary study reports from the same report, from Trinquart et al. (2016) (if applicable, otherwise blank)
Title - Title of the paper
year - Publication year of the paper
Attitude - Scientific opinion about the salt controversy from Trinquart et al. (2016)
Doi - DOIs of the paper (if applicable, otherwise blank)
Retracted (Y/N) - Whether the paper was retracted or withdrawn (Y). Blank if not retracted or withdrawn.
<b>2) Article_attr.csv:</b>
ID - Numeric ID of a paper
year - Publication year
Attitude - Scientific opinion about the salt controversy from Trinquart et al. (2016)
Type - Systematic review / primary study report
<b>3) inclusion_net_edges.csv:</b>
citing_ID - The numeric ID of a systematic review
cited_ID - The numeric ID of the included articles
<b>4) potential_inclusion_link.csv:</b>
This data was translated from the Sankey diagram given in Trinquart et al. (2016) as Web Figure 4. Each row indicates a systematic review and each column indicates a primary study. In the matrix, "p" indicates that a given primary study had been published as of the search date of a given systematic review.
<b>5) systematic_review_inclusion_criteria.csv:</b>
ID - The numeric IDs of systematic reviews
paper assigned ID - ID of the paper from Trinquart et al. (2016)
attitude - Its scientific opinion about the salt controversy from Trinquart et al. (2016)
No. of studies included - Number of articles included in the systematic review
Study design - Study designs to include, per inclusion criteria
population - Populations to include, per inclusion criteria
Exposure/Intervention - Exposures/Interventions to include, per inclusion criteria
outcome - Study outcomes required for inclusion, per inclusion criteria
Language restriction - Report languages to include, per inclusion criteria
follow-up period - Follow-up period required for inclusion, per inclusion criteria
keywords: systematic reviews; evidence synthesis; network visualization; tertiary studies
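As an illustration of how the edge list described above might be used, the following is a minimal Python sketch that groups included articles by systematic review and finds the articles two reviews share. Only the column names `citing_ID` and `cited_ID` come from the dataset description; the sample rows and the helper names `load_inclusions` and `shared_inclusions` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows in the layout described for inclusion_net_edges.csv.
# For the real file, one would pass an open file handle instead, e.g.
# open("inclusion_net_edges.csv", newline="", encoding="utf-8").
SAMPLE_EDGES = """citing_ID,cited_ID
1,15
1,16
2,15
2,17
"""


def load_inclusions(handle):
    """Map each systematic review ID to the set of article IDs it includes."""
    included = defaultdict(set)
    for row in csv.DictReader(handle):
        included[row["citing_ID"]].add(row["cited_ID"])
    return included


def shared_inclusions(included, review_a, review_b):
    """Return the article IDs included by both reviews."""
    return included[review_a] & included[review_b]


inclusions = load_inclusions(io.StringIO(SAMPLE_EDGES))
print(sorted(shared_inclusions(inclusions, "1", "2")))
```

IDs are kept as strings here so that they can be joined against the ID columns of the other CSV files without any numeric conversion.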
published: 2018-11-18
This dataset contains experimental measurements used in the paper, "Ultra-sensitivity of Numerical Landscape Evolution Models to their Initial Conditions." (to be submitted). The data are taken from experimental runs in a miniature landscape model named the eXperimental Landscape Evolution (XLE) facility. In this facility, we completed five runs of more than 24 hours each at 5-minute temporal resolution. Every five minutes, a planform image was captured and a digital elevation model (DEM) was generated. For each run, the images and a corresponding animation of the images are documented. In addition, ASCII-formatted DEMs and color hillshade maps were generated. The hillshade map images were also made into an animation. This dataset is associated with the following publication: https://doi.org/10.1029/2019GL083305
keywords: landscape evolution model; digital elevation model; geomorphology
published: 2020-06-02
The text file contains the original data used in the phylogenetic analyses of Xue et al. (2020: Systematic Entomology, in press). The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The first six lines of the file identify the file as NEXUS, indicate that the file contains data for 89 taxa (species) and 2676 characters, indicate that the first 2590 characters are DNA sequence and the last 86 are morphological, that gaps inserted into the DNA sequence alignment and inapplicable morphological characters are indicated by a dash, and that missing data are indicated by a question mark. The file contains aligned nucleotide sequence data for 5 gene regions and 86 morphological characters. The positions of data partitions are indicated in the mrbayes block of commands for the phylogenetic program MrBayes at the end of the file (Subset1 = 16S gene; Subset2 = 28S gene; Subset3 = COI gene; Subset 4 = Histone H3 and H2A genes). The mrbayes block also contains instructions for MrBayes on various non-default settings for that program. These are explained in the original publication. Descriptions of the morphological characters and more details on the species and specimens included in the dataset are provided in the supplementary document included as a separate pdf, also available from the journal website. The original raw DNA sequence data are available from NCBI GenBank under the accession numbers indicated in the supplementary file.
keywords: phylogeny; DNA sequence; morphology; Insecta; Hemiptera; Cicadellidae; leafhopper; evolution; 28S rDNA; 16S rDNA; histone H3; histone H2A; cytochrome oxidase I; Bayesian analysis
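The header fields described above (number of taxa, number of characters, gap symbol, missing-data symbol) can be read with a few regular expressions. The sketch below is only illustrative: the sample header is a hypothetical reconstruction based on the description, not a copy of the deposited file, and `parse_nexus_header` is an invented helper name. A dedicated NEXUS parser would be a better choice for real analyses.

```python
import re

# Hypothetical header written to match the description (89 taxa, 2676
# characters, "-" for gaps, "?" for missing data); the deposited file's
# exact wording may differ.
SAMPLE_HEADER = """#NEXUS
BEGIN DATA;
DIMENSIONS NTAX=89 NCHAR=2676;
FORMAT DATATYPE=mixed(DNA:1-2590,Standard:2591-2676) GAP=- MISSING=?;
MATRIX
"""


def parse_nexus_header(text):
    """Extract taxon/character counts and the gap/missing symbols."""
    if not text.lstrip().upper().startswith("#NEXUS"):
        raise ValueError("not a NEXUS file")
    info = {}
    # Integer-valued settings from the DIMENSIONS command.
    for key in ("NTAX", "NCHAR"):
        match = re.search(rf"\b{key}\s*=\s*(\d+)", text, re.IGNORECASE)
        if match:
            info[key.lower()] = int(match.group(1))
    # Single-character settings from the FORMAT command.
    for key in ("GAP", "MISSING"):
        match = re.search(rf"\b{key}\s*=\s*(\S)", text, re.IGNORECASE)
        if match:
            info[key.lower()] = match.group(1)
    return info


header = parse_nexus_header(SAMPLE_HEADER)
print(header)
```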
published: 2020-06-03
These datasets provide the basis of our analysis in the paper "Potential Impacts of Supersonic Aircraft on Stratospheric Ozone and Climate." The datasets fall into two categories: emission data and model output data (WACCM). All model simulations (background and perturbation) were run to steady state, and only the datasets used in the analysis are archived here.
keywords: NetCDF; Supersonic aircraft; Stratospheric ozone; Climate
published: 2020-06-03
This dataset provides files for use in analysis of human land preference across Australasia, and in a localized analysis of land preference in Laos and Vietnam. All files can be imported into ArcGIS for visualization, and re-analyzed using the open source Maxent species distribution modeling program. CSV files contain known human presence sites for model validation. ASC files contain geographically coded environmental data for mean annual temperature and mean annual precipitation during the Last Glacial Maximum, as well as downward slope data. All ASC files are in the WGS 1984 Mercator map projection for visualization in ArcGIS and can be opened as text files in text editors supporting large file sizes.
keywords: human dispersal; ecological niche modeling; Australasia; Late Pleistocene; land preference
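Because the ASC files can be opened as plain text, they can also be inspected without GIS software. The sketch below assumes the files follow the Esri ASCII grid layout that Maxent reads (six header lines followed by rows of cell values); that layout, the sample grid, and the helper name `read_ascii_grid` are assumptions, not details taken from the dataset itself.

```python
import io

# Hypothetical 3 x 2 grid in the assumed Esri ASCII layout.
SAMPLE_ASC = """ncols 3
nrows 2
xllcorner 100.0
yllcorner 10.0
cellsize 0.5
NODATA_value -9999
1.0 2.0 -9999
4.0 5.0 6.0
"""


def read_ascii_grid(handle):
    """Return (header dict, rows of floats) with NODATA cells set to None."""
    header = {}
    # Assumes all six header lines are present; some files omit NODATA_value.
    for _ in range(6):
        key, value = handle.readline().split()
        header[key.lower()] = float(value)
    nodata = header["nodata_value"]
    rows = []
    for line in handle:
        if line.strip():
            values = [float(v) for v in line.split()]
            rows.append([None if v == nodata else v for v in values])
    return header, rows


header, rows = read_ascii_grid(io.StringIO(SAMPLE_ASC))
valid = [v for row in rows for v in row if v is not None]
mean_value = sum(valid) / len(valid)
print(int(header["ncols"]), int(header["nrows"]), mean_value)
```

Reading line by line keeps memory use modest, which matters for the large file sizes mentioned in the description.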
published: 2020-05-31
This repository includes a simulated dataset and related scripts used for the paper "Moss: Accurate Single-Nucleotide Variant Calling from Multiple Bulk DNA Tumor Samples".
keywords: Somatic Mutations; Bulk DNA Sequencing; Cancer Genomics
published: 2020-05-30
Original leaf gas exchange and absorptance data used in Collison et al. (2020), "Light, Not Age, Underlies the Maladaptation of Maize and Miscanthus Photosynthesis to Self-Shading," Frontiers in Plant Science, doi: 10.3389/fpls.2020.00783
keywords: C4 photosynthesis; canopy; bioenergy; food security; quantum yield; shade acclimation; photosynthetic light-use efficiency; leaf aging
published: 2019-11-11
This repository includes scripts and datasets for the paper, "FastMulRFS: Fast and accurate species tree estimation under generic gene duplication and loss models." Note: The results from estimating species trees with ASTRID-multi (included in this repository) are *not* included in the FastMulRFS paper. We estimated species trees with ASTRID-multi in the fall of 2019, but ASTRID-multi had an important bug fix in January 2020. Therefore, the ASTRID-multi species trees in this repository should be ignored.
keywords: Species tree estimation; gene duplication and loss; statistical consistency; MulRF; FastRFS
published: 2020-05-17
Models and predictions for our submission to TRAC 2020, the Second Workshop on Trolling, Aggression and Cyberbullying. Our approach is described in our paper: Mishra, Sudhanshu, Shivangi Prasad, and Shubhanshu Mishra. 2020. “Multilingual Joint Fine-Tuning of Transformer Models for Identifying Trolling, Aggression and Cyberbullying at TRAC 2020.” In Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying (TRAC-2020). The source code for training these models and more details can be found in our code repository: https://github.com/socialmediaie/TRAC2020 NOTE: These models were retrained for uploading here after our submission, so the evaluation measures may differ slightly from those reported in the paper.
keywords: Social Media; Trolling; Aggression; Cyberbullying; text classification; natural language processing; deep learning; open source;
published: 2020-05-20
This dataset is a snapshot of the presence and structure of entrepreneurship education in U.S. four-year colleges and universities in 2015, including co-curricular activities and related infrastructure. Public, private not-for-profit and for-profit institutions are included, as are specialized four-year institutions. The dataset provides insight into the presence of entrepreneurship education both within business units and in other units of college campuses. Entrepreneurship is defined broadly, to include small business management and related career-focused options.
keywords: Entrepreneurship education; Small business education; Ewing Marion Kauffman Foundation; csv
published: 2020-05-15
Trained models for multi-task multi-dataset learning for sequence prediction in tweets. Tasks include POS, NER, Chunking, and SuperSenseTagging. Models were trained using: https://github.com/napsternxg/SocialMediaIE/blob/master/experiments/multitask_multidataset_experiment.py See https://github.com/napsternxg/SocialMediaIE for details.
keywords: twitter; deep learning; machine learning; trained models; multi-task learning; multi-dataset learning;
published: 2020-05-15
This dataset contains the tweets collected for the paper: Shubhanshu Mishra, Sneha Agarwal, Jinlong Guo, Kirstin Phelps, Johna Picco, and Jana Diesner. 2014. Enthusiasm and support: alternative sentiment classification for social movements on social media. In Proceedings of the 2014 ACM conference on Web science (WebSci '14). ACM, New York, NY, USA, 261-262. DOI: https://doi.org/10.1145/2615569.2615667 The data contain only tweet IDs and the corresponding enthusiasm and support labels assigned by two different annotators.
keywords: Twitter; text classification; enthusiasm; support; social causes; LGBT; Cyberbullying; NFL
published: 2020-05-13
Terrorism is among the most pressing challenges to democratic governance around the world. The Responsible Terrorism Coverage (or ResTeCo) project aims to address a fundamental dilemma facing 21st century societies: how to give citizens the information they need without giving terrorists the kind of attention they want. The ResTeCo project hopes to inform best practices by using extreme-scale text analytic methods to extract information from more than 70 years of terrorism-related media coverage from around the world and across 5 languages. Our goal is to expand the available data on media responses to terrorism and enable the development of empirically-validated models for socially responsible, effective news organizations. This particular dataset contains information extracted from terrorism-related stories in the New York Times published between 1945 and 2018. It includes variables that measure the relative share of terrorism-related topics, the valence and intensity of emotional language, as well as the people, places, and organizations mentioned. This dataset contains 3 files: 1. <i>"ResTeCo Project NYT Dataset Variable Descriptions.pdf"</i> <ul> <li>A detailed codebook containing a summary of the Responsible Terrorism Coverage (ResTeCo) Project New York Times (NYT) Dataset and descriptions of all variables. </li> </ul> 2. <i>"resteco-nyt.csv"</i> <ul><li>This file contains the data extracted from terrorism-related media coverage in the New York Times between 1945 and 2018. It includes variables that measure the relative share of topics, sentiment, and emotion present in this coverage. There are also variables that contain metadata and list the people, places, and organizations mentioned in these articles. There are 53 variables and 438,373 observations. The variable "id" uniquely identifies each observation. Each observation represents a single news article. </li> <li> <b>Please note</b> that care should be taken when using "resteco-nyt.csv".
The file may not be suitable for use in a spreadsheet program like Excel, as some of the values are quite large. Excel cannot handle some of these large values, which may cause the data to appear corrupted within the software. Users are encouraged to work with this data in a statistical package or language such as Stata, R, or Python to ensure that the structure and quality of the data are preserved.</li> </ul> 3. <i>"README.md"</i> <ul><li>This file contains useful information for the user about the dataset. It is a text file written in Markdown.</li> </ul> <b>Citation Guidelines</b> 1) To cite this codebook please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project New York Times (NYT) Dataset Variable Descriptions. Responsible Terrorism Coverage (ResTeCo) Project New York Times Dataset. Cline Center for Advanced Social Research. May 13. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-4638196_V1 2) To cite the data please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project New York Times Dataset. Cline Center for Advanced Social Research. May 13. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-4638196_V1
keywords: Terrorism, Text Analytics, News Coverage, Topic Modeling, Sentiment Analysis
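The caution about large values in "resteco-nyt.csv" can be illustrated with a short Python sketch that reads the CSV as text and converts values only when needed. Only the column name `id` comes from the dataset description; the sample rows, the column `example_value`, and the helper `read_rows` are hypothetical, and the UTF-8 encoding suggested for the real file is an assumption.

```python
import csv
import io

# Hypothetical two-row sample; "example_value" is an invented column used
# only to show what happens to a very large number.
# For the real file, one might use
# open("resteco-nyt.csv", newline="", encoding="utf-8") instead.
SAMPLE_CSV = """id,example_value
1,12345678901234567890
2,42
"""


def read_rows(handle):
    """Yield each row as a dict of strings, one row at a time."""
    yield from csv.DictReader(handle)


rows = list(read_rows(io.StringIO(SAMPLE_CSV)))

# Python integers keep every digit of the original text.
exact = int(rows[0]["example_value"])
# A 64-bit float (similar to what a spreadsheet stores) cannot.
rounded = int(float(rows[0]["example_value"]))
print(exact, rounded, exact == rounded)
```

Keeping values as strings until a conversion is explicitly chosen avoids the silent rounding that makes such data appear corrupted in spreadsheet software.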
published: 2020-05-12
The data provided herein are accelerometer and strain data taken from the free vibration response of pre-tensioned, partially submerged steel beam specimens (modulus of elasticity assumed = 29,000 ksi). The specimens were subjected to various levels of pre-tension and various levels of submersion in water. The purpose of the testing was to quantify the effects of partial submersion on the vibration frequencies of pre-tensioned beams. Three specimens were tested, each with a different cross section (but identical cross-sectional area). The different cross sections allow investigation of the effects of specimen width as the specimen vibrates through water. The testing procedure was as follows: 1) Apply a specified level of tension in the beam. Measure tension via 3 strain gages. 2) Submerge the specimens to a specified depth of water. 3) Excite the beams with either a hammer impact or a pull-and-release method (physically pull the middle of the bar and quickly release). 4) Measure the free vibration of the beam with 2 accelerometers. Schematic drawings of the test setup and the test specimens are provided, as is a picture of the test setup.
keywords: free vibration; beam; partially-submerged; prestressed;
published: 2020-04-02
Automatic and manual counts of black flies captured in Illinois.
keywords: black flies; simuliids; ImageJ; count method
published: 2020-04-22
Nest survival and fledgling production data for Bell's Vireo and Willow Flycatcher nests.
keywords: Bell's Vireo; Willow Flycatcher; habitat selection; fitness;
published: 2017-12-22
TBP assessment raw data files of pre- and post- motion capture velocity and center of pressure force plate data. Labels are self-explanatory. The .mat files contain data exported from the force plate for the time-to-stabilization assessments, while the .txt files contain the data collected for the smoothness-of-gait assessments. These files do not relate to one another and are from separate assessments. The Version 2 files are the result of running the Python script Data_Bank_Cleaner.py on the Version 1 files. Please find more information in READ_ME_databank.txt.
keywords: Multiple Sclerosis; Rehabilitation; Balance; Ataxia; Ballet; Dance; Targeted Ballet Program
published: 2020-04-20
Supplemental data sets for the manuscript entitled "Contribution of fungal and invertebrate communities to mass loss and wood depolymerization in tropical terrestrial and aquatic habitats."
keywords: Coiba Island; wood decomposition; cellulose; hemicellulose; lignin breakdown; aquatic fungi
published: 2020-04-06
Raw measurement data for umbilical remnants (umbilical vein, umbilical arteries and urachus) in support of Equine Veterinary Journal publication "Normal Regression of the Internal Umbilical Remnant Structures in Standardbred Foals."
keywords: equine; umbilicus; ultrasound
published: 2020-04-07
Baseline data from a multi-modal intervention study conducted at the University of Illinois at Urbana-Champaign. Data include results from a cardiorespiratory fitness assessment (maximal oxygen consumption, VO2max), a body composition assessment (Dual-Energy X-ray Absorptiometry, DXA), and Magnetic Resonance Spectroscopy Imaging. Data set includes data from 435 participants, ages 18-44 years.
keywords: Magnetic Resonance Spectroscopy; N-acetyl aspartic acid (NAA); Body Mass Index; cardiorespiratory fitness; body composition
published: 2020-03-14
Data on bank elevations determined from lidar data for the Upper Sangamon River, Illinois, the Mission River, Texas, and the White River in Indiana
keywords: bank elevations, rivers, meandering, lowland
published: 2020-03-13
Data files associated with the assembly of mitochondrial minicircles from five species of parasitic lice. This includes data from four species in the genus Columbicola and from the human louse (Pediculus humanus). The files include FASTA sequences for all five species, reference sequences for read mapping approaches, resulting contigs produced by various assembly approaches, and alignments of human louse minicircles mapped to published sequences of the same species.
keywords: mitochondria; FASTA; nucleotide sequences; alignment; Columbicola; Pediculus
published: 2020-03-08
This dataset inventories the availability of entrepreneurship and small business education, including co-curricular opportunities, in two-year colleges in the United States. The inventory provides a snapshot of activities at more than 1,650 public, not-for-profit, and private for-profit institutions, in 2014.
keywords: Small business education; entrepreneurship education; Kauffman Entrepreneurship Education Inventory; Ewing Marion Kauffman Foundation; Paul J. Magelli
published: 2020-03-03
This second version (V2) provides additional data cleaning compared to V1, additional data collection (mainly to include data from 2019), and more metadata for nodes. Please see NETWORKv2README.txt for more detail.
keywords: citations; retraction; network analysis; Web of Science; Google Scholar; indirect citation
published: 2020-02-27
These data were collected for an experiment examining effects of neonicotinoid (clothianidin) presence on hover fly (Diptera: Syrphidae) behavior. Hover flies of two species (Eristalis arbustorum and Toxomerus marginatus) were offered a choice to feed on artificial flowers laced with sucrose solution that was either contaminated (CLO) or not contaminated (CON) with clothianidin. Two different concentrations of clothianidin in 0.5 M sucrose solution were tested: 2.5 ppb and 150 ppb. We conducted four sets of 10 trials, each trial set examining a different combination of species and clothianidin dose. Across 6 hours of video for each trial we recorded 1) number of visits to each flower that resulted in feeding, and 2) amount of time spent feeding during each visit. We found that while neither species fed significantly longer on either of the solutions, E. arbustorum appeared to avoid flowers with clothianidin particularly at high rates. In the paper, we attribute this avoidance response, partially, to hover fly-visible spectral differences between the two flower choices and discuss potential implications for field and lab-based studies. In the enclosed zip file we have included all data for this project and code scripts from R. * Note: Data folder contains 4 files (instead of 6 as mentioned in Readme): e.tenax_photoreceptors.csv; hoverfly_data_UPDATE.csv; number_visits_UPDATE.csv; and Original 2018 hover fly choice test data_Clem2020.xlsx
keywords: Syrphidae; hoverfly; Eristalis; Toxomerus; Choice Experiment; Neonicotinoid; Clothianidin