Displaying datasets 51 - 75 of 478 in total

published: 2022-04-19
 
List of differentially expressed genes in human endometrial stromal cells with knockdown of Basigin (BSG) gene expression during decidualization. The BSG siRNA or negative scrambled control siRNA were transfected into human endometrial stromal cells (HESCs) following the protocol of siLentFect™ Lipid (Bio-Rad, Hercules, CA). Following complete knockdown of BSG in HESCs (72 hours after adding siRNA), HESCs were treated with medium containing estrogen, progesterone and cAMP to induce decidualization. BSG siRNA and negative control scrambled siRNA were added to the cells every four days (day 0, 4) over the course of the decidualization protocol. Total RNA was harvested at day 6 of the decidualization protocol for microarray analysis. Microarray analysis was performed at the University of Illinois at Urbana-Champaign Roy J. Carver Biotechnology Center. Briefly, 0.2 micrograms of total RNA were labeled using the Agilent two color QuickAmp labeling kit (Agilent Technologies, Santa Clara, CA) according to the manufacturer’s protocol. The optional spike-in controls were not used. Samples were hybridized to Human Gene Expression 4x44K v2 Microarray (Agilent Technologies, Santa Clara, CA) in an Agilent Hybridization Cassette according to standard protocols. The arrays were then scanned on an Axon GenePix 4000B scanner and the images were quantified using Axon GenePix 6.1. Microarray data pre-processing and statistical analyses were done in R (v3.6.2) using the limma package (v3.42.0; Ritchie et al., 2015). Median foreground and median background values from the 4 arrays were read into R and any spots that had been manually flagged (-100 values) were given a weight of zero. The background values were ignored because investigations showed that trying to use them to adjust for background fluorescence added more noise to the data; background was low and even for all arrays, therefore no background correction was done. 
The individual Cy5 and Cy3 fluorescence for each array were normalized together using the quantile method 3 (Yang and Thorne, 2003). Agilent's Human Gene Expression 4x44K v2 Microarray has a total of 45,220 probes: 1224 probes for positive controls, 153 for negative controls, 823 labeled “ignore” and 43,118 labeled “cDNA”. The pos+neg+ignore probes were used to ascertain the background level of fluorescence (6, on the log2 scale) and were then discarded. The cDNA probes comprise 34,127 unique 60mer probes, of which 999 probes are spotted 10 times each and the rest one time each. We averaged the replicate probes for those spotted 10 times and then fit a mixed model that had treatment and dye as fixed effects and array pairing as a random effect (Phipson et al., 2016; Smyth et al., 2005). After fitting the model but before False Discovery Rate (FDR) correction (Benjamini and Hochberg, 1995), probes were filtered out if they met any of the following criteria: 1) did not have at least 4/8 samples with expression values > 6 (14,105 probes removed), 2) no longer had an assigned Entrez Gene ID in Bioconductor’s HsAgilentDesign026652.db annotation package (v3.2.3; 2,152 probes removed) (Huber et al., 2015), 3) mapped to the same Entrez Gene ID as another probe but had a larger p-value for treatment effect (4,141 probes removed). This left 13,729 probes representing 13,729 unique genes. <b>*Please note that there is a discrepancy between the file and the readme, as this plain text is the actual data file of this dataset.</b>
keywords: Basigin; endometrium; decidualization; human
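The probe-filtering counts in the Basigin knockdown description above can be checked with simple arithmetic. The short Python sketch below is only a bookkeeping check on the reported numbers, not part of the authors' R/limma pipeline; the variable names and filter labels are illustrative.

```python
# Probe counts copied from the dataset description; names are illustrative.
unique_cdna_probes = 34_127

removed_by_filter = {
    "fewer than 4 of 8 samples with expression > 6": 14_105,
    "no assigned Entrez Gene ID": 2_152,
    "same Entrez Gene ID as another probe, larger p-value": 4_141,
}

# Probes left after applying the three filters in sequence.
remaining = unique_cdna_probes - sum(removed_by_filter.values())
print(remaining)  # 13729
```

The result agrees with the 13,729 probes reported as remaining after filtering.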
published: 2022-04-11
 
This data set contains all the map data used for "Quantifying transportation energy vulnerability and its spatial patterns in the United States". The dataset includes the multiple dimensions (i.e., exposure, sensitivity, adaptive capacity) of transportation energy vulnerability (TEV) at the census tract level in the United States, the changes in TEV with electric vehicle adoption, and the detailed data for Chicago, Los Angeles, and New York.
keywords: Transport energy; Vulnerability; Fuel costs; Electric vehicles
published: 2022-03-30
 
This dataset is associated with a larger manuscript published in 2022 in the Illinois Natural History Survey Bulletin to summarize all known records for nonindigenous aquatic mollusks in Illinois, and full sources are referenced within the manuscript. We examined museum holdings, literature accounts, and publicly available databases sponsored by the U.S. Geological Survey (USGS) - Nonindigenous Aquatic Species program (http://nas.er.usgs.gov/) and InvertEBase (invertebase.org). We also included sporadic field survey data from colleagues within the Illinois Natural History Survey, Illinois Department of Natural Resources, U.S. Fish and Wildlife Service, county forest preserve districts, and other natural resource agencies about their encounters with nonindigenous aquatic mollusk species. Lastly, we examined the role and utility of citizen-science data to document occurrences of nonindigenous aquatic mollusk species. We queried iNaturalist (www.inaturalist.org) for all available nonindigenous freshwater mollusk data for Illinois. Table heading descriptions (if not intuitive) are: “INHS verified” is whether an INHS staff member verified the record by observing a vouchered specimen or photograph; “Source” is where a record was accessed or obtained; “individualCount” is the number collected or observed in a record; “MuseumCode” is the standard museum abbreviation or acronym; “Institution” is the source that housed or reported a record, and this also includes the spelled-out museum code; “Collectors” typically indicates who collected the specimen or voucher; “Lat_Long determined by” denotes whether collection coordinates were stated by the collector or by a curator (using inference from data available); “fieldNumber” typically indicates a unique field number that a collector may have used in the field; “identifiedBy” typically explains who identified a specimen or verified a specimen identification.
keywords: Illinois; Exotic species; Non-native aquatic species; NAS; Aquatic Invasive Species; AIS; Mollusk
published: 2022-03-25
 
Ground-based radar data sets collected during the 2013 NASA EVEX Campaign, conducted on Roi-Namur island of Kwajalein Atoll in the Republic of the Marshall Islands, are deposited in this databank. Radar data were collected with IRIS VHF and ALTAIR VHF/UHF systems.
planned publication date: 2023-01-01
 
The following files were used to reconstruct the phylogeny of the leafhopper subfamily Typhlocybinae, using IQ-TREE v1.6.12 and ASTRAL v 4.10.5. <b>1) Taxon_sampling.csv:</b> contains the sample IDs (1st column) and the taxonomic information (2nd column). Sample IDs were used in the alignment files and partition files. <b>2) concatenated_nt_complete.phy:</b> a complete concatenated nucleotide dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12. The file lists the sequences of 248 samples with 154,992 nucleotide positions (intron included) from 665 loci. Hyphens are used to represent gaps. <b>3) concatenated_nt_complete_partition.nex:</b> the partitioning schemes for concatenated_nt_complete.phy. The file partitions the 154,992 nucleotide characters into 426 character sets, and defines the best substitution model for each character set. <b>4) concatenated_cds_complete.phy:</b> a complete concatenated coding DNA sequence dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12. The file lists the sequences of 248 samples with 153,525 nucleotide positions (intron excluded) from 665 loci. Hyphens are used to represent gaps. <b>5) concatenated_cds_complete_partition.nex:</b> the partitioning schemes for concatenated_cds_complete.phy. The file partitions the 153,525 nucleotide characters into 426 character sets, and defines the best substitution model for each character set. <b>6) concatenated_nt_reduced.phy:</b> a reduced concatenated nucleotide dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12. The file lists the sequences of 248 samples with 95,076 nucleotide positions (intron included) from 374 loci. Hyphens are used to represent gaps. <b>7) concatenated_nt_reduced_partition.nex:</b> the partitioning schemes for concatenated_nt_reduced.phy. The file partitions the 95,076 nucleotide characters into 312 character sets, and defines the best substitution model for each character set. 
<b>8) concatenated_aa_complete.phy:</b> a complete concatenated amino acid dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12, corresponding to concatenated_cds_complete.phy. The file lists the sequences of 248 samples with 51,175 amino acid positions from 665 loci. Hyphens are used to represent gaps. <b>9) concatenated_aa_complete_partition.nex:</b> the partitioning schemes for concatenated_aa_complete.phy. The file partitions the 51,175 amino acid characters into 426 character sets, and defines the best substitution model for each character set. <b>10) concatenated_aa_reduced.phy:</b> a reduced concatenated amino acid dataset used for the maximum likelihood analysis by IQ-TREE v1.6.12, corresponding to concatenated_nt_reduced.phy. The file lists the sequences of 248 samples with 31,384 amino acid positions from 374 loci. Hyphens are used to represent gaps. <b>11) concatenated_aa_reduced_partition.nex:</b> the partitioning schemes for concatenated_aa_reduced.phy. The file partitions the 31,384 amino acid characters into 312 character sets, and defines the best substitution model for each character set. <b>12) Individual_gene_alignment.zip:</b> contains 426 FASTA files, each of which is an alignment for one gene. Hyphens are used to represent gaps. These files were used to construct gene trees using IQ-TREE v1.6.12, followed by multispecies coalescent analysis using ASTRAL v 4.10.5 based on the consensus trees with a minimum average bootstrap value of 70.
keywords: Auchenorrhyncha, Cicadomorpha, Membracoidea, anchored hybrid enrichment
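The alignment lengths listed for the complete Typhlocybinae datasets above are mutually consistent, which can be confirmed with a quick check. This is a minimal sketch using only the numbers quoted in the description; the variable names are illustrative, and the interpretation of the nucleotide difference as intron positions follows the "intron included" and "intron excluded" labels in the description.

```python
# Alignment lengths copied from the Typhlocybinae file descriptions.
nt_complete = 154_992   # concatenated_nt_complete.phy (introns included)
cds_complete = 153_525  # concatenated_cds_complete.phy (introns excluded)
aa_complete = 51_175    # concatenated_aa_complete.phy

# Each amino acid position corresponds to one codon of three nucleotides.
codons_match = aa_complete * 3 == cds_complete
print(codons_match)  # True

# Positions present only in the intron-included nucleotide alignment.
intron_positions = nt_complete - cds_complete
print(intron_positions)  # 1467
```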
published: 2022-03-25
 
The data are original electron micrographs from the lab of the late Dr. Burt Endo of the USDA. These data were digitized from photographic prints and glass plate negatives at 600 DPI as 16 bit TIFF files. This third version added 12 new ZIP files from the Endo data collection. "Endo folder database.xlsx" is updated to reflect the addition. Information in "ReadmeFile name formatting.docx" remains the same as in V2.
keywords: nematode; ultrastructure; Endo; electron microscopy
published: 2022-03-23
 
This dataset is an estimate of county-to-county commodity delivery through the cold chain in 2017. For each county pair, the weight [kg] and value [$] of the cold chain flow between origin and destination for SCTG 5 and SCTG 7 commodities are estimated by our model. SCTG 5: meat, poultry, fish, seafood, and their preparations. SCTG 7: other prepared foodstuffs, fats, and oils.
keywords: food flows; cold chain; county-scale; United States; carbon footprint
published: 2022-03-19
 
Raw arthroscopic scores, histologic scores, cytokine measurements, and performance data for the study cohort described in the accompanying publication.
keywords: horse; metatarsophalangeal joint; arthroscopy; exercise; developmental orthopedic disease
published: 2022-03-11
 
Data sets relating to the manuscript “Long-term yields in annual and perennial bioenergy crops in the Midwestern USA” published in Global Change Biology Bioenergy. Field data, including annual peak biomass and harvest yields from maize/soy, miscanthus, switchgrass, and prairie field trials from 2008-2018 are included. Peak and harvest biomass for fertilized and unfertilized miscanthus are included from 2014-2018.
keywords: miscanthus; switchgrass; yield; drought; crop; perennial; bioenergy
published: 2022-03-01
 
The following files were used to reconstruct the phylogeny of the leafhopper subfamily Deltocephalinae, using IQ-TREE v1.6.12 and ASTRAL v 4.10.5. <b>1) taxon_sampling.csv:</b> contains the sequencing ids (1st column) and the taxonomic information (2nd column) of each sample. Sequencing ids were used in the alignment files and partition files. <b>2) concatenated_nt.phy:</b> concatenated nucleotide alignment used for the maximum likelihood analysis of Deltocephalinae by IQ-TREE v1.6.12. The file lists the sequences of 163,365 nucleotide positions from 429 genes in 730 samples. Hyphens are used to represent gaps. <b>3) concatenated_nt_partition.nex:</b> the partitions for the concatenated nucleotide alignment. The file partitions the 163,365 nucleotide characters into 429 character sets, and defines the best substitution model for each character set. <b>4) concatenated_aa.phy:</b> concatenated amino acid alignment used for the maximum likelihood analysis of Deltocephalinae by IQ-TREE v1.6.12. The file gives the sequences of 53,969 amino acids from 429 genes in 730 samples. Hyphens are used to represent gaps. <b>5) concatenated_aa_partition.nex:</b> the partitions for the concatenated amino acid alignment. The file partitions the 53,969 characters into 429 character sets, and defines the best substitution model for each character set. <b>6) concatenated_nt_106taxa.phy:</b> a reduced concatenated nucleotide alignment representing 107 samples x 86 genes. This alignment is used to estimate the divergence times of Deltocephalinae using MCMCTree in PAML v4.9. The file lists the sequences of 79,239 nucleotide positions from 86 genes in 107 samples. Hyphens are used to represent gaps. <b>7) concatenated_nt_106taxa_partition.nex:</b> the partitions for the nucleotide alignment concatenated_nt_106taxa.phy. The file partitions the 79,239 nucleotide characters into 86 character sets, and defines the best substitution model for each character set. 
<b>8) individual_gene_alignment.zip:</b> contains 429 FAS files, one for each of the partitioned nucleotide character sets in the concatenated_nt_partition.nex file. Hyphens are used to represent gaps. These files were used to construct gene trees using IQ-TREE v1.6.12, followed by multispecies coalescent analysis using ASTRAL v 4.10.5.
published: 2022-02-20
 
This dataset contains the files used to perform the work savings and recall evaluation in the study titled "Data from Testing a filtering strategy for systematic reviews: Evaluating work savings and recall."
keywords: systematic reviews; machine learning; work savings; recall; search results filtering
published: 2022-02-14
 
Dataset associated with Allen et al. (In Review): Food caching by a solitary large carnivore supports optimal foraging theory. If using this dataset, please cite this manuscript.
published: 2022-02-14
 
This dataset contains simulation results from the numerical model PartMC-MOSAIC used in the article "Quantifying the effects of mixing state on aerosol optical properties". This article has been submitted to the journal Atmospheric Chemistry and Physics. There are 100 scenario directories in total in this dataset, numbered 00-99. Each scenario contains 25 NetCDF files of hourly output from PartMC-MOSAIC simulations, containing the simulated gas and particle information. The data were produced using version 2.5.0 of PartMC-MOSAIC. Instructions to compile and run PartMC-MOSAIC are available at https://github.com/compdyn/partmc. The chemistry code MOSAIC is available by request from Rahul.Zaveri@pnl.gov. For more details on reproducing the cases, please contact nriemer@illinois.edu and yuyao3@illinois.edu.
keywords: Aerosol mixing state; Aerosol optical properties; Mie calculation; Black Carbon
published: 2022-02-11
 
The Culex_Trivellone_etal.fas FASTA file contains the original final sequence alignment used in the haplotype analyses of Trivellone et al. (Frontiers in Public Health, under review). The 492 sequences (from specimens of the Culex pipiens complex collected in different habitat types using BG-Sentinel traps) were aligned using PASTA v1.8.5 under default settings. The final dataset contains 686 positions of the cytochrome c oxidase subunit I (COI) mitochondrial gene. The data analyses are further described in the cited original paper.
keywords: Culex; Culicidae; COI; mosquito surveillance, species assemblages
published: 2022-02-11
 
Upon treatment removal, spontaneous and random reactivation of latently infected T cells remains a major barrier toward curing HIV. Due to its stochastic nature, fluctuations in gene expression (or “noise”) can bias HIV reactivation from latency, and conventional drug screens for mean gene expression neglect compounds that modulate noise. Here we present a time-lapse fluorescence microscopy image set obtained from a Jurkat T-cell line, infected with a minimal HIV gene circuit, treated with 1,806 small molecule compounds, and imaged for 48 hours. In addition, the single-cell time-dependent reporter dynamics (single-cell gene expression intensity and noise trajectories) extracted from the image dataset are included. Based on this dataset, a total of 5 latency-promoting agents of HIV were found through further experimentation in Lu et al., PNAS 2021 (doi: 10.1073/pnas.2012191118). For a detailed description of the dataset, please refer to the readme file.
keywords: HIV; latency; drug screen; fluorescence microscopy; time-lapse; microscopy; single-cell data; noise; gene expression fluctuation
published: 2022-02-11
 
The data contain a list of articles given low scores by the RCT Tagger and an error analysis of them, which were used in a project associated with the manuscript "Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews". The change made in this V3 is that the data are divided into two parts: 1) Error Analysis of 44 Low Scoring Articles with MEDLINE RCT Publication Type; 2) Error Analysis of 244 Low Scoring Articles without MEDLINE RCT Publication Type.
keywords: Cochrane reviews; automation; randomized controlled trial; RCT; systematic reviews
published: 2022-02-10
 
The compiled datasets include plot-level observations of energy crops (miscanthus and switchgrass) from recent experimental field trials in the US, including dry biomass yield, location, state, region, harvest year, growing season degree days (GDD), winter season heating degree days (HDD), growing season cumulative precipitation, annual nitrogen application rate, age of the plant when harvested, National Commodity Crop Productivity Index (NCCPI) values, and cultivar type (switchgrass) from various published and unpublished sources. The Stata codes include estimation procedures for four different specifications: Model A includes deterministic effects without interaction terms; Model B includes deterministic effects with interaction terms (N², age², N × age, GDD², precip², N × NCCPI); Model C includes deterministic effects with interaction terms, plus study and location random effects; Model D includes deterministic effects with interaction terms, plus harvest year augmented study and location random effects.
keywords: Age; Miscanthus; Nitrogen; Switchgrass; Yield; Center for Advanced Bioenergy and Bioproducts Innovation
published: 2022-02-09
 
The data file contains a list of articles with PMID information, which were used in a project associated with the manuscript "Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews".
keywords: Cochrane reviews; Randomized controlled trials; RCT; Automation; Systematic reviews
published: 2022-02-09
 
The data file contains a list of articles and their RCT Tagger prediction scores, which were used in a project associated with the manuscript "Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews".
keywords: Cochrane reviews; automation; randomized controlled trial; RCT; systematic reviews
published: 2021-11-19
 
This is a general description of the datasets included in this upload; details of each dataset can be found in the individual README.txt in each compressed folder. We have: 1. ROSE-HF.tar.gz 2. ROSE-LF.tar.gz HF (high fragmentary): 50% of the sequences are made fragmentary, which have average lengths of 25% of the original lengths with a standard deviation of 60 bp. LF (low fragmentary): 25% of the sequences are made fragmentary, which have average lengths of 50% of the original lengths with a standard deviation of 60 bp. The seven ROSE datasets made fragmentary are: 1000L1, 1000L3, 1000L4, 1000M3, 1000S1, 1000S2 and 1000S4. "ROSE-HF.tar.gz" contains HF versions of the seven ROSE datasets. "ROSE-LF.tar.gz" contains LF versions of the seven ROSE datasets.
keywords: ROSE; simulation; fragmentary
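The HF and LF conditions described for the ROSE datasets above can be illustrated with a short Python sketch. This is not the authors' procedure: the function name `make_fragmentary`, the use of a normal distribution for fragment length, the random choice of start position, and the toy input sequences are all assumptions made for illustration. Only the fractions (50% and 25%), the mean length ratios (25% and 50%), and the 60 bp standard deviation come from the description.

```python
import random


def make_fragmentary(seqs, fraction, mean_ratio, sd_bp=60, seed=0):
    """Return a copy of `seqs` in which `fraction` of the sequences are trimmed.

    Hypothetical helper: fragment lengths are drawn from a normal distribution
    with mean `mean_ratio * len(seq)` and standard deviation `sd_bp`, then
    clipped to the range [1, len(seq)].
    """
    rng = random.Random(seed)
    names = sorted(seqs)
    chosen = set(rng.sample(names, round(fraction * len(names))))
    result = {}
    for name in names:
        seq = seqs[name]
        if name not in chosen:
            result[name] = seq  # left at full length
            continue
        target = round(rng.gauss(mean_ratio * len(seq), sd_bp))
        target = max(1, min(len(seq), target))
        start = rng.randint(0, len(seq) - target)  # assumed: random window
        result[name] = seq[start:start + target]
    return result


# Toy input: 20 unaligned sequences of 1000 bp each (made-up data).
toy = {f"seq{i:02d}": "ACGT" * 250 for i in range(20)}

hf = make_fragmentary(toy, fraction=0.50, mean_ratio=0.25)  # high fragmentary
lf = make_fragmentary(toy, fraction=0.25, mean_ratio=0.50)  # low fragmentary

hf_fragments = sum(len(s) < 1000 for s in hf.values())
lf_fragments = sum(len(s) < 1000 for s in lf.values())
```

For the actual fragmentation parameters and file layout, the README.txt inside each compressed folder remains the authoritative source.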
published: 2022-01-30
 
This dataset contains temperature measurements in four different bat box designs deployed in central Indiana, USA from May to September 2018. Hourly environmental data (temperature, solar radiation, and wind speed) are also included for days and hours sampled. Bat box temperature data were used as inputs in a free program, GNU Octave, to assess design performance with respect to suitability indices for endothermic metabolism and pup development. Scripts are included in the dataset.
keywords: bats; thermal refuge; reproduction; conservation; bat box; microclimate
published: 2022-02-07
 
This dataset provides estimates of agricultural and food commodity flows [kg] between all county pairs within the United States for the years 2007, 2012, and 2017. The database provides 206.3 million data points, since pairwise information is provided between 3134 counties, for 7 commodity categories, and 3 time periods. The commodity categories correspond to the Standard Classification of Transported Goods and are: SCTG 1: live animals and fish; SCTG 2: cereal grains; SCTG 3: agricultural products (except for animal feed, cereal grains, and forage products); SCTG 4: animal feed, eggs, honey, and other products of animal origin; SCTG 5: meat, poultry, fish, seafood, and their preparations; SCTG 6: milled grain products and preparations, and bakery products; SCTG 7: other prepared foodstuffs, fats and oils. For additional information, please see the related paper by Karakoc et al. (2022) in Environmental Research Letters.
keywords: food flows; high-resolution; county-scale; time-series; United States
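The figure of 206.3 million data points quoted in the food flows description above follows from the stated dimensions. The sketch below reproduces it, assuming that every ordered origin-destination pair is counted, including flows from a county to itself; that assumption is not stated in the description but is the one that matches the rounded total.

```python
# Dimensions copied from the dataset description.
counties = 3134
commodity_categories = 7   # SCTG 1 through SCTG 7
time_periods = 3           # 2007, 2012, 2017

# Assumption: all ordered county pairs, including same-county pairs.
county_pairs = counties ** 2
data_points = county_pairs * commodity_categories * time_periods

print(f"{data_points:,}")                   # 206,261,076
print(f"{data_points / 1e6:.1f} million")   # 206.3 million
```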
published: 2022-01-31
 
This dataset contains results from WRF simulations over northern South America. The Orinoco Low-Level Jet (OLLJ) and the Cross-Equatorial Moisture Transport are important circulation structures of the climate of tropical South America. We explore the sensitivity of the OLLJ and cross-equatorial transport to the representation of surface fluxes and turbulence by using two different Land Surface Model (LSM) schemes (Noah and CLM) and three Planetary Boundary Layer (PBL) schemes (YSU, QNSE and MYNN).
keywords: WRF; Orinoco LLJ; precipitation
published: 2022-01-27
 
Twenty-two genotypes of C4 species grown under ambient and elevated O3 concentrations were studied at the SoyFACE facility (40°02’N, 88°14’W) in 2019. This dataset contains leaf morphology, photosynthesis and nutrient contents measured at three time points. The results of CO2 response curves are also included.
keywords: C4, O3, photosynthesis