Illinois Data Bank Dataset Search Results

published: 2018-05-06
 
This deposit contains all raw data and analysis from the paper "In-cell titration of small solutes controls protein stability and aggregation". The data are grouped into several types:
1) analysis*.tar.gz contain the analysis scripts and the resulting data for each cell. The numbers correspond to the numbers shown in Fig. S1 of the publication.
2) scripts.tar.gz contains helper scripts to create the dataset in bash format.
3) input.tar.gz contains headers and other information that is fed into the bash scripts to create the dataset.
4) All rawData*.tar.gz are tarballs of the data of cells in different solutes, stored as .mat files readable by MATLAB, as follows:
- Each experiment included in the publication is represented by two MATLAB files: (1) a calibration jump under amber illumination (_calib.mat suffix) and (2) a full jump under blue illumination (FRET data).
- Each file contains the following fields:
    coordleft - coordinates of the cropped and aligned acceptor channel on the original image
    coordright - coordinates of the cropped and aligned donor channel on the original image
    dataleft - a 3D 12-bit integer matrix containing acceptor channel fluorescence for each pixel and time step. Not available in _calib files.
    dataright - a 3D 12-bit integer matrix containing donor channel fluorescence for each pixel and time step. This will be mCherry in _calib files and AcGFP in data files.
    frame1 - original image size
    imgstd - cropped dimensions
    numFrames - number of frames in dataleft and dataright
    videos - a structure containing camera data. Specifically, videos.TimeStamp includes the time of each frame.
keywords: Live cell; FRET microscopy; osmotic challenge; intracellular titrations; protein dynamics
published: 2022-05-13
 
The files are plain text and contain the original data used in phylogenetic analyses of Typhlocybinae (Bin, Dietrich, Yu, Meng, Dai and Yang 2022: Ecology & Evolution, in press). The three files with extension .phy are text files with aligned sequences in the standard PHYLIP format and correspond to Matrix 1 (amino acid alignment), Matrix 2 (nucleotide alignment of first two codon positions of protein-coding genes) and Matrix 3 (nucleotide alignment of protein-coding genes plus 2 ribosomal genes) described in the Methods section. An additional text file in NEXUS format (.nex extension) contains the morphological character data used in the ancestral state reconstruction (ASCR) analysis described in the Methods. NEXUS is a standard format used by various phylogenetic analysis software. For more information on data file content, see the included "readme" files.
keywords: Hemiptera; phylogeny; mitochondrial genome; morphology; leafhopper
published: 2020-06-03
 
This dataset provides files for use in analysis of human land preference across Australasia, and in a localized analysis of land preference in Laos and Vietnam. All files can be imported into ArcGIS for visualization, and re-analyzed using the open source Maxent species distribution modeling program. CSV files contain known human presence sites for model validation. ASC files contain geographically coded environmental data for mean annual temperature and mean annual precipitation during the Last Glacial Maximum, as well as downward slope data. All ASC files are in the WGS 1984 Mercator map projection for visualization in ArcGIS and can be opened as text files in text editors supporting large file sizes.
keywords: human dispersal; ecological niche modeling; Australasia; Late Pleistocene; land preference
published: 2019-08-13
 
Multiple sequence alignments from concatenated nuclear and mitochondrial genes and resulting phylogenetic tree files of fruit doves and their close relatives. Files include: BEAST input XML file (fruit_dove_beast_input.xml); a maximum clade credibility tree from a BEAST analysis (fruit_dove_beast_mcc.tre); concatenated multiple sequence alignment NEXUS files for the novel dataset (fruit_dove_concatenated_alignment.nex, 76 taxa, 4,277 characters) and the dataset with additional sequences (fruit_dove_plus_cibois_data_concatenated_alignment.nex, 204 taxa, 4,277 characters), both of which contain a MrBayes block including partition information; and 50% majority-rule consensus trees generated from MrBayes analyses, using the NEXUS alignment files as inputs (fruit_dove_mrbayes_consensus.tre, fruit_dove_plus_cibois_data_mrbayes_consensus.tre).
keywords: fruit doves; multiple sequence alignment; phylogeny; Aves: Columbidae
published: 2020-06-02
 
The text file contains the original data used in the phylogenetic analyses of Xue et al. (2020: Systematic Entomology, in press). The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The first six lines of the file identify the file as NEXUS, indicate that the file contains data for 89 taxa (species) and 2676 characters, indicate that the first 2590 characters are DNA sequence and the last 86 are morphological, that gaps inserted into the DNA sequence alignment and inapplicable morphological characters are indicated by a dash, and that missing data are indicated by a question mark. The file contains aligned nucleotide sequence data for 5 gene regions and 86 morphological characters. The positions of data partitions are indicated in the mrbayes block of commands for the phylogenetic program MrBayes at the end of the file (Subset1 = 16S gene; Subset2 = 28S gene; Subset3 = COI gene; Subset 4 = Histone H3 and H2A genes). The mrbayes block also contains instructions for MrBayes on various non-default settings for that program. These are explained in the original publication. Descriptions of the morphological characters and more details on the species and specimens included in the dataset are provided in the supplementary document included as a separate pdf, also available from the journal website. The original raw DNA sequence data are available from NCBI GenBank under the accession numbers indicated in the supplementary file.
keywords: phylogeny; DNA sequence; morphology; Insecta; Hemiptera; Cicadellidae; leafhopper; evolution; 28S rDNA; 16S rDNA; histone H3; histone H2A; cytochrome oxidase I; Bayesian analysis
published: 2021-10-27
 
The shared dataset consists of 16S sequencing data of microbial communities. Each community is composed of heterotrophic bacteria derived from one of two soil samples and the model alga Chlamydomonas reinhardtii. Each community was placed in a materially closed environment with an initial supply of carbon in the media and subjected to light-dark cycles. The closed microbial ecosystems (CES) survived via carbon cycling. Each CES was subjected to rounds of dilution, after which the community was sequenced (data provided here). The shared dataset allowed us to conclude that CES consistently self-assembled to cycle carbon (data not provided) via conserved metabolic capabilities (data not provided) despite differences in taxonomic composition (data provided).
Naming convention: [soil sample = A or B][CES replicate = 1,2,3, or 4]_[round number = 1,2,3,or 4]_[reverse read = R or forward read = F]_filt.fastq
Example: A1_r1_F_filt.fastq means soil sample A, CES replicate 1, end of round 1, forward read
keywords: 16S seq; .fastq; closed microbial ecosystems; carbon cycling
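The naming convention for the 16S read files can be turned into a small parser, which may help when grouping reads by soil sample, replicate, or round. The following is only a sketch: the function name `parse_ces_filename` and the `CesRead` type are hypothetical, and the `r` prefix on the round number is taken from the example name `A1_r1_F_filt.fastq` rather than from the bracketed template.

```python
import re
from typing import NamedTuple


class CesRead(NamedTuple):
    """Fields encoded in a CES read file name (hypothetical helper type)."""
    soil_sample: str   # "A" or "B"
    replicate: int     # CES replicate, 1 to 4
    round_number: int  # dilution round, 1 to 4
    direction: str     # "forward" or "reverse"


# Pattern inferred from the example A1_r1_F_filt.fastq; the "r" before the
# round number is an assumption based on that example.
_NAME_PATTERN = re.compile(r"^([AB])([1-4])_r([1-4])_([FR])_filt\.fastq$")


def parse_ces_filename(name: str) -> CesRead:
    """Split a file name such as 'A1_r1_F_filt.fastq' into its fields."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"file name does not follow the convention: {name!r}")
    soil, replicate, round_number, direction = match.groups()
    return CesRead(
        soil_sample=soil,
        replicate=int(replicate),
        round_number=int(round_number),
        direction="forward" if direction == "F" else "reverse",
    )
```

Names that do not match the pattern raise `ValueError`, so unexpected files are reported rather than silently skipped.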
published: 2018-07-29
 
This repository includes scripts, datasets, and supplementary materials for the study, "NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from GitHub: https://github.com/ekmolloy/njmerge.
***When downloading datasets, please note the following errors.***
In README.txt, lines 37 and 38 should read:
+ fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre
+ fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre
Note that the file names (fasttree-exon.tre and fasttree-intron.tre) are swapped.
In tools.zip, the compare_trees.py and compare_tree_lists.py scripts incorrectly refer to the "symmetric difference error rate" as the "Robinson-Foulds error rate". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.
In njmerge-supplementary-materials.pdf, the alpha parameter shown in Supplementary Table S2 is actually the divisor D, which is used to compute alpha for each gene as follows.
1. For each gene, a random value X between 0 and 1 is drawn from a uniform distribution.
2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).
Note that because the mean of the uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns.
keywords: phylogenomics; species trees; incomplete lineage sorting; divide-and-conquer
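The two-step alpha computation in the NJMerge erratum can be sketched as follows. This is an illustration rather than the authors' code: it assumes `log` is the natural logarithm (which matches the stated value -log(0.5) / 1.0 = 0.69), and the names `DIVISORS`, `alpha_from_uniform`, and `draw_alpha` are hypothetical.

```python
import math
import random

# Divisor D per locus type, as stated in the erratum for Table S2.
DIVISORS = {"exon": 4.2, "UCE": 1.0, "intron": 0.4}


def alpha_from_uniform(x: float, divisor: float) -> float:
    """Compute alpha = -log(X) / D for a value X in (0, 1].

    Assumes the natural logarithm.
    """
    if not 0.0 < x <= 1.0:
        raise ValueError("x must lie in (0, 1]")
    return -math.log(x) / divisor


def draw_alpha(divisor: float, rng: random.Random) -> float:
    """Draw X uniformly and convert it to alpha for one gene."""
    x = 1.0 - rng.random()  # random() is in [0, 1), so x is in (0, 1] and log(0) is avoided
    return alpha_from_uniform(x, divisor)


# Alpha evaluated at X = 0.5 for each locus type, for comparison with the
# values quoted in the erratum.
alpha_at_half = {locus: alpha_from_uniform(0.5, d) for locus, d in DIVISORS.items()}
```

Evaluating at X = 0.5 gives roughly 0.165 for exons, 0.693 for UCEs, and 1.733 for introns, in line with the figures quoted in the note.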
published: 2023-03-08
 
A stochastic domination analysis model was developed to examine the effect that emerging carbon markets can have on the spatially varying returns and risk profiles of bioenergy crops relative to conventional crops. The code is written in MATLAB, and includes the calculated output. See the README file for instructions to run the code.
keywords: bioenergy crops; economic modeling; stochastic domination analysis model
published: 2019-03-25
 
This dataset contains genotypic and phenotypic data, R scripts, and the results of analysis pertaining to a multi-location field trial of Miscanthus sinensis. Genome-wide association and genomic prediction were performed for biomass yield and 14 yield-component traits across six field trial locations in Asia and North America, using 46,177 single-nucleotide polymorphism (SNP) markers mined from restriction site-associated DNA sequencing (RAD-seq) and 568 M. sinensis accessions. Genomic regions and candidate genes were identified that can be used for breeding improved varieties of M. sinensis, which in turn will be used to generate new M. × giganteus clones for biomass.
keywords: miscanthus; genotyping-by-sequencing (GBS); genome-wide association studies (GWAS); genomic selection
published: 2024-02-15
 
This dataset includes the data used to estimate bat density from acoustic recordings, along with the R code. The data support a publication by Meredith L. Hoggatt, Clarissa A. Starbuck, and Joy M. O'Keefe entitled "Acoustic monitoring yields informative bat population density estimates."
keywords: acoustics; bats; monitoring; population density; random encounter model
published: 2022-02-11
 
Upon treatment removal, spontaneous and random reactivation of latently infected T cells remains a major barrier toward curing HIV. Because gene expression is stochastic, fluctuations in gene expression (or “noise”) can bias HIV reactivation from latency, and conventional drug screens for mean gene expression neglect compounds that modulate noise. Here we present a time-lapse fluorescence microscopy image set obtained from a Jurkat T-cell line, infected with a minimal HIV gene circuit, treated with 1,806 small molecule compounds, and imaged for 48 hours. In addition, the single-cell time-dependent reporter dynamics (single-cell gene expression intensity and noise trajectories) extracted from the image dataset are included. Based on this dataset, a total of 5 latency-promoting agents of HIV were found through further experimentation in Lu et al., PNAS 2021 (doi: 10.1073/pnas.2012191118). For a detailed description of the dataset, please refer to the readme file.
keywords: HIV; latency; drug screen; fluorescence microscopy; time-lapse; microscopy; single-cell data; noise; gene expression fluctuation
published: 2022-03-11
 
Data sets relating to the manuscript “Long-term yields in annual and perennial bioenergy crops in the Midwestern USA” published in Global Change Biology Bioenergy. Field data, including annual peak biomass and harvest yields from maize/soy, miscanthus, switchgrass, and prairie field trials from 2008-2018 are included. Peak and harvest biomass for fertilized and unfertilized miscanthus are included from 2014-2018.
keywords: miscanthus; switchgrass; yield; drought; crop; perennial; bioenergy
published: 2024-04-05
 
The following files include specimen information, DNA sequence data, and additional information on the analyses used to reconstruct the phylogeny of the leafhopper genus Neoaliturus as described in the Methods section of the original paper:
1. Taxon_sampling.csv contains data on the individual specimens from which DNA was extracted, including sample code, taxon name, collection data (locality, date and name of collector) and museum unique identifier.
2. Alignments.zip is a ZIP archive containing 432 separate FASTA files representing the aligned nucleotide sequences of individual gene loci used in the analysis.
3. Concatenated_Matrix.fa is a FASTA file containing the concatenated individual gene alignments used for the maximum likelihood analysis in IQ-TREE.
4. Genes_and_Loci.rtf identifies the individual genes and loci used in the analysis. The partition name is the same as the name of the individual alignment file in the zipped Alignments folder.
5. Partitions_best_scheme.nex is a text file in the standard NEXUS format that indicates the names of the individual data partitions and their locations in the concatenated matrix, and also indicates the substitution model for each partition.
6. (New in this version 2) Scripts & Description.zip includes 8 custom shell or Perl scripts used to assemble the DNA sequence data by performing reciprocal BLAST searches between the reference sequences and assemblies for each sample, extracting the best sequences based on the BLAST searches, screening the hits for each locus and keeping only the best result, and generating the nucleotide sequence dataset for the predicted orthologues (see the file description.txt for details).
7. (New in this version 2) Full_genetic_distances_matrix.csv shows the genetic distances between pairs of samples in the dataset (proportion of nucleotides that differ between samples).
keywords: leafhopper; phylogeny; anchored-hybrid-enrichment; DNA sequence; insect
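The genetic distance described for Full_genetic_distances_matrix.csv, the proportion of nucleotides that differ between two samples, can be illustrated with a short sketch. The function name `p_distance` is hypothetical, and skipping alignment columns that contain a gap or missing-data symbol is an assumption; the dataset description does not say how such columns were treated.

```python
def p_distance(seq_a: str, seq_b: str, ignore: str = "-?N") -> float:
    """Proportion of compared sites at which two aligned sequences differ.

    Columns where either sequence has a symbol listed in `ignore` are
    skipped (an assumption, not taken from the dataset description).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in ignore or b in ignore:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared
```

For example, `p_distance("ACGT", "ACGA")` compares four sites and finds one difference, giving 0.25.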
published: 2021-09-17
 
We studied vegetation metric robustness to environmental (season, interannual, and regional) and methodological (observer) variables, as well as adequate sample size for vegetation metrics across four regions of the United States.
keywords: coefficients of conservatism; floristic quality assessment; restoration; vegetation metric
published: 2022-03-31
 
This dataset contains our bi-hourly temperature recordings from 40 rocket box style artificial roosts of 5 designs deployed in Indiana and Kentucky, USA from April through September 2019. This dataset also includes our endothermic and facultatively heterothermic daily energy expenditure datasets used in our bioenergetic analysis, which were calculated from the bi-hourly rocket box temperature data. Lastly, we include our overheating counts dataset, which summarizes daily overheating events (i.e., temperatures > 40 Celsius) in each rocket box style bat box over the course of the study period; these daily summaries were also calculated from the bi-hourly rocket box temperature recordings.
keywords: artificial roost; bat box; microclimate; temperature
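A daily overheating summary of the kind described for the rocket box data could be derived from the temperature recordings roughly as follows. This is a sketch under assumptions: it counts individual readings above 40 Celsius per box and day, which may not be how the authors defined an "event", and the function name and example readings are hypothetical (the readings are not values from the dataset).

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, Tuple

OVERHEAT_THRESHOLD_C = 40.0  # threshold stated in the dataset description

# (box id, timestamp, temperature in Celsius)
Reading = Tuple[str, datetime, float]


def daily_overheating_counts(readings: Iterable[Reading]) -> Dict[Tuple[str, date], int]:
    """Count readings strictly above the threshold for each (box, day).

    Days without any overheating still appear, with a count of 0.
    """
    counts: Dict[Tuple[str, date], int] = defaultdict(int)
    for box_id, timestamp, temp_c in readings:
        counts[(box_id, timestamp.date())] += int(temp_c > OVERHEAT_THRESHOLD_C)
    return dict(counts)


# Hypothetical readings for illustration only.
example_readings = [
    ("box01", datetime(2019, 7, 1, 12, 0), 38.5),
    ("box01", datetime(2019, 7, 1, 14, 0), 41.2),
    ("box01", datetime(2019, 7, 1, 16, 0), 40.0),
    ("box01", datetime(2019, 7, 2, 14, 0), 35.0),
]
example_counts = daily_overheating_counts(example_readings)
```

With these example readings, only the 41.2 value exceeds the threshold, so 1 July has a count of 1 and 2 July has a count of 0.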
published: 2023-04-02
 
Use of cellulosic biofuels from non-feedstocks is modeled using the Biofuel and Environmental Policy Analysis Model (BEPAM) to quantify the uncertainties about induced land use change effects, net greenhouse gas saving potential, and economic costs. The code is written in GAMS, a general algebraic modeling language. NOTE: Column 3 is titled "BAU" in "merged_BAU.gdx", "merged_RFS.gdx", and "merged_CEM.gdx", but contains "RFS" data in "merged_RFS.gdx" and "CEM" data in "merged_CEM.gdx".
keywords: cellulosic biomass; BEPAM; economic modeling
published: 2021-07-15
 
The dataset contains the high-throughput matrix-assisted laser desorption/ionization mass spectrometry XML files for the atrial gland and red hemiduct of Aplysia californica.
keywords: Dense-core vesicle; High-throughput; Mass Spectrometry; MALDI; Organelle; Image-Guided; Atrial gland; red hemiduct; Lucent Vesicle
published: 2021-04-06
 
These datasets contain modeling files and GIS data associated with a risk assessment study for the Cambrian-Ordovician sandstone aquifer system in Illinois from predevelopment (1863) to the year 2070. Modeling work was completed using the Illinois Groundwater Flow Model, a regional MODFLOW model developed for water supply planning in Illinois, as a base model. The model is run using the graphical user interface Groundwater Vistas 7.0. The development and technical details of the base Illinois Groundwater Flow Model, including hydraulic property zonation, boundary conditions, hydrostratigraphy, solver settings, and discretization, are described in Abrams et al. (2018). Modifications to this base model (the version presented here) are described in Mannix et al. (2018), Hadley et al. (2020) and Abrams and Cullen (2020). Modifications include removal of particular multi-aquifer wells to improve calibration, changing Sandwich Fault Zone properties to achieve calibration at production wells within and near the fault zone, and the incorporation of demand scenarios based on a participatory modeling project with the Southwest Water Planning Group. The zipped folder of model files contains MODFLOW input (package) files, Groundwater Vistas files, and a head file for the entire model run. The zipped folder of GIS data contains rasters of: simulated drawdown in the St. Peter sandstone from predevelopment to 2018, simulated drawdown in the Ironton-Galesville sandstone from predevelopment to 2018, simulated head difference between the St. Peter and Ironton-Galesville sandstone units in 2018, simulated head above the top of the St. Peter sandstone for the years 2029, 2050, and 2070, and simulated head above the top of the Ironton-Galesville sandstone for the years 2029, 2050, and 2070. Raster outputs were derived directly from the simulated heads in the Illinois Groundwater Flow Model. 
Rasters are clipped to the eight-county northeastern Illinois region (Cook, DuPage, Grundy, Kane, Kendall, Lake, McHenry, and Will counties). Well names, historic and current head targets, and spatial offsets for the Illinois Groundwater Flow Model are available upon request via a data license agreement. Please contact the authors to set this up if needed.
keywords: groundwater; aquifer; sandstone aquifer; risk assessment; depletion; Illinois; MODFLOW; modeling
published: 2023-08-11
 
This dataset contains leaf photosynthetic and biochemical traits, plant biomass, and yield in five C3 crops (chickpea, rice, snap bean, soybean, wheat) and four C4 crops (sorghum, maize, Miscanthus × giganteus, switchgrass) grown under ambient and elevated O3 concentration ([O3]) in the field at free-air O3 concentration enrichment (O3-FACE) facilities over the past 20 years.
keywords: C3 and C4 crops; elevated O3; FACE; photosynthesis; yield
published: 2021-02-25
 
Total nitrogen leaching rates were calculated over the Mississippi Atchafalaya River Basin (MARB) using an integrated economic-biophysical modeling approach. Land allocation for corn production and total nitrogen application rates were calculated for crop reporting districts using the Biofuel and Environmental Policy Analysis Model (BEPAM) for 5 RFS2 policy scenarios. These were used as input in the Integrated BIosphere Simulator-Agricultural Version (Agro-IBIS) and the Terrestrial Hydrologic Model with Biogeochemistry (THMB) to calculate the nitrogen loss. Land allocation and total nitrogen application were simulated for the period 2016-2030 for 303 crop reporting districts (https://www.nass.usda.gov/Data_and_Statistics/County_Data_Files/Frequently_Asked_Questions/county_list.txt). The final 2030 values are reported here. Both are stored in CSV files. Units are million ha for land allocation and million kg for nitrogen application. The nitrogen leaching rates were modeled with a spatial resolution of 5' x 5' using the North American Datum of 1983 projection and stored in NetCDF files. The 30-year average is calculated over the last 30 years of the 45 years being simulated. Leaching rates are calculated in kg-N/ha.
keywords: nitrogen leaching; bioethanol; bioenergy crops
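The averaging window described for the leaching rates (the last 30 of 45 simulated years) amounts to a trailing mean over an annual series. A minimal sketch, assuming one value per simulated year in chronological order; the function name `trailing_mean` is hypothetical and the input below is a placeholder series, not model output.

```python
from typing import Sequence


def trailing_mean(annual_values: Sequence[float], window: int = 30) -> float:
    """Average the final `window` entries of a chronological annual series."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(annual_values) < window:
        raise ValueError("series is shorter than the averaging window")
    tail = annual_values[-window:]
    return sum(tail) / len(tail)


# Placeholder 45-year series (values 0..44), for illustration only.
placeholder_series = [float(year_index) for year_index in range(45)]
placeholder_average = trailing_mean(placeholder_series)
```

For a gridded product, the same function would be applied to the annual series of each 5' x 5' cell.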
published: 2019-07-11
 
We studied the effect of windstorm disturbance on forest invasive plants in southern Illinois. This data includes raw data on plant abundance at survey points, compiled data used in statistical analyses, and spatial data for surveyed plots and units. This file package also includes a readme.doc file that describes the data in detail, including attribute descriptions.
keywords: tornado; blowdowns; derecho; invasive plants; Shawnee National Forest; southern Illinois
published: 2022-03-30
 
This dataset is associated with a larger manuscript published in 2022 in the Illinois Natural History Survey Bulletin to summarize all known records for nonindigenous aquatic mollusks in Illinois, and full sources are referenced within the manuscript. We examined museum holdings, literature accounts, and publicly available databases, namely the U.S. Geological Survey (USGS) Nonindigenous Aquatic Species program (http://nas.er.usgs.gov/) and InvertEBase (invertebase.org). We also included sporadic field survey data on nonindigenous aquatic mollusk species encountered by colleagues within the Illinois Natural History Survey, Illinois Department of Natural Resources, U.S. Fish and Wildlife Service, county forest preserve districts, and other natural resource agencies. Lastly, we examined the role and utility of citizen-science data to document occurrences of nonindigenous aquatic mollusk species. We queried iNaturalist (www.inaturalist.org) for all available nonindigenous freshwater mollusk data for Illinois.
Table heading descriptions (if not intuitive) are:
“INHS verified” is whether an INHS staff member verified the record by observing a vouchered specimen or photograph;
“Source” is where a record was accessed or obtained;
“individualCount” is the number collected or observed in a record;
“MuseumCode” is the standard museum abbreviation or acronym;
“Institution” is the source that housed or reported a record, and this also includes the spelled-out museum code;
“Collectors” typically indicates who collected the specimen or voucher;
“Lat_Long determined by” denotes whether collection coordinates were stated by the collector or by a curator (using inference from data available);
“fieldNumber” typically indicates a unique field number that a collector may have used in the field;
“identifiedBy” typically explains who identified a specimen or verified a specimen identification.
keywords: Illinois; Exotic species; Non-native aquatic species; NAS; Aquatic Invasive Species; AIS; Mollusk
published: 2023-12-19
 
Data for the Appendices of Bush et al. article published in Ecology and Evolution. Contains genomic analysis information for a strain of Aspergillus flavus isolated from bee bread in East Central Illinois.
keywords: Excel; UIUC; Evolution and Ecology; Aspergillus flavus; genome
published: 2022-05-16
 
This dataset is for the publication "Do Nearctic hover flies (Diptera: Syrphidae) engage in long-distance migration? An assessment of evidence and mechanisms." It consists of 11 Excel spreadsheets and 4 R scripts which correspond to the analyses which were conducted. Paper abstract: Long-distance insect migration is poorly understood despite its tremendous ecological and economic importance. As a group, Nearctic hover flies (Diptera: Syrphidae: Syrphinae), which are crucial pollinators as adults and biological control agents as larvae, are almost entirely unrecognized as migratory despite examples of highly migratory behavior among several Palearctic species. Here, we examined evidence and mechanisms of migration for four hover fly species (Allograpta obliqua, Eupeodes americanus, Syrphus rectus, and Syrphus ribesii) common throughout eastern North America using stable hydrogen isotope (δ2H) measurements of chitinous tissue, morphological assessments, abundance estimations, and cold-tolerance assays. While further studies are needed, non-local isotopic values obtained from hover fly specimens collected in central Illinois support the existence of long-distance fall migratory behavior in Eu. americanus, and to a lesser extent S. ribesii and S. rectus. Elevated abundance of Eu. americanus during the expected autumn migratory period further supports the existence of such behavior. Moreover, high phenotypic plasticity of morphology associated with dispersal coupled with significant differences between local and non-local specimens suggest that Eu. americanus exhibits a unique suite of morphological traits that decrease costs associated with long-distance flight. Finally, compared to the ostensibly non-migratory A. obliqua, Eu. americanus was less cold tolerant, a factor that may be associated with migratory behavior. 
Collectively, our findings imply that fall migration occurs in Nearctic hover flies, but we consider methodological limitations of our study in addition to potential ecological and economic consequences of these novel findings.
keywords: Insect migration; hover fly; Syrphidae; stable isotopes; deuterium; morphometrics; cold tolerance