Illinois Data Bank Dataset Search Results
published:
2024-03-01
Chen, Chu-Chun; Dominguez, Francina
(2024)
This dataset contains model output from the Community Earth System Model, Version 1 (CESM1; Hurrell et al., 2013) and variables from the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5; Hersbach et al., 2020). These data were used for the analysis in “The location of large-scale soil moisture anomalies affects moisture transport and precipitation over southeastern South America”, published in Geophysical Research Letters.
Acknowledgments:
This work was supported by NSF Award AGS-1852709. We acknowledge high-performance computing support from Cheyenne (doi:10.5065/D6RX99HX) provided by NCAR's Computational and Information Systems Laboratory, sponsored by the NSF. We thank Dr. Haiyan Teng for providing guidance on setting up the CESM experiments and offering valuable advice.
References:
Hersbach H, Bell B, Berrisford P, et al. The ERA5 global reanalysis. Q J R Meteorol Soc. 2020; 146: 1999–2049. https://doi.org/10.1002/qj.3803
Hurrell, J. W., and Coauthors, 2013: The Community Earth System Model: A Framework for Collaborative Research. Bull. Amer. Meteor. Soc., 94, 1339–1360, https://doi.org/10.1175/BAMS-D-12-00121.1
keywords:
atmospheric sciences; climate modeling; land-atmosphere interactions; soil moisture; regional atmospheric circulation; southeastern South America
published:
2020-07-15
Legried, Brandon; Molloy, Erin K.; Warnow, Tandy; Roch, Sebastien
(2020)
This repository includes scripts and datasets for the paper, "Polynomial-Time Statistical Estimation of Species Trees under Gene Duplication and Loss."
keywords:
Species tree estimation; gene duplication and loss; identifiability; statistical consistency; quartets; ASTRAL
published:
2020-05-31
Zhang, Chuanyi; El-Kebir, Mohammed; Ochoa, Idoia
(2020)
This repository includes a simulated dataset and related scripts used for the paper "Moss: Accurate Single-Nucleotide Variant Calling from Multiple Bulk DNA Tumor Samples".
keywords:
Somatic Mutations; Bulk DNA Sequencing; Cancer Genomics
published:
2020-04-20
Supplemental data sets for the Manuscript entitled "Contribution of fungal and invertebrate communities to mass loss and wood depolymerization in tropical terrestrial and aquatic habitats"
keywords:
Coiba Island; wood decomposition; cellulose; hemicellulose; lignin breakdown; aquatic fungi
published:
2020-01-31
Bradshaw, Therin M.; Blake-Bradshaw, Abigail G.; Fournier, Auriel M.V.; Lancaster, Joseph D.; O'Connell, John; Jacques, Christopher N.; Eicholtz, Michael W.; Hagy, Heath M.
(2020)
Data inputs and scripts for the analysis detailed in Bradshaw et al., published in PLOS ONE (2020).
keywords:
Marsh birds; wetlands
published:
2020-06-19
This dataset includes data pulled from the World Bank (2009), the World Values Survey wave 6, and Transparency International (2009). The data were used to measure perceptions of expertise among individuals in nations that receive development aid, as measured by the World Bank.
keywords:
World Values Survey; World Bank; expertise; development
published:
2022-05-20
Haselhorst, Derek; Moreno, J. Enrique; Tcheng, David K.; Punyasena, Surangi W.
(2022)
This dataset includes images and annotated counts for 150 airborne pollen samples from the Center for Tropical Forest Science 50 ha forest dynamics plot on Barro Colorado Island, Panama. Samples were collected once a year from April 1994 to June 2010.
keywords:
aerial pollen traps; automated pollen identification; Barro Colorado Island; convolutional neural networks; Neotropics; palynology; phenology
published:
2011-09-20
Swenson, M. Shel; Suri, Rahul; Linder, C. Randal; Warnow, Tandy; Nguyen, Nam-Phuong; Mirarab, Siavash; Neves, Diogo Telmo; Sobral, João Luís; Pingali, Keshav; Nelesen, Serita; Liu, Kevin; Wang, Li-San
(2011)
This page provides the data for the SuperFine, DACTAL, and BeeTLe publications:
- Swenson, M. Shel, et al. "SuperFine: fast and accurate supertree estimation." Systematic biology 61.2 (2012): 214.
- Nguyen, Nam, Siavash Mirarab, and Tandy Warnow. "MRL and SuperFine+ MRL: new supertree methods." Algorithms for Molecular Biology 7 (2012): 1-13.
- Neves, Diogo Telmo, et al. "Parallelizing superfine." Proceedings of the 27th Annual ACM Symposium on Applied Computing. 2012.
- Nelesen, Serita, et al. "DACTAL: divide-and-conquer trees (almost) without alignments." Bioinformatics 28.12 (2012): i274-i282.
- Liu, Kevin, and Tandy Warnow. "Treelength optimization for phylogeny estimation." PLoS One 7.3 (2012): e33104.
published:
2019-12-20
Wang, Yu; Burgess, Steven J.; de Becker, Elsa; Long, Stephen P.
(2019)
This dynamic photosynthesis model of a soybean canopy was developed by Yu Wang (yuwangcn@illinois.edu), IGB, University of Illinois.
For more details, please see the following publication:
Yu Wang, Steven J. Burgess, Elsa de Becker, Stephen P. Long. Photosynthesis in the fleeting shadows: An overlooked opportunity for increasing crop productivity? The Plant Journal.
keywords:
Matlab; Soybean canopy; photosynthesis model
published:
2020-03-13
Sweet, Andrew; Johnson, Kevin; Cameron, Stephen
(2020)
Data files associated with the assembly of mitochondrial minicircles from five species of parasitic lice. This includes data from four species in the genus Columbicola and from the human louse (Pediculus humanus). The files include FASTA sequences for all five species, reference sequences for read mapping approaches, resulting contigs produced by various assembly approaches, and alignments of human louse minicircles mapped to published sequences of the same species.
keywords:
mitochondria; FASTA; nucleotide sequences; alignment; Columbicola; Pediculus
published:
2021-09-06
Airglow images and meteor radar data used in the paper "Mesospheric gravity wave activity estimated via airglow imagery, multistatic meteor radar, and SABER data taken during the SIMONe–2018 campaign".
keywords:
airglow; meteor radar; gravity waves; momentum flux
published:
2021-10-15
Peng, Jianhao; Ochoa, Idoia
(2021)
This is the synthetic expression file (5 states, 5,000 cells) that we used to validate SimiC, a single-cell gene regulatory network (GRN) inference method with similarity constraints. Ground-truth GRNs are stored in NumPy array format, and the expression profiles of all states combined are stored as a pandas DataFrame, saved as pickle files.
keywords:
NumPy array; GRNs; pandas DataFrame
published:
2016-05-16
This dataset contains the protein sequences and trees used to compare Non-Ribosomal Peptide Synthetase (NRPS) condensation domains in the AMB gene cluster and was used to create figure S1 in Rojas et al. 2015. Instead of having to collect representative sequences independently, this set of condensation domain sequences may serve as a quick reference set for coarse classification of condensation domains.
keywords:
NRPS; biosynthetic gene cluster; antimetabolite; Pseudomonas; oxyvinylglycine; secondary metabolite; thiotemplate; toxin
published:
2019-09-17
Fraebel, David T.; Kuehn, Seppe
(2019)
BAM files for evolved strains from migration rate selection experiments conducted in low-viscosity (0.2% w/v) agar plates containing M63 minimal medium with 1 mM of mannose, melibiose, N-acetylglucosamine, or galactose.
published:
2018-06-20
Lao, Yuyang; Caravelli, Francesco; Sheikh, Mohammed; Sklenar, Joseph; Gardeazabal, Daniel; Watts, Justin D.; Albrecht, Alan M.; Scholl, Andreas; Dahmen, Karin; Nisoli, Cristiano; Schiffer, Peter
(2018)
The dataset includes the data used in the study of Classical Topological Order in the Kinetics of Artificial Spin Ice. This includes the photoemission electron microscopy intensity measurement of artificial spin ice at different temperatures as a function of time. The data includes the raw data, the metadata, and the data cookbook. Please refer to the data cookbook for more information. Note: vertex_population.xlsx file in the meta_data_code folder can be disregarded.
keywords:
artificial spin ice; PEEM; topological order
published:
2019-05-20
Lao, Yuyang; Schiffer, Peter
(2019)
These are the experimental data for tetris artificial spin ice. The islands are made of Permalloy and measure 170 nm by 470 nm by 2.5 nm. The systems were measured around room temperature, at a temperature at which the islands are fluctuating. The data are recorded as photoemission electron microscopy intensity. More details about the dataset can be found in the files Note.txt and Tetris_data_list.xlsx.
Note:
Two files, named bl11_teris600_033 and bl11_tetris600_2_135, are not recorded in the Excel sheet because they were corrupted during the measurement. Any data not recorded in the Excel sheet are either corrupted or of low quality.
For files *_028 to *_049, "tetris" is spelled with the second "t", whereas in the raw data folder it is spelled without it ("teris"). This is a typo; throughout the dataset, "tetris" and "teris" have the same meaning.
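Since the two spellings refer to the same structure, file names can be normalized before raw files are matched against the entries in Tetris_data_list.xlsx. The following is a minimal sketch, not part of the dataset; the function name is hypothetical, and it assumes that the misspelling only ever appears as the literal substring "teris".

```python
def normalize_tetris_name(name: str) -> str:
    """Rewrite the misspelling 'teris' as 'tetris' in a file name.

    Correctly spelled names pass through unchanged, because the string
    'tetris' does not contain 'teris' as a substring.
    """
    return name.replace("teris", "tetris")


# The two file names below are the ones mentioned in the dataset note.
for raw_name in ["bl11_teris600_033", "bl11_tetris600_2_135"]:
    print(raw_name, "->", normalize_tetris_name(raw_name))
```

Normalizing both the raw folder names and the spreadsheet entries with the same function makes a plain string comparison sufficient for matching.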
keywords:
artificial spin ice
published:
2019-07-04
Sashittal, Palash; El-Kebir, Mohammed
(2019)
Results generated using SharpTNI on data collected from the 2014 Ebola outbreak in Sierra Leone.
published:
2019-08-05
Skinner, Rachel; Dietrich, Christopher; Walden, Kimberly; Gordon, Eric; Sweet, Andrew; Podsiadlowski, Lars; Petersen, Malte; Simon, Chris; Takiya, Daniela; Johnson, Kevin
(2019)
The data in this directory corresponds to:
Skinner, R.K., Dietrich, C.H., Walden, K.K.O., Gordon, E., Sweet, A.D., Podsiadlowski, L., Petersen, M., Simon, C., Takiya, D.M., and Johnson, K.P.
Phylogenomics of Auchenorrhyncha (Insecta: Hemiptera) using Transcriptomes: Examining Controversial Relationships via Degeneracy Coding and Interrogation of Gene Conflict.
Systematic Entomology.
Correspondence should be directed to: Rachel K. Skinner, rskinn2@illinois.edu
If you use these data, please cite our paper in Systematic Entomology.
The following files can be found in this dataset:
Amino_acid_concatenated_alignment.phy: the amino acid alignment used in this analysis in phylip format.
Amino_acid_raxml_partitions.txt (for reference only): the partitions for the amino acid alignment, but a partitioned amino acid analysis was not performed in this study.
Amino_acid_concatenated_tree.newick: the best maximum likelihood tree with bootstrap values in newick format.
ASTRAL_input_gene_trees.tre: the concatenated gene tree input file for ASTRAL
README_pie_charts.md: explains the scripts and data needed to recreate the pie charts figure from our paper. There is also another
Corresponds to the following files:
ASTRAL_species_tree_EN_only.newick: the species tree with only effective number (EN) annotation
ASTRAL_species_tree_pp1_only.newick: the species tree with only the posterior probability 1 (main topology) annotation
ASTRAL_species_tree_q1_only.newick: the species tree with only the quartet scores for the main topology (q1)
ASTRAL_species_tree_q2_only.newick: the species tree with only the quartet scores for the first alternative topology (q2)
ASTRAL_species_tree_q3_only.newick: the species tree with only the quartet scores for the second alternative topology (q3)
print_node_key_files.py: script needed to create the following files:
node_keys.key: text file with node IDs and topologies
complete_q_scores.key: text file with node IDs and multiplied q scores
EN_node_vals.key: text file with node IDs and EN values
create_pie_charts_tree.py: script needed to visualize the tree with pie charts, pp1, and EN values plotted at nodes
ASTRAL_species_tree_full_annotation.newick: the species tree with full annotation from the ASTRAL analysis.
NOTE: It may be more useful to examine individual value files if you want to visualize the tree,
e.g., in figtree, since the full annotations are extensive and can make viewing difficult.
Complete_NT_concatenated_alignment.phy: the nucleotide alignment that includes unmodified third codon positions. The alignment is in phylip format.
Complete_NT_raxml_partitions.txt: the raxml-style partition file of the nucleotide partitions
Complete_NT_concatenated_tree.newick: the best maximum likelihood tree from the concatenated complete NT analysis with bootstrap values in newick format
Complete_NT_partitioned_tree.newick: the best maximum likelihood tree from the partitioned complete NT analysis with bootstrap values in newick format
Degeneracy_coded_nt_concatenated_alignment.phy: the degeneracy coded nucleotide alignment in phylip format
Degeneracy_coded_nt_raxml_partitions.txt: the raxml-style partition file for the degeneracy coded nucleotide alignment
Degeneracy_coded_nt_concatenated_tree.newick: the best maximum likelihood tree from the degeneracy-coded concatenated analysis with bootstrap values in newick format
Degeneracy_coded_nt_partitioned_tree.newick: the best maximum likelihood tree from the degeneracy-coded partitioned analysis with bootstrap values in newick format
count_ingroup_taxa.py: script that counts the number of ingroup and/or outgroup taxa present in an alignment
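As an illustration of the kind of tally described for count_ingroup_taxa.py, the sketch below counts ingroup and outgroup taxa in a PHYLIP alignment. It is not the authors' script. It assumes a sequential, relaxed PHYLIP layout (a header line with the taxon and site counts, then one "name sequence" line per taxon) and a caller-supplied set of outgroup names; the function name and the example taxon labels are hypothetical.

```python
def count_taxa(phylip_text, outgroups):
    """Return (n_ingroup, n_outgroup) for a sequential PHYLIP alignment.

    phylip_text: alignment contents as a string.
    outgroups:   set of taxon names to treat as outgroup (an assumption;
                 the dataset does not define this list here).
    """
    lines = [ln for ln in phylip_text.splitlines() if ln.strip()]
    n_declared = int(lines[0].split()[0])          # taxon count from the header
    names = [ln.split()[0] for ln in lines[1:1 + n_declared]]
    n_out = sum(1 for name in names if name in outgroups)
    return len(names) - n_out, n_out


# Toy alignment with made-up labels, for illustration only.
toy = "3 4\ntaxonA ACGT\ntaxonB ACGA\noutgroup1 ACGG\n"
print(count_taxa(toy, {"outgroup1"}))
```

Interleaved PHYLIP files would need a different reader, since taxon names appear only in the first block.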
keywords:
Auchenorrhyncha; Hemiptera; alignment; trees
published:
2019-12-03
These are the alignments of transcriptome data used for the analysis of members of Heteroptera. This dataset is analyzed in "Deep instability in the phylogenetic backbone of Heteroptera is only partly overcome by transcriptome-based phylogenomics" published in Insect Systematics and Diversity.
keywords:
Heteroptera; Hemiptera; Phylogenomics; transcriptome
published:
2020-01-20
Zhang, Jun; Wuebbles, Donald; Kinnison, Douglas; Saiz López, Alfonso
(2020)
These datasets provide the basis of our analysis in the paper "Revising the Ozone Depletion Potentials for Short-Lived Chemicals such as CF3I and CH3I". All datasets here are model output (CAM4-chem). All simulations (background and perturbation) were run to steady state, and only the last-year outputs used in the analysis are archived here.
keywords:
Illinois Data Bank; NetCDF; Ozone Depletion Potential; CF3I and CH3I
published:
2020-11-05
Miller, Andrew; Raudabaugh, Daniel
(2020)
This version 2 dataset contains 34 files in total, including one (1) additional file called "Culture-dependent Isolate table with taxonomic determination and sequence data.csv". The remaining 33 files are identical to version 1. Information about the new file and its variables follows:
Culture-dependent Isolate table with taxonomic determination and sequence data.csv: Culture table with assigned taxonomy from NCBI. A single-direction sequence for each isolate is included if one could be obtained. Sequences are derived from ITS1F-ITS4 PCR amplicons, with Sanger sequencing in one direction using ITS5. The file contains 20 variables, explained below:
IsolateNumber: unique number identifying each cultured isolate
Time: season in which the sample was collected
Location: the specific name of the location
Habitat: type of habitat, either stream or peatland
State: state in the USA in which the specific location is located
Incubation_pH ID: pH of the medium during isolation of fungal cultures
Genus: phylogenetic genus of the fungal isolates (determined by sequence similarity)
Sequence_quality: base call quality of the entire sequence used for blast analysis, if known
%_coverage: sequence coverage reported from GenBank
%_ID: sequence similarity reported from GenBank
Life_style: ecological lifestyle, if known
Phylum: phylogenetic phylum as indicated by Index Fungorum
Subphylum: phylogenetic subphylum as indicated by Index Fungorum
Class: phylogenetic class as indicated by Index Fungorum
Subclass: phylogenetic subclass as indicated by Index Fungorum
Order: phylogenetic order as indicated by Index Fungorum
Family: phylogenetic family as indicated by Index Fungorum
ITS5_Sequence: single direction sequence used for sequence similarity match using blastn. Primer ITS5
Fasta: sequence with nomenclature in a fasta format for easy cut and paste into phylogenetic software
Note: blank cells mean no data is available or unknown.
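The variables above lend themselves to simple filtering with a CSV reader. The sketch below is an illustration only: it assumes the CSV header names match the variable names listed above exactly (IsolateNumber, Habitat, Genus, ITS5_Sequence), skips isolates whose sequence cell is blank, and builds FASTA records. The function name and the sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io


def isolates_to_fasta(csv_file, habitat=None):
    """Build FASTA text from the isolate table.

    csv_file: open text file (or any iterable of CSV lines).
    habitat:  optional filter, e.g. "stream" or "peatland".
    Rows with a blank ITS5_Sequence are skipped (blank means no data).
    """
    records = []
    for row in csv.DictReader(csv_file):
        seq = (row.get("ITS5_Sequence") or "").strip()
        if not seq:
            continue
        row_habitat = (row.get("Habitat") or "").strip().lower()
        if habitat is not None and row_habitat != habitat.lower():
            continue
        genus = (row.get("Genus") or "").strip() or "unknown"
        records.append(f">{row['IsolateNumber'].strip()}_{genus}\n{seq}")
    return "\n".join(records)


# Made-up rows for illustration; real values come from the CSV file.
sample = (
    "IsolateNumber,Habitat,Genus,ITS5_Sequence\n"
    "1,stream,GenusA,ACGT\n"
    "2,peatland,,\n"
    "3,peatland,GenusB,TTGA\n"
)
print(isolates_to_fasta(io.StringIO(sample), habitat="peatland"))
```

The table also provides a ready-made Fasta column, so this conversion is only needed when a custom header or a filtered subset is wanted.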
keywords:
ITS1 forward reads; Illumina; peatlands; streams; bogs; fens
published:
2019-05-10
Pradhan, Dikshant; Jensen, Paul
(2019)
Data necessary for production of figures presented in "Efficient enzyme coupling algorithms identify functional pathways in genome-scale metabolic models" by Pradhan et al.
keywords:
Efficient enzyme coupling algorithms identify functional pathways in genome-scale metabolic models
published:
2019-12-03
This is the data set associated with the manuscript titled "Extensive host-switching of avian feather lice following the Cretaceous-Paleogene mass extinction event." Included are the gene alignments used for phylogenetic analyses and the cophylogenetic input files.
keywords:
phylogenomics; cophylogenetics; feather lice; birds
published:
2012-07-01
Mirarab, Siavash; Nguyen, Nam-Phuong; Warnow, Tandy
(2012)
This dataset provides the data for Mirarab, Siavash, Nam Nguyen, and Tandy Warnow. "SEPP: SATé-enabled phylogenetic placement." Biocomputing 2012. 2012. 247-258.
published:
2019-06-12
Miller, Andrew; Raudabaugh, Daniel
(2019)
This dataset contains supplemental data sets for the manuscript entitled "Where are they hiding? Testing the body snatchers hypothesis in pyrophilous fungi."
Environmental sampling: Amplification of the nuclear DNA regions (ITS1 and ITS2) was completed using the Fluidigm Access Array, and the resulting amplicons were sequenced on an Illumina MiSeq v2 platform using rapid 2 × 250 nt paired-end reads. The amplicons for the Illumina sequencing run were size-selected into <500 nt and >500 nt sub-pools, then remixed (<500 nt : >500 nt) by nM concentration in a 1x:3x proportion. All amplification and sequencing steps were performed at the Roy J. Carver Biotechnology Center at the University of Illinois Urbana-Champaign.
ITS1 region primers consisted of ITS1F (5'-CTTGGTCATTTAGAGGAAGTAA-3') and ITS2 (5'-GCTGCGTTCTTCATCGATGC-3').
ITS2 region primers consisted of fITS7 (5'-GTGARTCATCGAATCTTTG-3') and ITS4 (5'-TCCTCCGCTTATTGATATGC-3').
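When searching reads for these primers, it is often useful to have their reverse complements as well. The sketch below is an illustration, not part of the published workflow: it reverse-complements the four primer sequences quoted above, using the standard IUPAC complement table so that the degenerate base R in fITS7 is handled. The function and variable names are hypothetical.

```python
# Standard IUPAC nucleotide complements (R<->Y, K<->M, B<->V, D<->H; S, W, N unchanged).
IUPAC_COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")

# Primer sequences as listed in the dataset description (5' to 3').
PRIMERS = {
    "ITS1F": "CTTGGTCATTTAGAGGAAGTAA",
    "ITS2": "GCTGCGTTCTTCATCGATGC",
    "fITS7": "GTGARTCATCGAATCTTTG",
    "ITS4": "TCCTCCGCTTATTGATATGC",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence with IUPAC codes."""
    return seq.upper().translate(IUPAC_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}\t{seq}\t{reverse_complement(seq)}")
```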
Supplemental files 1 through 5 contain the raw data files.
Supplemental 1 contains the ITS1 Illumina MiSeq forward reads, and Supplemental 2 contains the corresponding index files.
Supplemental 3 contains the ITS2 Illumina MiSeq forward reads, and Supplemental 4 contains the corresponding index files.
Supplemental 5 is the map file needed to process the forward reads and index files in QIIME.
Supplemental 6 and 7 contain the resulting QIIME 1.9.1 OTU tables, along with UNITE, NCBI, and CONSTAX taxonomic assignments and the representative OTU sequences.
Numeric samples within the OTU tables correspond to the following:
1 Brachythecium sp.
2 Usnea cornuta
3 Dicranum sp.
4 Leucodon julaceus
5 Lobaria quercizans
6 Rhizomnium sp.
7 Dicranum sp.
8 Thuidium delicatulum
9 Myelochroa aurulenta
10 Atrichum angustatum
11 Dicranum sp.
12 Hypnum sp.
13 Atrichum angustatum
14 Hypnum sp.
15 Thuidium delicatulum
16 Leucobryum sp.
17 Polytrichum commune
18 Atrichum angustatum
19 Atrichum angustatum
20 Atrichum crispulum
21 Bryaceae
22 Leucobryum sp.
23 Conocephalum conicum
24 Climacium americanum
25 Atrichum angustatum
26 Huperzia serrata
27 Polytrichum commune
28 Diphasiastrum sp.
29 Anomodon attenuatus
30 Bryoandersonia sp.
31 Polytrichum commune
32 Thuidium delicatulum
33 Brachythecium sp.
34 Leucobryum glaucum
35 Bryoandersonia sp.
36 Anomodon attenuatus
37 Pohlia sp.
38 Cinclidium sp.
39 Hylocomium splendens
40 Polytrichum commune
41 negative control
42 Soil
43 Soil
44 Soil
45 Soil
46 Soil
47 Soil
If a sample number is not present within the OTU table, either no sequences were obtained or no sequences passed the quality filtering step in QIIME.
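A quick way to see which samples dropped out is to compare the sample numbers found in an OTU table against the full range 1 to 47 listed above. The sketch below is an assumption-laden illustration: it supposes the OTU table's column headers include the numeric sample IDs as strings, alongside non-numeric columns such as an OTU identifier, and the function name is hypothetical.

```python
def missing_samples(column_headers, n_samples=47):
    """Return the sorted sample numbers (1..n_samples) absent from the headers.

    Non-numeric headers (e.g. an OTU ID or taxonomy column) are ignored.
    """
    present = {int(h) for h in column_headers if str(h).strip().isdigit()}
    return sorted(set(range(1, n_samples + 1)) - present)


# Made-up header row for illustration only.
example_headers = ["OTU_ID", "1", "2", "4", "taxonomy"]
print(missing_samples(example_headers, n_samples=5))
```

Any number reported here corresponds to a sample for which no sequences were obtained or none passed quality filtering.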
Supplemental 8 contains the Summary of unique species per location.