Illinois Data Bank
University Library, University of Illinois at Urbana-Champaign
Displaying datasets 51 - 75 of 397 in total
Gupta, Maya; Zaharias, Paul; Warnow, Tandy (2021): Data from: Accurate Large-scale Phylogeny-Aware Alignment using BAli-Phy. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7863273_V1
This repository includes scripts and datasets for the paper, "Accurate Large-scale Phylogeny-Aware Alignment using BAli-Phy" submitted to Bioinformatics.
BAli-Phy;Bayesian co-estimation;multiple sequence alignment
Jackson, Nicole; Konar, Megan; Debaere, Peter; Sheffield, Justin (2021): Data for "Crop-specific exposure to extreme temperature and moisture for the globe for the last half century". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5457902_V1
Global assessments of climate extremes typically do not account for the unique characteristics of individual crops. A consistent definition of the exposure of specific crops to extreme weather would enable agriculturally-relevant hazard quantification. We introduce the Agriculturally-Relevant Exposure to Shocks (ARES) model, a novel database of both the temperature and moisture extremes facing individual crops by explicitly accounting for crop characteristics. Specifically, we estimate crop-specific temperature and moisture shocks during the growing season for a 0.25-degree spatial grid and daily time scale from 1961-2014 globally for 17 crops. The resulting database presented here provides annual crop- and event-specific exposure rates. Both gridded and country-level exposure rates are provided for each of the 17 crops. Our results provide new insights into the changes in the magnitude as well as spatial and temporal distribution of extreme events that impact crops over the past half-century. For additional information, please see the related paper by Jackson et al. (2021) in Environmental Research Letters.
Crop-specific; weather extremes; temperature; moisture; global; gridded; time series
Woods, Nathan (2021): RISRS Problems and Opportunities Dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2831687_V1
An Atlas.ti dataset and accompanying documentation of a thematic analysis of problems and opportunities associated with retracted research and its continued citation.
Retraction; Citation; Problems and Opportunities
Torvik, Vetle; Smalheiser, Neil (2021): Author-ity 2018 - PubMed author name disambiguated dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2273402_V1
Author-ity 2018 dataset
Prepared by Vetle Torvik, Apr. 22, 2021

The dataset is based on a snapshot of PubMed taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018). It contains a total of 29.1 million Article records and 114.2 million author name instances. Each instance of an author name is uniquely represented by the PMID and the position on the paper (e.g., 10786286_3 is the third author name on PMID 10786286). Thus, each cluster is represented by a collection of author name instances. The instances were first grouped into "blocks" by last name and first name initial (including some close variants), and then each block was separately subjected to clustering. The resulting clusters are provided in two different formats, the first in a file with only IDs and PMIDs, and the second in a file with cluster summaries:

#################### File 1: au2id2018.tsv ####################
Each line corresponds to an author name instance (PMID and Author name position) with an Author ID. It has the following tab-delimited fields:
1. Author ID
2. PMID
3. Author name position

######################## File 2: authority2018.tsv #########################
Each line corresponds to a predicted author-individual represented by a cluster of author name instances and a summary of all the corresponding papers and author name variants. Each cluster has a unique Author ID (the PMID of the earliest paper in the cluster and the author name position). The summary has the following tab-delimited fields:
1. Author ID (or cluster ID), e.g., 3797874_1 represents a cluster where 3797874_1 is the earliest author name instance.
2. cluster size (number of author name instances on papers)
3. name variants separated by '|' with counts in parentheses. Each variant has the format lastname_firstname middleinitial, suffix
4. last name variants separated by '|'
5. first name variants separated by '|'
6. middle initial variants separated by '|' ('-' if none)
7. suffix variants separated by '|' ('-' if none)
8. email addresses separated by '|' ('-' if none)
9. ORCIDs separated by '|' ('-' if none). From the 2019 ORCID Public Data File (https://orcid.org/) and from PubMed XML
10. range of years (e.g., 1997-2009)
11. Top 20 most frequent affiliation words (after stoplisting and tokenizing; some phrases are also made) with counts in parentheses; separated by '|' ('-' if none)
12. Top 20 most frequent MeSH (after stoplisting) with counts in parentheses; separated by '|' ('-' if none)
13. Journal names with counts in parentheses (separated by '|')
14. Top 20 most frequent title words (after stoplisting and tokenizing) with counts in parentheses; separated by '|' ('-' if none)
15. Co-author names (lowercased lastname and first/middle initials) with counts in parentheses; separated by '|' ('-' if none)
16. Author name instances (PMID_auno separated by '|')
17. Grant IDs (after normalization; '-' if none given; separated by '|')
18. Total number of times cited. (Citations are based on references harvested from open sources such as PMC.)
19. h-index
20. Citation counts (e.g., for h-index): PMIDs by the author that have been cited (with total citation counts in parentheses); separated by '|'
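The field conventions described above ('|'-separated values, counts in parentheses, '-' for none, and PMID_position instance IDs) could be read with a few small helpers. This is a minimal sketch, not code shipped with the dataset: the function names are hypothetical, and the exact placement of the count directly after each value (e.g. `value(3)`) is an assumption inferred from the description.

```python
import re

# Assumed shape of a counted item: some text followed by "(<integer>)".
_COUNTED_ITEM = re.compile(r"^(.*)\((\d+)\)$")


def parse_instance(instance_id):
    """Split an author name instance such as '10786286_3' into (PMID, position)."""
    pmid, position = instance_id.split("_")
    return int(pmid), int(position)


def parse_counted_list(field):
    """Parse 'value(count)|value(count)' into a dict; '-' means no values."""
    if field == "-":
        return {}
    counts = {}
    for item in field.split("|"):
        match = _COUNTED_ITEM.match(item.strip())
        if match:  # silently skip items that do not follow the assumed shape
            counts[match.group(1).strip()] = int(match.group(2))
    return counts


def parse_au2id_line(line):
    """Parse one au2id2018.tsv line: Author ID, PMID, author name position."""
    author_id, pmid, position = line.rstrip("\n").split("\t")
    return author_id, int(pmid), int(position)
```

For example, `parse_instance("10786286_3")` yields the PMID and author position from the example in the description, and `parse_counted_list` can be applied to any of the counted fields (name variants, affiliation words, MeSH, title words, co-authors) if they follow the assumed layout.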
author name disambiguation; PubMed
Rapti, Zoi (2021): Temperate and chronic virus competition. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0705058_V1
All code is in MATLAB .m scripts or functions (version R2019b) and is affiliated with the article “Temperate and chronic virus competition leads to low lysogen frequency”, published in the Journal of Theoretical Biology (2021). The code simulates and plots the solutions of an ordinary differential equations model and generates bifurcation diagrams.
Xia, Yushu; Wander, Michelle (2021): Response of Soil Quality Indictors including β-glucosidase, Fluorescein Diacetate Hydrolysis and Permanganate Oxidizable Carbon. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2865725_V3
Dataset compiled by Yushu Xia and Michelle Wander for the Soil Health Institute. Data were recovered from peer-reviewed literature reporting results for three soil quality indicators (SQIs) (β-glucosidase (BG), fluorescein diacetate (FDA) hydrolysis, and permanganate oxidizable carbon (POXC)) in terms of their relative response to management, where soils under grassland cover, no-tillage, cover crops, residue return and organic amendments were compared to conventionally managed controls. Peer-reviewed articles published between January 1990 and May 2018 were searched using the Thomson Reuters Web of Science database (Thomson Reuters, Philadelphia, Pennsylvania) and Google Scholar to identify studies reporting results for: “β-glucosidase”, “permanganate oxidizable carbon”, “active carbon”, “readily oxidizable carbon”, and “fluorescein diacetate hydrolysis”, together with one or more of the following: “management practice”, “tillage”, “cover crop”, “residue”, “organic fertilizer”, or “manure”. Records were tabulated to compare SQI abundance in soil maintained under a control and a soil aggrading practice, with the intent to contribute to SQI databases that will support development of interpretive frameworks and/or algorithms, including pedo-transfer functions relating indicator abundance to management practices and site-specific factors. Meta-data include the following key descriptor variables and covariates useful for development of scoring functions: 1) identifying factors for the study site (location, year of initiation of study and year in which data were reported), 2) soil textural class, pH, and SOC, 3) depth and timing of soil sampling, 4) analytical methods for SQI quantification, 5) units used in published works (i.e. equivalent mass, concentration), 6) SQI abundances, and 7) statistical significance of difference comparisons. *Note: Blank values in tables are considered unreported data.
Soil health promoting practices; Soil quality indicators; β-glucosidase; fluorescein diacetate hydrolysis; Permanganate oxidizable carbon; Greenhouse gas emissions; Scoring curves; Soil Management Assessment Framework
Lyu, Fangzheng; Kang, Jeon-Young; Wang, Shaohua; Han, Su; Li, Zhiyu; Wang, Shaowen; Padmanabhan, Anand (2021): Multi-scale CyberGIS Analytics for Detecting Spatiotemporal Patterns of COVID-19. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0299659_V1
This dataset contains all the code, notebooks, and datasets used in the study conducted for the research publication titled "Multi-scale CyberGIS Analytics for Detecting Spatiotemporal Patterns of COVID-19 Data". Specifically, this package includes the artifacts used to conduct spatial-temporal analysis with space-time kernel density estimation (STKDE) using COVID-19 data, which should help readers reproduce some of the analysis and learn about the methods used in the associated book chapter.
## What’s inside
A quick explanation of the components of the zip file:
* Multi-scale CyberGIS Analytics for Detecting Spatiotemporal Patterns of COVID-19.ipynb is a Jupyter notebook for this project. It contains code for preprocessing, space-time kernel density estimation, postprocessing, and visualization.
* data is a folder containing all data needed for the notebook.
* data/county.txt: US county information and FIPS codes from the Natural Resources Conservation Service.
* data/us-counties.txt: County-level COVID-19 data collected from the New York Times COVID-19 GitHub repository on August 9th, 2020.
* data/covid_death.txt: COVID-19 death information derived from the preprocessing step, which prepares the input data for STKDE. Each record is of the following format: (fips, spatial_x, spatial_y, date, number of deaths).
* data/stkdefinal.txt: result obtained by conducting STKDE.
* wolfram_mathmatica is a folder for 3D visualization code.
* wolfram_mathmatica/Visualization.nb: code for visualization of the STKDE result via Wolfram Mathematica.
* img is a folder for figures.
* img/above.png: result of 3-D visualization, above view.
* img/side.png: result of 3-D visualization, side view.
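To illustrate what a space-time kernel density estimate computes from records like those in data/covid_death.txt, the following is a minimal sketch under stated assumptions. It is not the notebook's implementation: the Epanechnikov product kernel, the weighting of each record by its death count, the conversion of dates to numeric days, and the names `stkde`, `hs` (spatial bandwidth), and `ht` (temporal bandwidth) are all illustrative choices.

```python
def epanechnikov(u):
    """One-dimensional Epanechnikov kernel; zero outside |u| <= 1."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def stkde(x, y, t, events, hs, ht):
    """Space-time kernel density at (x, y, t).

    events: iterable of (xi, yi, ti, weight) tuples, where ti is a numeric
            time (e.g. days since a reference date) and weight is assumed
            here to be the death count of the record.
    hs, ht: spatial and temporal bandwidths (assumed parameters).
    """
    events = list(events)
    total_weight = sum(w for _, _, _, w in events)
    if total_weight == 0:
        return 0.0
    acc = 0.0
    for xi, yi, ti, w in events:
        # Product kernel: two spatial factors and one temporal factor.
        acc += (
            w
            * epanechnikov((x - xi) / hs)
            * epanechnikov((y - yi) / hs)
            * epanechnikov((t - ti) / ht)
        )
    # Normalize by total weight and by the bandwidth volume hs^2 * ht.
    return acc / (total_weight * hs * hs * ht)
```

Evaluating `stkde` over a regular grid of (x, y, t) points would give a density volume of the kind that a 3-D visualization can render; the bandwidths control how far each record's influence spreads in space and time.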
CyberGIS; COVID-19; Space-time kernel density estimation; Spatiotemporal patterns
Xia, Yushu; Wander, Michelle; Kwon, Hoyoung (2021): County-level Data of Nitrogen Fertilizer and Manure Inputs for Corn Production in the United States. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3112432_V1
This dataset includes five files developed using the procedures described by the article 'Developing County-level Data of Nitrogen Fertilizer and Manure Inputs for Corn Production in the United States' and Supplemental Information published in the Journal of Cleaner Production in 2021.
Corn; Nitrogen Fertilizer; Manure; Conterminous US
Mischo, William (2021): Scopus API Scripts for Data Reuse Project. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0988473_V1
To generate the bibliographic and survey data to support a data reuse study conducted by several Library faculty and accepted for publication in the Journal of Academic Librarianship, the project team utilized a series of web-based online scripts that employed several different endpoints from the Scopus API. The related dataset, "Data for: An Examination of Data Reuse Practices within Highly Cited Articles of Faculty at a Research University", contains the survey design and results.
1) getScopus_API_process_dmp_IDB.asp: used the search API to query the Scopus database for papers by UIUC authors published in 2015, limited to one of 9 pre-defined Scopus subject areas, and retrieve metadata results sorted highest to lowest by the number of times the retrieved articles were cited. The URL for the basic searches took the following form: https://api.elsevier.com/content/search/scopus?query=(AFFIL%28(urbana%20OR%20champaign) AND univ*%29) OR (AF-ID(60000745) OR AF-ID(60005290))&apikey=xxxxxx&start=" & nstart & "&count=25&date=2015&view=COMPLETE&sort=citedby-count&subj=PHYS
Here, the variable nstart was incremented by 25 each iteration and 25 records were retrieved in each pass. The subject area was changed (e.g. from PHYS to COMP for computer science) in each of the 9 runs. This script does not use the Scopus API cursor but downloads 25 records at a time for up to 28 times, or 675 maximum bibliographic records. The project team felt that looking at the 675 most cited articles from UIUC faculty in each of the 9 subject areas was sufficient to gather a robust, representative sample of articles from 2015. These downloaded records were stored in a temporary table that was renamed for each of the 9 subject areas.
2) get_citing_from_surveys_IDB.asp: takes a Scopus article ID (eid) from the 49 surveys returned by UIUC authors and retrieves short citing article references, 200 at a time, into a temporary composite table. These citing records contain only one author, no author affiliations, and no author email addresses. This script uses the Scopus API cursor=* feature and is able to download all the citing references of an article 200 records at a time.
3) put_in_all_authors_affil_IDB.asp: adds important data to the short citing records. The script adds all co-authors and their affiliations, the corresponding author, and author email addresses.
4) process_for_final_IDB.asp: creates a relational database table with author, title, and source journal information for each of the citing articles that can be copied as an Excel file for processing by the Qualtrics survey software. This was initially 4,626 citing articles over the 49 UIUC authored articles, but was reduced to 2,041 entries after checking for available email addresses and eliminating duplicates.
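The offset-based paging described for the first script (incrementing the start value by 25 per pass, without the API cursor) can be sketched as follows. This is an illustration in Python rather than the original ASP code; the function name `build_search_urls` is hypothetical, the query string is the decoded form of the URL quoted above, and the number of pages is simply derived from the stated 675-record ceiling. The sketch only builds URLs and makes no network requests.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.elsevier.com/content/search/scopus"

# Decoded form of the affiliation query quoted in the description.
QUERY = (
    "(AFFIL((urbana OR champaign) AND univ*)) "
    "OR (AF-ID(60000745) OR AF-ID(60005290))"
)


def build_search_urls(subject, api_key, page_size=25, max_records=675):
    """Return one search URL per page, stepping the start offset by page_size."""
    urls = []
    for nstart in range(0, max_records, page_size):
        params = {
            "query": QUERY,
            "apikey": api_key,
            "start": nstart,
            "count": page_size,
            "date": 2015,
            "view": "COMPLETE",
            "sort": "citedby-count",
            "subj": subject,
        }
        # urlencode percent-encodes more characters than the original URL did.
        urls.append(f"{BASE_URL}?{urlencode(params)}")
    return urls
```

Calling `build_search_urls("PHYS", "xxxxxx")` and then repeating with another subject code such as `"COMP"` mirrors the per-subject runs; a real API key would be needed to execute the requests.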
Scopus API; Citing Records; Most Cited Articles
Urco Cordero, Juan M.; Kamalabadi, Farzad; Kamaci, Ulas; Harding, Brian J.; Frey, Harald U.; Mende, Stephen B.; Huba, Joe D.; England, Scott L.; Immel, Thomas J. (2021): Data for Conjugate photoelectron energy spectra derived from coincident FUV and radio measurements. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2215727_V1
Conjugate photoelectron energy spectra derived from coincident FUV and radio measurements. These are outputs of simulations from the semi-empirical SAMI2-PE (Varney et al. 2012) for the night of January 4, 2020.
Conjugate photoelectrons, SAMI2-PE, ICON
Park, Minhyuk; Zaharias, Paul; Warnow, Tandy (2021): Disjoint Tree Mergers for Large-Scale Maximum Likelihood Tree Estimation. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7008049_V1
This dataset contains RNASim1000, Cox1-Het datasets as well as analyses of RNASim1000, Cox1-Het, and 1000M1(HF).
phylogeny estimation; maximum likelihood; RAxML; IQ-TREE; FastTree; cox1; heterotachy; disjoint tree mergers; Tree of Life
Larsen, Ryan J.; Gagoski, Borjan; Morton, Sarah U.; Ou, Yangming; Vyas, Rutvi; Litt, Jonathan; Grant, P. Ellen; Sutton, Bradley P. (2021): Dataset for "Quantification of Magnetic Resonance Spectroscopy data using a combined reference: Application in typically developing infants". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3548139_V1
Magnetic Resonance Spectroscopy; quantification; combined reference; waters scaling; infant development; GABA
Cattai de Godoy, Maria (2021): Miscanthus grass as a novel functional fiber source in extruded feline diets. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3595148_V1
- The aim of this research was to evaluate the novel dietary fiber source, miscanthus grass, in comparison to traditional fiber sources, and their effects on the microbiota of healthy adult cats. Four dietary treatments, cellulose (CO), miscanthus grass fiber (MF), a blend of miscanthus fiber and tomato pomace (MF+TP), or beet pulp (BP) were evaluated.
- The study was conducted using a completely randomized design with twenty-eight neutered adult, domesticated shorthair cats (19 females and 9 males, mean age 2.2 ± 0.03 yr; mean body weight 4.6 ± 0.7 kg, mean body condition score 5.6 ± 0.6). Total DNA from fresh fecal samples was extracted using Mo-Bio PowerSoil kits (MO BIO Laboratories, Inc., Carlsbad, CA). Amplification of the 292 bp-fragment of V4 region from the 16S rRNA gene was completed using a Fluidigm Access Array (Fluidigm Corporation, South San Francisco, CA). Paired-end Illumina sequencing was performed on a MiSeq using v3 reagents (Illumina Inc., San Diego, CA) at the Roy J. Carver Biotechnology Center at the University of Illinois.
- Filenames are composed of animal name identifier, diet (BP= beet pulp; CO= cellulose; MF= miscanthus grass fiber; TP= blend of miscanthus fiber and tomato pomace).
cats; dietary fiber; fecal microbiota; miscanthus grass; nutrient digestibility; postbiotics
Cattai de Godoy, Maria (2021): Use of legumes and yeast as novel dietary protein sources in extruded canine diets. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4677176_V1
- The objective of this study was to evaluate macronutrient apparent total tract digestibility (ATTD), gastrointestinal tolerance, and fermentative end-products in extruded canine diets.
- Five diets were formulated to be isocaloric and isonitrogenous with either garbanzo beans (GBD), green lentils (GLD), peanut flour (PFD), dried yeast (DYD), or poultry by-product meal (CON) as the primary protein sources. Ten adult, intact, female beagles (mean age: 4.2 ± 1.1 yr, mean weight: 11.9 ± 1.3 kg) were used in a replicated, 5x5 Latin square design with 14 d periods. Total DNA from fresh fecal samples was extracted using Mo-Bio PowerSoil kits (MO BIO Laboratories, Inc., Carlsbad, CA). Amplification of the 292 bp-fragment of V4 region from the 16S rRNA gene was completed using a Fluidigm Access Array (Fluidigm Corporation, South San Francisco, CA). Paired-end Illumina sequencing was performed on a MiSeq using v3 reagents (Illumina Inc., San Diego, CA) at the Roy J. Carver Biotechnology Center at the University of Illinois.
- Filenames are composed of animal name identifier, diet (CON=control; DY= dried yeast; GB= garbanzo beans; GL= green lentils; PF= peanut flour) and period replicate number (P1, P2, P3, P4, and P5).
Dog; Digestibility; Legume; Microbiota; Pulse; Yeast
Hadley, Daniel; Abrams, Daniel; Mannix, Devin; Cullen, Cecilia (2021): Model files and GIS data for risk assessment in the Cambrian-Ordovician sandstone aquifer system, Northeastern Illinois, predevelopment-2070. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4350211_V1
These datasets contain modeling files and GIS data associated with a risk assessment study for the Cambrian-Ordovician sandstone aquifer system in Illinois from predevelopment (1863) to the year 2070. Modeling work was completed using the Illinois Groundwater Flow Model, a regional MODFLOW model developed for water supply planning in Illinois, as a base model. The model is run using the graphical user interface Groundwater Vistas 7.0. The development and technical details of the base Illinois Groundwater Flow Model, including hydraulic property zonation, boundary conditions, hydrostratigraphy, solver settings, and discretization, are described in Abrams et al. (2018). Modifications to this base model (the version presented here) are described in Mannix et al. (2018), Hadley et al. (2020) and Abrams and Cullen (2020). Modifications include removal of particular multi-aquifer wells to improve calibration, changing Sandwich Fault Zone properties to achieve calibration at production wells within and near the fault zone, and the incorporation of demand scenarios based on a participatory modeling project with the Southwest Water Planning Group. The zipped folder of model files contains MODFLOW input (package) files, Groundwater Vistas files, and a head file for the entire model run. The zipped folder of GIS data contains rasters of: simulated drawdown in the St. Peter sandstone from predevelopment to 2018, simulated drawdown in the Ironton-Galesville sandstone from predevelopment to 2018, simulated head difference between the St. Peter and Ironton-Galesville sandstone units in 2018, simulated head above the top of the St. Peter sandstone for the years 2029, 2050, and 2070, and simulated head above the top of the Ironton-Galesville sandstone for the years 2029, 2050, and 2070. Raster outputs were derived directly from the simulated heads in the Illinois Groundwater Flow Model. 
Rasters are clipped to the 8 county northeastern Illinois region (Cook, DuPage, Grundy, Kane, Kendall, Lake, McHenry, and Will counties). Well names, historic and current head targets, and spatial offsets for the Illinois Groundwater Flow Model are available upon request via a data license agreement. Please contact authors to set this up if needed.
groundwater; aquifer; sandstone aquifer; risk assessment; depletion; Illinois; MODFLOW; modeling
Uelmen, Johnny (2021): Data for Dynamics of data availability in disease modeling: An example evaluating the trade-offs of ultra-fine-scale factors to human West Nile virus disease models in the Chicago area, USA. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5901636_V1
West Nile virus data, aggregated by 55 1-km hexagons, within the NWMAD jurisdiction, Cook County, IL. The data incorporate deidentified human illness, mosquito infection and abundance, socio-economic data, and other abiotic and biotic predictors by epi-weeks 18-38 for the years 2005-2016.
Riemer, Nicole; Yao, Yu; Dawson, Matthew; Dabdub, Donald (2021): Data for: Evaluating the impacts of cloud processing on resuspended aerosol particles after cloud evaporation using a particle-resolved model. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-8367769_V1
This dataset contains simulation results from PartMC-MOSAIC-CAPRAM used in the article "Evaluating the impacts of cloud processing on resuspended aerosol particles after cloud evaporation using a particle-resolved model". There are seven folders: one for the urban plume simulation, which provides the initial particle population for cloud processing; four for the four simulated cloud cycles; and two for the coagulation cases. Within the urban plume simulation folder, there are 25 NetCDF files, output hourly from PartMC-MOSAIC simulations, containing the gas and particle information. Within the four cloud cycle folders, there are 25 subdirectories that contain the cloud processing results for the aerosol population from the urban plume environment. Each subdirectory holds 30 NetCDF files, output every minute from PartMC-MOSAIC-CAPRAM simulations, containing aerosol and gas information after aqueous chemistry. The remaining two folders are for the cases considering Brownian coagulation and sedimentation coalescence. Each contains 90 NetCDF files, produced by repeating the 30-minute simulations three times to account for the randomness of coagulation. This dataset was used to investigate the effects of cloud processing on aerosol mixing state and CCN properties.
cloud process; coagulation; aqueous chemistry; aerosol mixing state; CCN
Smirnov, Vladimir (2021): Datasets used in "Recursive MAGUS: scalable and accurate multiple sequence alignment". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1048258_V1
This archive contains the datasets used in the paper "Recursive MAGUS: scalable and accurate multiple sequence alignment".
- 16S.3, 16S.T, 16S.B.ALL
- HomFam
- RNASim
These can also be found at https://sites.google.com/eng.ucsd.edu/datasets/alignment/pastaupp
Zhao, Yifan; Sharif, Hashim; Adve, Vikram; Misailovic, Sasa (2021): ApproxTuner DNN Models. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6565690_V1
DNN weights used in the evaluation of the ApproxTuner system. Link to paper: https://dl.acm.org/doi/10.1145/3437801.3446108
planned publication date: 2021-11-16
Prada, Cecilia M.; Turner, Benjamin L.; Dalling, James W. (2021): Seedling traits in oak and mix stands. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7636863_V1
Data from a field experiment at El Velo, Chiriqui, Republic of Panama. The data contain information about functional traits of seedlings growing in different treatments, including type of forest, nitrogen addition and organic matter.
Mycorrhiza; nitrogen; oak forest; Panama; plant-soil feedbacks; seedling growth
Imker, Heidi J; Luong, Hoa; Mischo, William H; Schlembach, Mary C; Wiley, Chris (2021): Data for: An Examination of Data Reuse Practices within Highly Cited Articles of Faculty at a Research University. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2087785_V1
This dataset was developed as part of a study that assessed data reuse. Through bibliometric analysis, corresponding authors of highly cited papers published in 2015 at the University of Illinois at Urbana-Champaign in nine STEM disciplines were identified and then surveyed to determine if data were generated for their article and their knowledge of reuse by other researchers. Second, the corresponding authors who cited those 2015 articles were identified and surveyed to ascertain whether they reused data from the original article and how that data was obtained. The project goal was to better understand data reuse in practice and to explore if research data from an initial publication was reused in subsequent publications.
data reuse; data sharing; data management; data services; Scopus API
planned publication date: 2022-01-01
Cao, Yanghui; Dietrich, Christopher H. (2022): Datasets for "Phylogenomics of flavobacterial insect nutritional endosymbionts with implications for the phylogeny of their hosts". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-7486289_V1
The file “Fla.fasta”, comprising 10526 positions, is the concatenated amino acid alignments of 51 orthologues of 182 bacterial strains. It was used for the maximum likelihood and maximum parsimony analyses of Flavobacteriales. Bacterial species names and strains were used as the sequence names, host names of insect endosymbionts were shown in brackets. The file “16S.fasta” is the alignment of 233 bacterial 16S rRNA sequences. It contains 1455 positions and was used for the maximum likelihood analysis of flavobacterial insect endosymbionts. The names of endosymbiont strains were replaced by the name of their hosts. In addition to the species names, National Center for Biotechnology Information (NCBI) accession numbers were also indicated in the sequence names (e.g., sequence “Cicadellidae_Deltocephalinae_Macrostelini_Macrosteles_striifrons_AB795320” is the 16S rRNA of Macrosteles striifrons (Cicadellidae: Deltocephalinae: Macrostelini) with a NCBI accession number AB795320). The file “Sulcia_pep.fasta” is the concatenated amino acid alignments of 131 orthologues of “Candidatus Sulcia muelleri” (Sulcia). It contains 41970 positions and presents 101 Sulcia strains and 3 Blattabacterium strains. This file was used for the maximum likelihood analysis of Sulcia. The file “Sulcia_nucleotide.fasta” is the concatenated nucleotide alignment corresponding to the sequences in “Sulcia_pep.fasta” but also comprises the alignment of 16S rRNA. It has 127339 positions and was used for the maximum likelihood and maximum parsimony analyses of Sulcia. Individual gene alignments (16S rRNA and 131 orthologues of Sulcia and Blattabacterium) are deposited in the compressed file “individual_gene_alignments.zip”, which were used to construct gene trees for multispecies coalescent analysis. The names of Sulcia strains were replaced by the name of their hosts in “Sulcia_pep.fasta”, “Sulcia_nucleotide.fasta” and the files in “individual_gene_alignments.zip”. 
In all the alignment files, gaps are indicated by “-”.
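A quick way to sanity-check alignment files like these is to read the FASTA records, confirm that every sequence has the same number of positions, and count the "-" gap characters. The sketch below assumes standard FASTA layout (a ">" header line followed by sequence lines); the helper names are hypothetical, and `split_16s_name` further assumes the accession is the final underscore-delimited token of a 16S sequence name, as in the example given in the description.

```python
def read_fasta(lines):
    """Return {sequence name: sequence} from an iterable of FASTA lines."""
    records = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def alignment_length(records):
    """Return the shared number of positions; raise if sequences differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences are not aligned: lengths {sorted(lengths)}")
    return lengths.pop()


def gap_fraction(seq):
    """Fraction of positions in a sequence that are gaps ('-')."""
    return seq.count("-") / len(seq) if seq else 0.0


def split_16s_name(name):
    """Split a 16S sequence name into (taxon path, accession); assumes the
    accession is the last underscore-delimited token."""
    taxon, _, accession = name.rpartition("_")
    return taxon, accession
```

With a file opened as `open("16S.fasta")`, passing the handle to `read_fasta` and then to `alignment_length` should, per the description, report 1455 positions for “16S.fasta” and 10526 for “Fla.fasta”.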
endosymbiont, “Candidatus Sulcia muelleri”, Auchenorrhyncha, coevolution
Kang, Jeon-Young; Michels, Alexander; Lyu, Fangzheng; Wang, Shaohua; Agbodo, Nelson; Freeman, Vincent L; Wang, Shaowen; Anand, Padmanabhan (2021): Spatial accessibility of COVID-19 healthcare resources in Illinois, USA. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6582453_V1
This dataset contains all the code, notebooks, and datasets used in the study conducted to measure the spatial accessibility of COVID-19 healthcare resources with a particular focus on Illinois, USA. Specifically, the dataset measures spatial access for people to hospitals and ICU beds in Illinois. The spatial accessibility is measured by the use of an enhanced two-step floating catchment area (E2FCA) method (Luo & Qi, 2009), which is an outcome of interactions between demands (i.e., # of potential patients; people) and supply (i.e., # of beds or physicians). The result is a map of spatial accessibility to hospital beds. It identifies which regions need more healthcare resources, such as the number of ICU beds and ventilators. This notebook serves as a guideline of which areas need more beds in the fight against COVID-19.
## What's Inside
A quick explanation of the components of the zip file:
* `COVID-19Acc.ipynb` is a notebook for calculating spatial accessibility and `COVID-19Acc.html` is an export of the notebook as HTML.
* `Data` contains all of the data necessary for calculations:
  * `Chicago_Network.graphml`/`Illinois_Network.graphml` are GraphML files of the OSMNX street networks for Chicago and Illinois respectively.
  * `GridFile/` has hexagonal gridfiles for Chicago and Illinois
  * `HospitalData/` has shapefiles for the hospitals in Chicago and Illinois
  * `IL_zip_covid19/COVIDZip.json` is a JSON file which contains COVID cases by zip code from IDPH
  * `PopData/` contains population data for Chicago and Illinois by census tract and zip code.
  * `Result/` is where we write out the results of the spatial accessibility measures
  * `SVI/` contains data about the Social Vulnerability Index (SVI)
* `img/` contains some images and HTML maps of the hospitals (the notebook generates the maps)
* `README.md` is the document you're currently reading!
* `requirements.txt` is a list of Python packages necessary to use the notebook (besides Jupyter/IPython). You can install the packages with `python3 -m pip install -r requirements.txt`
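The two steps of an enhanced two-step floating catchment area calculation can be sketched in a few lines. This is a simplified illustration, not the notebook's code: it assumes travel times are already available as a matrix (the notebook derives them from street networks), and the function names, the zone boundaries, and the zone weights in the usage note are illustrative assumptions.

```python
def zone_weight(minutes, zones):
    """Weight of the first travel-time zone that contains `minutes`, else 0.

    zones: list of (upper_bound_minutes, weight) sorted by upper bound.
    """
    for upper_bound, weight in zones:
        if minutes <= upper_bound:
            return weight
    return 0.0


def e2fca(supply, population, travel_time, zones):
    """Return one accessibility score per population location.

    supply:      beds (or physicians) at each supply site j
    population:  potential patients at each demand location i
    travel_time: travel_time[i][j] in minutes from location i to site j
    """
    n_pop, n_sup = len(population), len(supply)

    # Step 1: supply-to-demand ratio for each site, with demand weighted
    # by the travel-time zone in which each population location falls.
    ratios = []
    for j in range(n_sup):
        demand = sum(
            population[i] * zone_weight(travel_time[i][j], zones)
            for i in range(n_pop)
        )
        ratios.append(supply[j] / demand if demand > 0 else 0.0)

    # Step 2: for each population location, sum the weighted ratios of
    # all sites within reach.
    return [
        sum(ratios[j] * zone_weight(travel_time[i][j], zones) for j in range(n_sup))
        for i in range(n_pop)
    ]
```

As a usage example, `e2fca([10], [100, 100], [[5], [15]], [(10, 1.0), (20, 0.68), (30, 0.22)])` scores two population locations against a single 10-bed site; the nearer location receives the higher score, and the population-weighted scores sum back to the total supply.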
COVID-19; spatial accessibility; CyberGISX
Trivellone, Valeria; Wei, Wei; Filippin, Luisa; Dietrich, Christopher H (2021): FASTA file of the final sequence alignment used in the phylogenetic analyses of Phytoplasmas detected in leafhoppers. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2694515_V1
The PhytoplasmasRef_Trivellone_etal.fas FASTA file contains the original final sequence alignment used in the phylogenetic analyses of Trivellone et al. (Ecology and Evolution, in review). The 27 sequences (21 phytoplasma reference strains and 6 phytoplasma strains from the present study) were aligned using the Muscle algorithm as implemented in MEGA 7.0 with default settings. The final dataset contains 952 positions of the F2n/R2 fragment of the 16S rRNA gene. The data analyses are further described in the cited original paper.
Hemiptera; Cicadellidae; Mollicutes; Phytoplasma; biorepository
Mickalide, Harry (Avery); Kuehn, Seppe (2021): Data for: Higher-order interaction between species inhibits bacterial invasion of a phototroph-predator microbial community. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0946028_V2
These are abundance dynamics data and simulations for the paper "Higher-order interaction between species inhibits bacterial invasion of a phototroph-predator microbial community". In V2, the data were also converted for use in Python, in addition to MATLAB, and more information on how to work with the data was included in the Readme.
Microbial community; Higher order interaction; Invasion; Algae; Bacteria; Ciliate