Dataset Search

Illinois Data Bank Dataset Search Results

Results

published: 2020-09-27

Data used to construct Table 1 and Figs. 2 and 4 in Ainsworth & Long (2020) 30 Years of Free Air Carbon Dioxide Enrichment (FACE): What Have We Learned About Future Crop Productivity and the Potential for Adaptation? Global Change Biology

Long, Stephen (2020)

Data extracted from Text, Tables and Figures of publications in summarizing crop responses to Free-Air CO2 Elevation (FACE)

keywords: Free Air CO2 Elevation; FACE; wheat, rice, soybean, cassava;

published: 2018-07-25

Expert assessment of RobotReviewer data extraction performance on 10 articles

Scannapieco, Frank; Hoang, Linh; Schneider, Jodi (2018)

The PDF describes the process and data used for the heuristic user evaluation described in the related article “Evaluating an automatic data extraction tool based on the theory of diffusion of innovation” by Linh Hoang, Frank Scannapieco, Linh Cao, Yingjun Guan, Yi-Yun Cheng, and Jodi Schneider (under submission). Frank Scannapieco assessed RobotReviewer data extraction performance on ten articles in 2018-02. The articles are papers included in an update review: Sabharwal A., G.-F.I., Stellrecht E., Scannapieco F.A. Periodontal therapy to prevent the initiation and/or progression of common complex systemic diseases and conditions. An update. Periodontol 2000. In Press. The evaluation form was created in consultation with Linh Hoang and Jodi Schneider. To do the assessment, Frank Scannapieco entered PDFs for these ten articles into RobotReviewer and then filled in ten evaluation forms, based on the ten RobotReviewer automatic data extraction reports. Linh Hoang analyzed these ten evaluation forms and synthesized Frank Scannapieco’s comments to arrive at the evaluation results for the heuristic user evaluation.

keywords: RobotReviewer; systematic review automation; data extraction

published: 2020-02-01

Habitat Use of Spring Migrating Dabbling Ducks in the Wabash River Valley

Williams, Benjamin R.; Benson, Thomas J. (2020)

This data describes habitat use, availability, landscape level influences, and daily movement of dabbling ducks in the Wabash River Valley of southeastern Illinois and southwestern Indiana. It contains triangulated locations of individual ducks, associated habitat assignments of those locations, flood survey data to determine water availability, and randomly generated points to assess landscape level questions.

keywords: waterfowl; ducks; dabbling; mallard; teal; habitat

published: 2022-02-09

Articles With PubMed Identifiers

Kansara, Yogeshwar; Hoang, Khanh Linh (2022)

The data file contains a list of articles with PMID information, which were used in a project associated with the manuscript "Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews".

keywords: Cochrane reviews; Randomized controlled trials; RCT; Automation; Systematic reviews

published: 2016-05-16

Phylogenetic Analysis of the NRPS AmbE Condensation Domains for the L-2-amino-4-methoxy-trans-3-butenoic acid (AMB) Biosynthetic Pathway in Pseudomonas aeruginosa

Imker, Heidi (2016)

This dataset contains the protein sequences and trees used to compare Non-Ribosomal Peptide Synthetase (NRPS) condensation domains in the AMB gene cluster and was used to create figure S1 in Rojas et al. 2015. Instead of having to collect representative sequences independently, this set of condensation domain sequences may serve as a quick reference set for coarse classification of condensation domains.

keywords: NRPS; biosynthetic gene cluster; antimetabolite; Pseudomonas; oxyvinylglycine; secondary metabolite; thiotemplate; toxin

published: 2022-10-13

NEXUS file for Phylogenetic analysis of the Idiocerus genus group (Hemiptera: Cicadellidae)

Xue, Qingquan; Dietrich, Christopher H.; Zhang, Yalin (2022)

The text file contains the original DNA nucleotide sequence data used in the phylogenetic analyses of Xue et al. (in review), comprising the 13 protein-coding genes and 2 ribosomal gene subunits of the mitochondrial genome. The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The first six lines of the file identify the file as NEXUS, indicate that the file contains data for 30 taxa (species) and 13078 characters, indicate that the characters are DNA sequence, that gaps inserted into the DNA sequence alignment are indicated by a dash, and that missing data are indicated by a question mark. The positions of data partitions are indicated in the mrbayes block of commands for the phylogenetic program MrBayes (version 3.2.6) beginning near the end of the file. The mrbayes block also contains instructions for MrBayes on various non-default settings for that program. These are explained in the Methods section of the submitted manuscript. Two supplementary tables in the provided PDF file provide additional information on the species in the dataset, including the GenBank accession numbers for the sequence data (Table S1) and the DNA substitution models used for each of the individual mitochondrial genes and for different codon positions of the protein-coding genes used for analyses in the programs MrBayes and IQ-Tree (version 1.6.8) (Table S2). Full citations for references listed in Table S1 can be found by searching GenBank using the corresponding accession number. The supplemental tables will also be linked to the article upon publication at the journal website.
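The header fields described above (number of taxa, number of characters, data type, gap symbol, missing-data symbol) can be read with a few lines of Python. This is a minimal sketch, assuming the file uses the common `dimensions ntax=... nchar=...;` and `format datatype=... gap=- missing=?;` statements. The sample header below is a made-up stand-in built from the numbers in the description, not a copy of the dataset file.

```python
import re

# Hypothetical stand-in for the opening lines of a NEXUS file; the real
# dataset file may order or spell these statements differently.
SAMPLE_HEADER = """#NEXUS
begin data;
dimensions ntax=30 nchar=13078;
format datatype=dna gap=- missing=?;
matrix
"""


def parse_nexus_header(text):
    """Collect key=value settings from the DIMENSIONS and FORMAT statements."""
    settings = {}
    # Each statement runs from its keyword to the terminating semicolon.
    for statement in re.findall(r"(?is)\b(?:dimensions|format)\b(.*?);", text):
        for key, value in re.findall(r"(\w+)\s*=\s*(\S+)", statement):
            settings[key.lower()] = value
    return settings


header = parse_nexus_header(SAMPLE_HEADER)
print(header)
```

Dedicated phylogenetics libraries handle the full NEXUS grammar; a regular expression like this one is only suitable for a quick sanity check of the header.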

keywords: Hemiptera; phylogeny; mitochondrial genome; morphology; leafhopper

published: 2023-07-05

The Salt Controversy Systematic Review Reports and Primary Study Reports Network Dataset

Fu, Yuanxi; Hsiao, Tzu-Kun; Joshi, Manasi Ballal; Lischwe Mueller, Natalie (2023)

The salt controversy is the public health debate about whether a population-level salt reduction is beneficial. This dataset covers 82 publications--14 systematic review reports (SRRs) and 68 primary study reports (PSRs)--addressing the effect of sodium intake on cerebrocardiovascular disease or mortality. These present a snapshot of the status of the salt controversy as of September 2014 according to previous work by epidemiologists: The reports and their opinion classification (for, against, and inconclusive) were from Trinquart et al. (2016) (Trinquart, L., Johns, D. M., & Galea, S. (2016). Why do we think we know what we know? A metaknowledge analysis of the salt controversy. International Journal of Epidemiology, 45(1), 251–260. https://doi.org/10.1093/ije/dyv184 ), which collected 68 PSRs, 14 SRRs, 11 clinical guideline reports, and 176 comments, letters, or narrative reviews. Note that our dataset covers only the 68 PSRs and 14 SRRs from Trinquart et al. 2016, not the other types of publications, and it adds additional information noted below. This dataset can be used to construct the inclusion network and the co-author network of the 14 SRRs and 68 PSRs. A PSR is "included" in an SRR if it is considered in the SRR's evidence synthesis. Each included PSR is cited in the SRR, but not all references cited in an SRR are included in the evidence synthesis or PSRs. Based on which PSRs are included in which SRRs, we can construct the inclusion network. The inclusion network is a bipartite network with two types of nodes: one type represents SRRs, and the other represents PSRs. In an inclusion network, if an SRR includes a PSR, there is a directed edge from the SRR to the PSR. The attribute file (report_list.csv) includes attributes of the 82 reports, and the edge list file (inclusion_net_edges.csv) contains the edge list of the inclusion network. Notably, 11 PSRs have never been included in any SRR in the dataset. They are unused PSRs. 
If visualized with the inclusion network, they will appear as isolated nodes. We used a custom-made workflow (Fu, Y. (2022). Scopus author info tool (1.0.1) [Python]. https://github.com/infoqualitylab/Scopus_author_info_collection ) that uses the Scopus API and manual work to extract and disambiguate authorship information for the 82 reports. The author information file (salt_cont_author.csv) is the product of this workflow and can be used to compute the co-author network of the 82 reports. We also provide several other files in this dataset. We collected inclusion criteria (the criteria that make a PSR eligible to be included in an SRR) and recorded them in the file systematic_review_inclusion_criteria.csv. We provide a file (potential_inclusion_link.csv) recording whether a given PSR had been published as of the search date of a given SRR, which makes the PSR potentially eligible for inclusion in the SRR. We also provide a bibliography of the 82 publications (supplementary_reference_list.pdf). Lastly, we discovered minor discrepancies between the inclusion relationships identified by Trinquart et al. (2016) and by us. Therefore, we prepared an additional edge list (inclusion_net_edges_trinquart.csv) to preserve the inclusion relationships identified by Trinquart et al. (2016).
UPDATES IN THIS VERSION COMPARED TO V2 (Fu, Yuanxi; Hsiao, Tzu-Kun; Joshi, Manasi Ballal (2022): The Salt Controversy Systematic Review Reports and Primary Study Reports Network Dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6128763_V2)
- We added a new column "pub_date" to report_list.csv
- We corrected mistakes in supplementary_reference_list.pdf for report #28 and report #80. The author of report #28 is not Salisbury D but Khaw, K.-T., & Barrett-Connor, E. Report #80 was mistakenly mixed up with report #81.
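The inclusion network described in this entry (directed edges from SRRs to the PSRs they include, with unused PSRs appearing as isolated nodes) can be sketched in plain Python. The report IDs and edges below are invented for illustration; the column layout of report_list.csv and inclusion_net_edges.csv is not stated in this description, so no file is read here.

```python
from collections import defaultdict

# Hypothetical toy data: these IDs and edges are invented and do not come
# from report_list.csv or inclusion_net_edges.csv.
srrs = {"SRR1", "SRR2"}
psrs = {"PSR1", "PSR2", "PSR3", "PSR4"}
edges = [("SRR1", "PSR1"), ("SRR1", "PSR2"), ("SRR2", "PSR2")]


def is_bipartite(edge_list, sources, targets):
    """Check that every edge runs from an SRR node to a PSR node."""
    return all(s in sources and t in targets for s, t in edge_list)


def build_inclusion_network(edge_list):
    """Map each SRR to the set of PSRs it includes (directed SRR -> PSR)."""
    included_by = defaultdict(set)
    for srr, psr in edge_list:
        included_by[srr].add(psr)
    return dict(included_by)


def unused_psrs(all_psrs, edge_list):
    """Return PSRs with no incoming edge, i.e. isolated nodes in the network."""
    used = {psr for _, psr in edge_list}
    return sorted(all_psrs - used)


network = build_inclusion_network(edges)
isolated = unused_psrs(psrs, edges)
print(is_bipartite(edges, srrs, psrs))
print(network)
print(isolated)
```

With the real files, the same two functions would apply once the edge list has been loaded as (SRR, PSR) pairs; a graph library could replace the dictionaries if layout or visualization is needed.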

keywords: systematic reviews; evidence synthesis; network analysis; public health; salt controversy;

published: 2025-03-12

2D Acoustic Numerical Breast Phantoms and Measurement Data for Joint Reconstruction in PACT

Jeong, Gangwon; Villa, Umberto; Park, Seonyeong; Anastasio, Mark A. (2025)

References
- Jeong, Gangwon, Umberto Villa, and Mark A. Anastasio. "Revisiting the joint estimation of initial pressure and speed-of-sound distributions in photoacoustic computed tomography with consideration of canonical object constraints." Photoacoustics (2025): 100700.
- Park, Seonyeong, et al. "Stochastic three-dimensional numerical phantoms to enable computational studies in quantitative optoacoustic computed tomography of breast cancer." Journal of biomedical optics 28.6 (2023): 066002-066002.
Overview
- This dataset includes 80 two-dimensional slices extracted from 3D numerical breast phantoms (NBPs) for photoacoustic computed tomography (PACT) studies. The anatomical structures of these NBPs were obtained using tools from the Virtual Imaging Clinical Trial for Regulatory Evaluation (VICTRE) project. The methods used to modify and extend the VICTRE NBPs for use in PACT studies are described in the publication cited above.
- The NBPs in this dataset represent the following four ACR BI-RADS breast composition categories:
  > Type A - The breast is almost entirely fatty
  > Type B - There are scattered areas of fibroglandular density in the breast
  > Type C - The breast is heterogeneously dense
  > Type D - The breast is extremely dense
- Each 2D slice is taken from a different 3D NBP, ensuring that no more than one slice comes from any single phantom.
File Name Format
- Each data file is stored as a .mat file. The filenames follow this format: {type}{subject_id}.mat where {type} indicates the breast type (A, B, C, or D), and {subject_id} is a unique identifier assigned to each sample. For example, in the filename D510022534.mat, "D" represents the breast type, and "510022534" is the sample ID.
File Contents
- Each file contains the following variables:
  > "type": Breast type
  > "p0": Initial pressure distribution [Pa]
  > "sos": Speed-of-sound map [mm/μs]
  > "att": Acoustic attenuation (power-law prefactor) map [dB/ MHzʸ mm]
  > "y": power-law exponent
  > "pressure_lossless": Simulated noiseless pressure data obtained by numerically solving the first-order acoustic wave equation using the k-space pseudospectral method, under the assumption of a lossless medium (corresponding to Studies I, II, and III).
  > "pressure_lossy": Simulated noiseless pressure data obtained by numerically solving the first-order acoustic wave equation using the k-space pseudospectral method, incorporating a power-law acoustic absorption model to account for medium losses (corresponding to Study IV).
  * The pressure data were simulated using a ring-array transducer that consists of 512 receiving elements uniformly distributed along a ring with a radius of 72 mm.
  * Note: These pressure data are noiseless simulations. In Studies II–IV of the referenced paper, additive i.i.d. Gaussian noise was added to the measurement data. Users may add similar noise to the provided data as needed for their own studies.
- In Study I, all spatial maps (e.g., sos) have dimensions of 512 × 512 pixels, with a pixel size of 0.32 mm × 0.32 mm.
- In Study II and Study III, all spatial maps (sos) have dimensions of 1024 × 1024 pixels, with a pixel size of 0.16 mm × 0.16 mm.
- In Study IV, both the sos and att maps have dimensions of 1024 × 1024 pixels, with a pixel size of 0.16 mm × 0.16 mm.
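The naming scheme and grid sizes in this entry lend themselves to a short helper. The sketch below splits a filename into breast type and sample ID and computes the side length of a square map from its pixel count and pixel size. It assumes sample IDs are digits only, as in the one example given; the function names are illustrative and are not part of the dataset.

```python
import re

# Pattern for the documented "{type}{subject_id}.mat" naming scheme.
# Assumption: subject IDs consist of digits only, as in the example filename.
FILENAME_PATTERN = re.compile(r"^([ABCD])(\d+)\.mat$")


def parse_phantom_filename(filename):
    """Split a phantom filename into (breast_type, subject_id)."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    breast_type, subject_id = match.groups()
    return breast_type, subject_id


def field_of_view_mm(n_pixels, pixel_size_mm):
    """Side length of a square map in millimetres."""
    return n_pixels * pixel_size_mm


parsed = parse_phantom_filename("D510022534.mat")
fov_study_1 = field_of_view_mm(512, 0.32)
fov_study_2 = field_of_view_mm(1024, 0.16)
print(parsed, fov_study_1, fov_study_2)
```

Under these numbers, the 512-pixel and 1024-pixel grids span the same physical side length, so the two settings appear to differ in resolution rather than in extent.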

keywords: Medical imaging; Photoacoustic computed tomography; Numerical phantom; Joint reconstruction

published: 2019-10-16

TREC document topic annotations

Sherman, Garrick (2019)

Human annotations of randomly selected judged documents from the AP 88-89, Robust 2004, WT10g, and GOV2 TREC collections. Seven annotators were asked to read documents in their entirety and then select up to ten terms they felt best represented the main topic(s) of the document. Terms were chosen from among a set sampled from the document in question and from related documents.

keywords: TREC; information retrieval; document topicality; document description

published: 2022-03-09

Codes for the analysis of an eco-immunological disease-transmission mathematical model

Rapti, Zoi; Rivera Quinones, Vanessa; Stewart Merrill, Tara (2022)

MATLAB files for the analysis of an ODE model for disease transmission. The codes may be used to find equilibrium points, study transient dynamics, evaluate the basic reproductive number (R0), and simulate the model when parameters depend on the independent variables. In addition, the codes may be used to perform local sensitivity analysis of R0 on the model parameters.

published: 2025-10-10

Sorghum bicolor Core Metabolism Model

Clark, Teresa J.; Schwender, Jorg (2025)

Upregulation of triacylglycerols (TAGs) in vegetative plant tissues such as leaves has the potential to drastically increase the energy density and biomass yield of bioenergy crops. In this context, constraint-based analysis has the promise to improve metabolic engineering strategies. Here we present a core metabolism model for the C4 biomass crop Sorghum bicolor (iTJC1414) along with a minimal model for photosynthetic CO2 assimilation, sucrose and TAG biosynthesis in C3 plants. Extending iTJC1414 to a four-cell diel model we simulate C4 photosynthesis in mature leaves with the principal photo-assimilatory product being replaced by TAG produced at different levels. Independent of specific pathways and per unit carbon assimilated, energy content and biosynthetic demands in reducing equivalents are about 1.3 to 1.4 times higher for TAG than for sucrose. For plant generic pathways, ATP- and NADPH-demands per CO2 assimilated are higher by 1.3- and 1.5-fold, respectively. If the photosynthetic supply in ATP and NADPH in iTJC1414 is adjusted to be balanced for sucrose as the sole photo-assimilatory product, overproduction of TAG is predicted to cause a substantial surplus in photosynthetic ATP. This means that if TAG synthesis was the sole photo-assimilatory process, there could be an energy imbalance that might impede the process. Adjusting iTJC1414 to a photo-assimilatory rate that approximates field conditions, we predict possible daily rates of TAG accumulation, dependent on varying ratios of carbon partitioning between exported assimilates and accumulated oil droplets (TAG, oleosin) and in dependence of activation of futile cycles of TAG synthesis and degradation. We find that, based on the capacity of leaves for photosynthetic synthesis of exported assimilates, mature leaves should be able to reach a 20% level of TAG per dry weight within one month if only 5% of the photosynthetic net assimilation can be allocated into oil droplets. 
From this we conclude that high TAG levels should be achievable if TAG synthesis is induced only during a final phase of the plant life cycle.

keywords: Feedstock Production; Modeling

published: 2021-11-23

Data for: Evaluating the impacts of cloud processing on resuspended aerosol particles after cloud evaporation using a particle-resolved model

Riemer, Nicole; Yao, Yu; Dawson, Matthew; Dabdub, Donald (2021)

This dataset contains simulation results from PartMC-MOSAIC-CAPRAM used in the article “Evaluating the impacts of cloud processing on resuspended aerosol particles after cloud evaporation using a particle-resolved model”. In this V2, there are eight folders: one for the urban plume simulation, which provides the initial particle population for cloud processing; four for the four simulated cloud cycles; and two for the coagulation cases. The urban plume folder contains 25 NetCDF files, output hourly from PartMC-MOSAIC simulations, with the gas and particle information. Each of the four cloud cycle folders contains 25 subdirectories holding the cloud processing results for the aerosol populations from the urban plume environment. Each subdirectory contains 31 NetCDF files, output every minute from PartMC-MOSAIC-CAPRAM simulations, with aerosol and gas information after aqueous chemistry. The two coagulation folders cover the cases that consider Brownian coagulation and sedimentation coalescence. Each contains 93 NetCDF files, produced by repeating the 30-minute simulations three times to account for the randomness of coagulation. The low polluted case folder includes the simulated cloud processing results for 25 urban plume cases with lower aerosol number concentration. This dataset was used to investigate the effects of cloud processing on aerosol mixing state and CCN properties.

keywords: cloud process; coagulation; aqueous chemistry; aerosol mixing state; CCN

published: 2020-10-20

Airyscan confocal superresolution images of fossil and modern pollen of Amherstieae (Fabaceae)

Romero, Ingrid; Urban, Michael A.; Punyasena, Surangi (2020)

This dataset includes a total of 501 images of 42 fossil specimens of Striatopollis and 459 specimens of 45 extant species of the tribe Amherstieae-Fabaceae. These images were taken using Airyscan confocal superresolution microscopy at 630X magnification (63x/NA 1.4 oil DIC). The images are in the CZI file format. They can be opened using Zeiss proprietary software (Zen, Zen lite) or in ImageJ. More information on how to open CZI files can be found here: [https://www.zeiss.com/microscopy/us/products/microscope-software/zen/czi.html#microscope---image-data].

keywords: Striatopollis catatumbus; superresolution microscopy; Cenozoic; tropics; Zeiss; CZI; striate pollen.

published: 2021-02-18

Data for Integrating CyberGIS and Urban Sensing for Reproducible Streaming Analytics

Wang, Shaowen; Lyu, Fangzheng; Wang, Shaohua; Catlet, Charles; Padmanabhan, Anand; Soltani, Kiumars (2021)

Increasingly pervasive location-aware sensors interconnected with rapidly advancing wireless network services are motivating the development of near-real-time urban analytics. This development has revealed both tremendous challenges and opportunities for scientific innovation and discovery. However, state-of-the-art urban discovery and innovation are not well equipped to resolve the challenges of such analytics, which in turn limits new research questions from being asked and answered. Specifically, commonly used urban analytics capabilities are typically designed to handle, process, and analyze static datasets that can be treated as map layers and are consequently ill-equipped in (a) resolving the volume and velocity of urban big data; (b) meeting the computing requirements for processing, analyzing, and visualizing these datasets; and (c) providing concurrent online access to such analytics. To tackle these challenges, we have developed a novel cyberGIS framework that includes computationally reproducible approaches to streaming urban analytics. This framework is based on CyberGIS-Jupyter, through integration of cyberGIS and real-time urban sensing, for achieving capabilities that have previously been unavailable toward helping cities solve challenging urban informatics problems. The files included in this dataset function as follows: 1) Spatial_interpolation.ipynb is a Python-based Jupyter notebook that enables users to conduct spatial interpolation with AoT data; 2) Urban_Informatics.ipynb is a Jupyter notebook that helps to explore the AoT dataset; 3) chicago-complete.weekly.2019-09-30-to-2019-10-06.tar includes all the high-frequency urban sensing data collected by AoT sensors in Chicago, US, from September 30 to October 6, 2019; 4) sensors.csv is a processed dataset including information about the temperature in Chicago, and it is used in Spatial_interpolation.ipynb.

keywords: CyberGIS; Urban informatics; Array of Things

published: 2025-01-15

Seasonal Variation in Responses of Largemouth Bass Caught During Live-Release Angling Tournaments

Suski, Cory; Hay, Allison (2025)

Data was generated from acoustic transmitters implanted in tournament caught and non-angled control largemouth bass across multiple seasons. This data was used to quantify post-release movement, behavior, and mortality in response to angling tournaments at different times of year and varying water temperatures.

published: 2019-05-20

Tetris artificial spin ice kinetics

Lao, Yuyang; Schiffer, Peter (2019)

These are the experimental data for tetris artificial spin ice. The islands are made of Permalloy, with dimensions of 170 nm by 470 nm by 2.5 nm. The systems are measured at a temperature, around room temperature, at which the islands are fluctuating. The data are recorded as photoemission electron microscopy intensity. More details about the dataset can be found in the files Note.txt and Tetris_data_list.xlsx. Note: two files, named bl11_teris600_033 and bl11_tetris600_2_135, are not recorded in the Excel sheet because they were corrupted during the measurement. Any data not recorded in the Excel sheet are either corrupted or of low quality. For files *_028 to *_049, tetris is spelled with the “t”, while in the raw data folder it is spelled without the “t”. This is a typo. Throughout the dataset, tetris and teris are supposed to have the same meaning.
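Because "tetris" and "teris" are meant to be interchangeable in the file names, a small normalization step can help when matching files against the spreadsheet. The sketch below is one possible approach, not part of the dataset; the helper name is hypothetical and the example names are the two mentioned in the note above.

```python
import re


def normalize_name(name):
    """Rewrite the 'teris' misspelling to 'tetris'; correct names are unchanged."""
    return re.sub(r"tet?ris", "tetris", name)


names = ["bl11_teris600_033", "bl11_tetris600_2_135"]
normalized = [normalize_name(n) for n in names]
print(normalized)
```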

keywords: artificial spin ice

published: 2020-10-15

BEPAM Model Code and CABBI Simulation Results for "Assessing the Additional Carbon Savings with Biofuel"

Khanna, Madhu; Wang, Weiwei; Wang, Michael (2020)

This dataset consists of various input data that are used in the GAMS model. All the data are in .inc format, which can be read within GAMS or Notepad. Main data sources include: acreage data (acre), crop budget data ($/acre), crop yield data (e.g. bushel/acre), and soil carbon sequestration data (KgCO2/ha/yr). Model details can be found in "Assessing the Additional Carbon Savings with Biofuel" and the GAMS model package.
## File Description
(1) GAMS Model.zip: This includes all the input files and scripts for running the model
(2) Table*.csv: These files include the data from the tables in the manuscript
(3) Figure2_3_4.csv: This contains the data used to create the figures in the manuscript
(4) BaselineResults.csv: This includes a summary of the model results.
(5) SensitivityResults_*.csv: Model results from the various sensitivity analyses performed
(6) LUC_emission.csv: land use change emissions by crop reporting district for changes of pasturelands to annual crops.

keywords: Biogenic carbon intensity; Corn ethanol; Economic model; Dynamic optimization; Anticipated baseline approach; Life cycle carbon intensity

published: 2022-02-11

Error Analysis

Hoang, Khanh Linh; Schneider, Jodi; Kansara, Yogeshwar (2022)

The data contain a list of articles given low scores by the RCT Tagger and an error analysis of them, which were used in a project associated with the manuscript "Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews". The change made in this V3 is that the data are divided into two parts:
- Error Analysis of 44 Low Scoring Articles with MEDLINE RCT Publication Type.
- Error Analysis of 244 Low Scoring Articles without MEDLINE RCT Publication Type.

keywords: Cochrane reviews; automation; randomized controlled trial; RCT; systematic reviews

published: 2025-07-30

Data for 'Linking climate sensitivity, trait combinations, and listing status to advance freshwater invertebrate conservation'

Skorupa, A. J.; Bried, J. T. (2025)

This dataset includes three data files for linking species' climate sensitivity, trait combinations, and listing status. It contains species occurrence data within Hydrologic Unit Code 12 (HUC12) watersheds, along with trait information and Rarity and Climate Sensitivity (RCS) index scores for lotic caddisflies, stoneflies, mussels, dragonflies, and crayfish across all Midwest Climate Adaptation Science Center states: Minnesota, Iowa, Missouri, Wisconsin, Illinois, Indiana, Michigan, and Ohio. For mussels, the geographic scope is expanded to include all Midwest Regional Species of Greatest Conservation Need (RSGCN) states—North Dakota, South Dakota, Nebraska, Kansas, and Kentucky. However, occurrence data for mussels is not included due to data-sharing agreements. Metadata are included with each data file. Please refer to the associated manuscript for original data sources, trait references, and details on the RCS index calculation.

keywords: climate sensitivity; conservation status; traits; aquatic invertebrates; Midwest

published: 2019-08-05

Data for Phylogenomics of Auchenorrhyncha (Insecta: Hemiptera) using Transcriptomes: Examining Controversial Relationships via Degeneracy Coding and Interrogation of Gene Conflict

Skinner, Rachel; Dietrich, Christopher; Walden, Kimberly; Gordon, Eric; Sweet, Andrew; Podsiadlowski, Lars; Petersen, Malte; Simon, Chris; Takiya, Daniela; Johnson, Kevin (2019)

The data in this directory correspond to: Skinner, R.K., Dietrich, C.H., Walden, K.K.O., Gordon, E., Sweet, A.D., Podsiadlowski, L., Petersen, M., Simon, C., Takiya, D.M., and Johnson, K.P. Phylogenomics of Auchenorrhyncha (Insecta: Hemiptera) using Transcriptomes: Examining Controversial Relationships via Degeneracy Coding and Interrogation of Gene Conflict. Systematic Entomology.
Correspondence should be directed to: Rachel K. Skinner, rskinn2@illinois.edu
If you use these data, please cite our paper in Systematic Entomology.
The following files can be found in this dataset:
- Amino_acid_concatenated_alignment.phy: the amino acid alignment used in this analysis in phylip format.
- Amino_acid_raxml_partitions.txt (for reference only): the partitions for the amino acid alignment, but a partitioned amino acid analysis was not performed in this study.
- Amino_acid_concatenated_tree.newick: the best maximum likelihood tree with bootstrap values in newick format.
- ASTRAL_input_gene_trees.tre: the concatenated gene tree input file for ASTRAL
- README_pie_charts.md: explains the scripts and data needed to recreate the pie charts figure from our paper. It corresponds to the following files:
- ASTRAL_species_tree_EN_only.newick: the species tree with only effective number (EN) annotation
- ASTRAL_species_tree_pp1_only.newick: the species tree with only the posterior probability 1 (main topology) annotation
- ASTRAL_species_tree_q1_only.newick: the species tree with only the quartet scores for the main topology (q1)
- ASTRAL_species_tree_q2_only.newick: the species tree with only the quartet scores for the first alternative topology (q2)
- ASTRAL_species_tree_q3_only.newick: the species tree with only the quartet scores for the second alternative topology (q3)
- print_node_key_files.py: script needed to create the following files:
  > node_keys.key: text file with node IDs and topologies
  > complete_q_scores.key: text file with node IDs and multiplied q scores
  > EN_node_vals.key: text file with node IDs and EN values
- create_pie_charts_tree.py: script needed to visualize the tree with pie charts, pp1, and EN values plotted at nodes
- ASTRAL_species_tree_full_annotation.newick: the species tree with full annotation from the ASTRAL analysis. NOTE: It may be more useful to examine individual value files if you want to visualize the tree, e.g., in figtree, since the full annotations are extensive and can make viewing difficult.
- Complete_NT_concatenated_alignment.phy: the nucleotide alignment that includes unmodified third codon positions. The alignment is in phylip format.
- Complete_NT_raxml_partitions.txt: the raxml-style partition file of the nucleotide partitions
- Complete_NT_concatenated_tree.newick: the best maximum likelihood tree from the concatenated complete NT analysis with bootstrap values in newick format
- Complete_NT_partitioned_tree.newick: the best maximum likelihood tree from the partitioned complete NT analysis with bootstrap values in newick format
- Degeneracy_coded_nt_concatenated_alignment.phy: the degeneracy coded nucleotide alignment in phylip format
- Degeneracy_coded_nt_raxml_partitions.txt: the raxml-style partition file for the degeneracy coded nucleotide alignment
- Degeneracy_coded_nt_concatenated_tree.newick: the best maximum likelihood tree from the degeneracy-coded concatenated analysis with bootstrap values in newick format
- Degeneracy_coded_nt_partitioned_tree.newick: the best maximum likelihood tree from the degeneracy-coded partitioned analysis with bootstrap values in newick format
- count_ingroup_taxa.py: script that counts the number of ingroup and/or outgroup taxa present in an alignment

keywords: Auchenorrhyncha; Hemiptera; alignment; trees

published: 2020-01-20

Data for: Revising the Ozone Depletion Potentials for Short-Lived Chemicals such as CF3I and CH3I

Zhang, Jun; Wuebbles, Donald; Kinnison, Douglas; Saiz López, Alfonso (2020)

These datasets provide the basis of our analysis in the paper "Revising the Ozone Depletion Potentials for Short-Lived Chemicals such as CF3I and CH3I". All datasets here are model output (CAM4-chem). All the simulations (background and perturbation) were run to steady state, and only the last year's outputs, which were used in the analysis, are archived here.

keywords: Illinois Data Bank; NetCDF; Ozone Depletion Potential; CF3I and CH3I

published: 2025-10-10

Data for Cloning and Characterization of a Panel of Mitochondrial Targeting Sequences for Compartmentalization Engineering in Saccharomyces cerevisiae

Dong, Chang; Shi, Zhuwei; Huang, Lei; Zhao, Huimin; Xu, Zhinan; Lian, Jiazhang (2025)

The mitochondrion is generally considered the most promising subcellular organelle for compartmentalization engineering. Much progress has been made in reconstituting whole metabolic pathways in the mitochondria of yeast to harness the precursor pools (i.e., pyruvate and acetyl-CoA), bypass competing pathways, and minimize transportation limitations. However, only a few mitochondrial targeting sequences (MTSs) have been characterized (e.g., the MTS of COX4), limiting the application of compartmentalization engineering for multigene biosynthetic pathways in the mitochondria of yeast. In the present study, based on the mitochondrial proteome, a total of 20 MTSs were cloned, and the efficiency of these MTSs in targeting heterologous proteins, including the Escherichia coli FabI and enhanced green fluorescent protein (EGFP), into the mitochondria was evaluated by growth complementation and confocal microscopy. After systematic characterization, six of the well-performing MTSs were chosen for the colocalization of complete biosynthetic pathways into the mitochondria. As proof of concept, the full α-santalene biosynthetic pathway, consisting of 10 expression cassettes capable of converting acetyl-CoA to α-santalene, was compartmentalized into the mitochondria, leading to a 3.7-fold improvement in the production of α-santalene. The newly characterized MTSs should contribute to the expanded metabolic engineering and synthetic biology toolbox for yeast mitochondrial compartmentalization engineering.

keywords: Conversion; Metabolic Engineering

published: 2020-03-03

Network of First and Second-generation citations to Matsuyama 2005 from Google Scholar and Web of Science

Schneider, Jodi; Ye, Di (2020)

This second version (V2) provides additional data cleaning compared to V1, additional data collection (mainly to include data from 2019), and more metadata for nodes. Please see NETWORKv2README.txt for more detail.

keywords: citations; retraction; network analysis; Web of Science; Google Scholar; indirect citation

published: 2020-11-05

Data from Species Distribution, Phylogenetic Structure, and Functional Roles of Detritius Inhabiting Fungi Across Contrasting Aquatic Environments.

Miller, Andrew; Raudabaugh, Daniel (2020)

This version 2 dataset contains 34 files in total, including one (1) additional file called "Culture-dependent Isolate table with taxonomic determination and sequence data.csv". The remaining 33 files are identical to version 1. Information about the new file and its variables follows.
Culture-dependent Isolate table with taxonomic determination and sequence data.csv: Culture table with assigned taxonomy from NCBI. A single-direction sequence for each isolate is included if one could be obtained. Sequences are derived from ITS1F-ITS4 PCR amplicons, with Sanger sequencing in one direction using ITS5. The file contains 20 variables, explained below:
IsolateNumber: unique number identifying each isolate cultured
Time: season in which the sample was collected
Location: the specific name of the location
Habitat: type of habitat, either stream or peatland
State: state in the USA in which the specific location is located
Incubation_pH ID: pH of the medium during isolation of fungal cultures
Genus: phylogenetic genus of the fungal isolates (determined by sequence similarity)
Sequence_quality: base call quality of the entire sequence used for blast analysis, if known
%_coverage: sequence coverage reported from GenBank
%_ID: sequence similarity reported from GenBank
Life_style: ecological life style, if known
Phylum: phylogenetic phylum as indicated by Index Fungorum
Subphylum: phylogenetic subphylum as indicated by Index Fungorum
Class: phylogenetic class as indicated by Index Fungorum
Subclass: phylogenetic subclass as indicated by Index Fungorum
Order: phylogenetic order as indicated by Index Fungorum
Family: phylogenetic family as indicated by Index Fungorum
ITS5_Sequence: single-direction sequence used for the sequence similarity match using blastn. Primer ITS5
Fasta: sequence with nomenclature in fasta format for easy cut and paste into phylogenetic software
Note: blank cells mean that no data are available or that the value is unknown.
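
As a rough illustration of how the variables described above might be used, the following Python sketch tallies isolates by a chosen column and skips blank cells, which the note above says indicate missing or unknown data. The column names Habitat and Genus come from the description. The sample rows, the helper name count_by_column, and the assumption that the file is comma-delimited with a header row are hypothetical.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text, column):
    """Count non-blank values in one column of a CSV with a header row.

    Blank cells are skipped, since they mean "no data" or "unknown".
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:
            counts[value] += 1
    return counts


# Made-up rows for illustration only; not taken from the dataset.
SAMPLE = """IsolateNumber,Habitat,Genus
1,stream,Penicillium
2,peatland,
3,stream,Trichoderma
"""

habitat_counts = count_by_column(SAMPLE, "Habitat")
genus_counts = count_by_column(SAMPLE, "Genus")
print(habitat_counts)
print(genus_counts)
```

For the real file, the text would be read from disk instead of the inline sample; the full header should be checked first, because only the names listed in the description are known here.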

keywords: ITS1 forward reads; Illumina; peatlands; streams; bogs; fens

published: 2019-07-04

Control of bacterial infections via antibiotic-induced proviruses

Rapti, Zoi (2019)

Software (Matlab .m files) for the article: Lying in Wait: Modeling the Control of Bacterial Infections via Antibiotic-Induced Proviruses. The files can be used to reproduce the analysis and figures in the article.

keywords: Matlab codes; antibiotic-induced dynamics