Illinois Data Bank Dataset Search Results
Results
published:
2025-03-12
Jeong, Gangwon; Villa, Umberto; Park, Seonyeong; Anastasio, Mark A.
(2025)
References
- Jeong, Gangwon, Umberto Villa, and Mark A. Anastasio. "Revisiting the joint estimation of initial pressure and speed-of-sound distributions in photoacoustic computed tomography with consideration of canonical object constraints." Photoacoustics (2025): 100700.
- Park, Seonyeong, et al. "Stochastic three-dimensional numerical phantoms to enable computational studies in quantitative optoacoustic computed tomography of breast cancer." Journal of biomedical optics 28.6 (2023): 066002-066002.
Overview
- This dataset includes 80 two-dimensional slices extracted from 3D numerical breast phantoms (NBPs) for photoacoustic computed tomography (PACT) studies. The anatomical structures of these NBPs were obtained using tools from the Virtual Imaging Clinical Trial for Regulatory Evaluation (VICTRE) project. The methods used to modify and extend the VICTRE NBPs for use in PACT studies are described in the publication cited above.
- The NBPs in this dataset represent the following four ACR BI-RADS breast composition categories:
> Type A - The breast is almost entirely fatty
> Type B - There are scattered areas of fibroglandular density in the breast
> Type C - The breast is heterogeneously dense
> Type D - The breast is extremely dense
- Each 2D slice is taken from a different 3D NBP, ensuring that no more than one slice comes from any single phantom.
File Name Format
- Each data file is stored as a .mat file. The filenames follow this format: {type}{subject_id}.mat, where {type} indicates the breast type (A, B, C, or D) and {subject_id} is a unique identifier assigned to each sample. For example, in the filename D510022534.mat, "D" represents the breast type, and "510022534" is the sample ID.
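The naming convention above can be checked programmatically. The following is a minimal sketch, assuming that subject IDs are purely numeric (as in the example D510022534.mat); the function name parse_phantom_filename is illustrative and not part of the dataset.

```python
import re
from pathlib import Path

# Pattern for "{type}{subject_id}.mat": one letter A-D followed by digits.
# Assumption: subject IDs contain digits only, as in D510022534.mat.
FILENAME_PATTERN = re.compile(r"^([ABCD])(\d+)\.mat$")


def parse_phantom_filename(filename):
    """Return (breast_type, subject_id) for a file named {type}{subject_id}.mat."""
    match = FILENAME_PATTERN.match(Path(filename).name)
    if match is None:
        raise ValueError(f"Unexpected filename: {filename!r}")
    return match.group(1), match.group(2)


print(parse_phantom_filename("D510022534.mat"))  # ('D', '510022534')
```

The subject ID is kept as a string so that any leading zeros would be preserved.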
File Contents
- Each file contains the following variables:
> "type": Breast type
> "p0": Initial pressure distribution [Pa]
> "sos": Speed-of-sound map [mm/μs]
> "att": Acoustic attenuation (power-law prefactor) map [dB/(MHzʸ mm)]
> "y": power-law exponent
> "pressure_lossless": Simulated noiseless pressure data obtained by numerically solving the first-order acoustic wave equation using the k-space pseudospectral method, under the assumption of a lossless medium (corresponding to Studies I, II, and III).
> "pressure_lossy": Simulated noiseless pressure data obtained by numerically solving the first-order acoustic wave equation using the k-space pseudospectral method, incorporating a power-law acoustic absorption model to account for medium losses (corresponding to Study IV).
* The pressure data were simulated using a ring-array transducer that consists of 512 receiving elements uniformly distributed along a ring with a radius of 72 mm.
* Note: These pressure data are noiseless simulations. In Studies II–IV of the referenced paper, additive i.i.d. Gaussian noise was added to the measurement data. Users may add similar noise to the provided data as needed for their own studies.
- In Study I, all spatial maps (e.g., sos) have dimensions of 512 × 512 pixels, with a pixel size of 0.32 mm × 0.32 mm.
- In Study II and Study III, all spatial maps (sos) have dimensions of 1024 × 1024 pixels, with a pixel size of 0.16 mm × 0.16 mm.
- In Study IV, both the sos and att maps have dimensions of 1024 × 1024 pixels, with a pixel size of 0.16 mm × 0.16 mm.
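As a quick consistency check on the geometry described above, the sketch below computes the physical extent of both grids and the positions of the 512 ring-array elements. The element count, ring radius, grid sizes, and pixel sizes come from the description; the coordinate convention (origin at the ring center, first element at angle 0, counter-clockwise ordering) is an assumption and may differ from the one used in the referenced paper.

```python
import math

NUM_ELEMENTS = 512      # receiving elements on the ring (from the description)
RING_RADIUS_MM = 72.0   # ring radius in mm (from the description)


def field_of_view_mm(num_pixels, pixel_size_mm):
    """Side length of a square grid in mm."""
    return num_pixels * pixel_size_mm


def ring_element_positions(num_elements=NUM_ELEMENTS, radius_mm=RING_RADIUS_MM):
    """(x, y) element coordinates in mm, uniformly spaced on a ring.

    Assumption: origin at the ring center, first element at angle 0,
    elements ordered counter-clockwise.
    """
    step = 2.0 * math.pi / num_elements
    return [
        (radius_mm * math.cos(k * step), radius_mm * math.sin(k * step))
        for k in range(num_elements)
    ]


fov_coarse = field_of_view_mm(512, 0.32)   # Study I grid
fov_fine = field_of_view_mm(1024, 0.16)    # Studies II-IV grid
positions = ring_element_positions()

print(fov_coarse, fov_fine)
print(len(positions), positions[0])
```

Both grids span about 163.84 mm per side, which is larger than the 144 mm ring diameter, so the ring fits inside either grid if the grid is centered on the ring.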
keywords:
Medical imaging; Photoacoustic computed tomography; Numerical phantom; Joint reconstruction
published:
2016-05-16
This dataset contains the protein sequences and trees used to compare Non-Ribosomal Peptide Synthetase (NRPS) condensation domains in the AMB gene cluster and was used to create figure S1 in Rojas et al. 2015. Instead of having to collect representative sequences independently, this set of condensation domain sequences may serve as a quick reference set for coarse classification of condensation domains.
keywords:
NRPS; biosynthetic gene cluster; antimetabolite; Pseudomonas; oxyvinylglycine; secondary metabolite; thiotemplate; toxin
published:
2023-01-12
Mischo, William; Schlembach, Mary C.; Cabada, Elisandro
(2023)
This dataset was developed as part of a study that examined the correlational relationships between local journal authorship, local and external citation counts, full-text downloads, link-resolver clicks, and four global journal impact factor indices within an all-disciplines journal collection of 12,200 titles and six subject subsets at the University of Illinois at Urbana-Champaign (UIUC) Library. While earlier investigations of the relationships between usage (downloads) and citation metrics have been inconclusive, this study shows strong correlations in the all-disciplines set and most subject subsets. The normalized Eigenfactor was the only global impact factor index that correlated highly with local journal metrics. Some of the identified disciplinary variances among the six subject subsets may be explained by the journal publication aspirations of UIUC researchers. The correlations between authorship and local citations in the six specific subject subsets closely match national department or program rankings.
All the raw data used in this analysis are provided in the form of relational database tables with multiple columns. The tables can be opened using MS Access. Descriptions of the variables can be viewed through "Design View" (right-click the selected table and choose "Design View"). The two PDF files provide an overview of the tables included in each MDB file.
In addition, the processing scripts and Pearson correlation code are available at https://doi.org/10.13012/B2IDB-0931140_V1.
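For readers unfamiliar with the statistic, the following is a minimal sketch of a Pearson correlation between two per-journal metrics. It is not the authors' code (which is available at the DOI above), and the sample values are hypothetical, chosen only to illustrate the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-journal counts (not taken from the dataset).
downloads = [120, 340, 560, 80, 910]
local_citations = [10, 25, 50, 5, 70]

print(round(pearson_r(downloads, local_citations), 3))
```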
keywords:
Usage and local citation relationships; publication; citation and usage metrics; publication; citation and usage correlation analysis; Pearson correlation analysis
published:
2023-07-05
Fu, Yuanxi; Hsiao, Tzu-Kun; Joshi, Manasi Ballal; Lischwe Mueller, Natalie
(2023)
The salt controversy is the public health debate about whether a population-level salt reduction is beneficial. This dataset covers 82 publications--14 systematic review reports (SRRs) and 68 primary study reports (PSRs)--addressing the effect of sodium intake on cerebrocardiovascular disease or mortality. These present a snapshot of the status of the salt controversy as of September 2014 according to previous work by epidemiologists: The reports and their opinion classification (for, against, and inconclusive) were from Trinquart et al. (2016) (Trinquart, L., Johns, D. M., & Galea, S. (2016). Why do we think we know what we know? A metaknowledge analysis of the salt controversy. International Journal of Epidemiology, 45(1), 251–260. https://doi.org/10.1093/ije/dyv184 ), which collected 68 PSRs, 14 SRRs, 11 clinical guideline reports, and 176 comments, letters, or narrative reviews. Note that our dataset covers only the 68 PSRs and 14 SRRs from Trinquart et al. 2016, not the other types of publications, and it adds additional information noted below.
This dataset can be used to construct the inclusion network and the co-author network of the 14 SRRs and 68 PSRs. A PSR is "included" in an SRR if it is considered in the SRR's evidence synthesis. Each included PSR is cited in the SRR, but not all references cited in an SRR are included in the evidence synthesis or are PSRs. Based on which PSRs are included in which SRRs, we can construct the inclusion network. The inclusion network is a bipartite network with two types of nodes: one type represents SRRs, and the other represents PSRs. In an inclusion network, if an SRR includes a PSR, there is a directed edge from the SRR to the PSR. The attribute file (report_list.csv) includes attributes of the 82 reports, and the edge list file (inclusion_net_edges.csv) contains the edge list of the inclusion network. Notably, 11 PSRs have never been included in any SRR in the dataset. They are unused PSRs. If visualized with the inclusion network, they will appear as isolated nodes.
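The construction described above can be sketched as follows. This is an illustration only: the column names (report_id, report_type, source, target) and the in-memory sample rows are hypothetical stand-ins, since the actual headers of report_list.csv and inclusion_net_edges.csv are not given here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for report_list.csv and inclusion_net_edges.csv.
# The real column names and values may differ.
REPORTS_CSV = """report_id,report_type
1,SRR
2,SRR
3,PSR
4,PSR
5,PSR
"""
EDGES_CSV = """source,target
1,3
2,3
2,4
"""


def build_inclusion_network(reports_file, edges_file):
    """Return (report_types, included), where included maps SRR -> set of PSRs."""
    report_types = {
        row["report_id"]: row["report_type"] for row in csv.DictReader(reports_file)
    }
    included = defaultdict(set)
    for row in csv.DictReader(edges_file):
        srr, psr = row["source"], row["target"]
        # Bipartite check: every edge must point from an SRR to a PSR.
        if report_types.get(srr) != "SRR" or report_types.get(psr) != "PSR":
            raise ValueError(f"edge {srr} -> {psr} is not SRR -> PSR")
        included[srr].add(psr)
    return report_types, dict(included)


def unused_psrs(report_types, included):
    """PSRs that no SRR includes (isolated nodes in the inclusion network)."""
    used = set().union(*included.values())
    return sorted(r for r, t in report_types.items() if t == "PSR" and r not in used)


report_types, included = build_inclusion_network(
    io.StringIO(REPORTS_CSV), io.StringIO(EDGES_CSV)
)
print(unused_psrs(report_types, included))  # ['5']
```

To run this against the real files, replace the io.StringIO objects with opened file handles and adjust the column names to match the actual headers.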
We used a custom-made workflow (Fu, Y. (2022). Scopus author info tool (1.0.1) [Python]. https://github.com/infoqualitylab/Scopus_author_info_collection ) that uses the Scopus API and manual work to extract and disambiguate authorship information for the 82 reports. The author information file (salt_cont_author.csv) is the product of this workflow and can be used to compute the co-author network of the 82 reports.
We also provide several other files in this dataset. We collected inclusion criteria (the criteria that make a PSR eligible to be included in an SRR) and recorded them in the file systematic_review_inclusion_criteria.csv. We provide a file (potential_inclusion_link.csv) recording whether a given PSR had been published as of the search date of a given SRR, which makes the PSR potentially eligible for inclusion in the SRR. We also provide a bibliography of the 82 publications (supplementary_reference_list.pdf). Lastly, we discovered minor discrepancies between the inclusion relationships identified by Trinquart et al. (2016) and by us. Therefore, we prepared an additional edge list (inclusion_net_edges_trinquart.csv) to preserve the inclusion relationships identified by Trinquart et al. (2016).
<b>UPDATES IN THIS VERSION COMPARED TO V2</b> (Fu, Yuanxi; Hsiao, Tzu-Kun; Joshi, Manasi Ballal (2022): The Salt Controversy Systematic Review Reports and Primary Study Reports Network Dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-6128763_V2)
- We added a new column "pub_date" to report_list.csv
- We corrected mistakes in supplementary_reference_list.pdf for report #28 and report #80. The author of report #28 is not Salisbury D but Khaw, K.-T., & Barrett-Connor, E. Report #80 was mistakenly mixed up with report #81.
keywords:
systematic reviews; evidence synthesis; network analysis; public health; salt controversy;
published:
2025-02-06
Ward, Michael; Tyndel, Stephen; Sperry, Jinelle; Katz, Aron
(2025)
Data from a study on the behavior of blue-winged and golden-winged warblers. We investigated vocalizations and how the species recognize each other. The dataset includes banding data, behavioral data from a playback study, and song data.
keywords:
warblers; songs; species recognition
published:
2025-10-10
Clark, Teresa J.; Schwender, Jorg
(2025)
Upregulation of triacylglycerols (TAGs) in vegetative plant tissues such as leaves has the potential to drastically increase the energy density and biomass yield of bioenergy crops. In this context, constraint-based analysis has the promise to improve metabolic engineering strategies. Here we present a core metabolism model for the C4 biomass crop Sorghum bicolor (iTJC1414) along with a minimal model for photosynthetic CO2 assimilation, sucrose and TAG biosynthesis in C3 plants. Extending iTJC1414 to a four-cell diel model we simulate C4 photosynthesis in mature leaves with the principal photo-assimilatory product being replaced by TAG produced at different levels. Independent of specific pathways and per unit carbon assimilated, energy content and biosynthetic demands in reducing equivalents are about 1.3 to 1.4 times higher for TAG than for sucrose. For plant generic pathways, ATP- and NADPH-demands per CO2 assimilated are higher by 1.3- and 1.5-fold, respectively. If the photosynthetic supply in ATP and NADPH in iTJC1414 is adjusted to be balanced for sucrose as the sole photo-assimilatory product, overproduction of TAG is predicted to cause a substantial surplus in photosynthetic ATP. This means that if TAG synthesis was the sole photo-assimilatory process, there could be an energy imbalance that might impede the process. Adjusting iTJC1414 to a photo-assimilatory rate that approximates field conditions, we predict possible daily rates of TAG accumulation, dependent on varying ratios of carbon partitioning between exported assimilates and accumulated oil droplets (TAG, oleosin) and in dependence of activation of futile cycles of TAG synthesis and degradation. We find that, based on the capacity of leaves for photosynthetic synthesis of exported assimilates, mature leaves should be able to reach a 20% level of TAG per dry weight within one month if only 5% of the photosynthetic net assimilation can be allocated into oil droplets. 
From this we conclude that high TAG levels should be achievable if TAG synthesis is induced only during a final phase of the plant life cycle.
keywords:
Feedstock Production;Modeling
published:
2022-03-09
Rapti, Zoi; Rivera Quinones, Vanessa; Stewart Merrill, Tara
(2022)
MATLAB files for the analysis of an ODE model for disease transmission. The codes may be used to find equilibrium points, study transient dynamics, evaluate the basic reproductive number (R0), and simulate the model when parameters depend on the independent variables. In addition, the codes may be used to perform local sensitivity analysis of R0 on the model parameters.
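To illustrate what a local sensitivity analysis of R0 involves, the sketch below computes normalized sensitivity indices, (∂R0/∂p)(p/R0), by central finite differences. It is written in Python rather than MATLAB and is not taken from the dataset. The R0 formula used here (R0 = beta / gamma, as in a simple SIR-type model) and the parameter values are assumptions for illustration; the model in the dataset may define R0 differently.

```python
def basic_reproductive_number(params):
    """R0 for an assumed simple SIR-type model: transmission rate / recovery rate.

    This formula is a placeholder, not the dataset's model.
    """
    return params["beta"] / params["gamma"]


def normalized_sensitivity(r0_func, params, name, rel_step=1e-6):
    """Normalized local sensitivity (dR0/dp) * (p / R0), via central differences."""
    base = r0_func(params)
    h = params[name] * rel_step
    up = dict(params, **{name: params[name] + h})
    down = dict(params, **{name: params[name] - h})
    derivative = (r0_func(up) - r0_func(down)) / (2.0 * h)
    return derivative * params[name] / base


# Hypothetical parameter values, chosen only for illustration.
params = {"beta": 0.3, "gamma": 0.1}
sensitivities = {
    name: normalized_sensitivity(basic_reproductive_number, params, name)
    for name in params
}
for name, value in sensitivities.items():
    print(name, round(value, 4))
```

For this placeholder formula the indices are approximately +1 for beta and -1 for gamma, meaning that a 1% increase in beta raises R0 by about 1%, while a 1% increase in gamma lowers it by about 1%.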
published:
2019-10-16
Human annotations of randomly selected judged documents from the AP 88-89, Robust 2004, WT10g, and GOV2 TREC collections. Seven annotators were asked to read documents in their entirety and then select up to ten terms they felt best represented the main topic(s) of the document. Terms were chosen from among a set sampled from the document in question and from related documents.
keywords:
TREC; information retrieval; document topicality; document description
published:
2019-09-17
Fraebel, David T.; Kuehn, Seppe
(2019)
BAM files for evolved strains from migration rate selection experiments conducted in low-viscosity (0.2% w/v) agar plates containing M63 minimal medium with 1 mM of mannose, melibiose, N-acetylglucosamine, or galactose.
published:
2019-07-04
Sashittal, Palash; El-Kebir, Mohammed
(2019)
Results generated using SharpTNI on data collected from the 2014 Ebola outbreak in Sierra Leone.
published:
2019-12-03
These are the alignments of transcriptome data used for the analysis of members of Heteroptera. This dataset is analyzed in "Deep instability in the phylogenetic backbone of Heteroptera is only partly overcome by transcriptome-based phylogenomics" published in Insect Systematics and Diversity.
keywords:
Heteroptera; Hemiptera; Phylogenomics; transcriptome
published:
2023-03-27
Littlefield, Alexander; Xie, Dajie; Richards, Corey; Ocier, Christian; Gao, Haibo; Messinger, Jonah; Ju, Lawrence; Gao, Jingxing; Edwards, Lonna; Braun, Paul; Goddard, Lynford
(2023)
This dataset contains the full data used in the paper titled "Enabling High Precision Gradient Index Control in Subsurface Multiphoton Lithography," available at https://doi.org/10.1021/acsphotonics.2c01950 .
The data used for Table 1 can be found in the dataset for the related Figure 8.
Some supplemental figures' data can be found in the main figures data:
Figure S2's data is contained in Figure 6.
Figure S4 and Table S1 data is derived from Figure 6.
Figure S9 is derived from Figure 7.
Figure S10 is contained in Figure 7.
Figure S12 is derived from Figure 6 and the Python code prism-fringe-analysis.
Figures without a data file named after them do not have any data affiliated with them and are purely graphical representations.
published:
2021-11-23
Riemer, Nicole; Yao, Yu; Dawson, Matthew; Dabdub, Donald
(2021)
This dataset contains simulation results from PartMC-MOSAIC-CAPRAM used in the article "Evaluating the impacts of cloud processing on resuspended aerosol particles after cloud evaporation using a particle-resolved model".
This V2 contains eight folders: one for the urban plume simulation, which provides the initial particle population for cloud processing; four for the four simulated cloud cycles; two for the coagulation cases; and one for the low-pollution case. The urban plume folder contains 25 NetCDF files, output hourly from the PartMC-MOSAIC simulation, with the gas and particle information. Each of the four cloud cycle folders contains 25 subdirectories that hold the cloud processing results for the aerosol populations from the urban plume environment. Each subdirectory contains 31 NetCDF files, output every minute from the PartMC-MOSAIC-CAPRAM simulations, with aerosol and gas information after aqueous chemistry. The two coagulation folders cover the cases that consider Brownian coagulation and sedimentation coalescence. Each contains 93 NetCDF files, produced by repeating the 30-minute simulation three times to account for the randomness of coagulation. The low-pollution case folder includes the simulated cloud processing results for 25 urban plume cases with lower aerosol number concentrations. This dataset was used to investigate the effects of cloud processing on aerosol mixing state and CCN properties.
keywords:
cloud process; coagulation; aqueous chemistry; aerosol mixing state; CCN
published:
2021-02-18
Wang, Shaowen; Lyu, Fangzheng; Wang, Shaohua; Catlet, Charles; Padmanabhan, Anand; Soltani, Kiumars
(2021)
Increasingly pervasive location-aware sensors interconnected with rapidly advancing wireless network services are motivating the development of near-real-time urban analytics. This development has revealed both tremendous challenges and opportunities for scientific innovation and discovery. However, state-of-the-art urban discovery and innovation are not well equipped to resolve the challenges of such analytics, which in turn limits new research questions from being asked and answered. Specifically, commonly used urban analytics capabilities are typically designed to handle, process, and analyze static datasets that can be treated as map layers and are consequently ill-equipped in (a) resolving the volume and velocity of urban big data; (b) meeting the computing requirements for processing, analyzing, and visualizing these datasets; and (c) providing concurrent online access to such analytics. To tackle these challenges, we have developed a novel cyberGIS framework that includes computationally reproducible approaches to streaming urban analytics. This framework is based on CyberGIS-Jupyter, through integration of cyberGIS and real-time urban sensing, for achieving capabilities that have previously been unavailable toward helping cities solve challenging urban informatics problems.
The files included in this dataset function as follows:
1) Spatial_interpolation.ipynb is a Python-based Jupyter notebook that enables users to conduct spatial interpolation with AoT data;
2) Urban_Informatics.ipynb is a Jupyter notebook that helps to explore the AoT dataset;
3) chicago-complete.weekly.2019-09-30-to-2019-10-06.tar includes all the high-frequency urban sensing data from AoT sensors from 2019 September 30th to 2019 October 6th collected in Chicago, US;
4) sensors.csv is a processed dataset including information about the temperature in Chicago, and it is used in Spatial_interpolation.ipynb.
keywords:
CyberGIS; Urban informatics; Array of Things
published:
2020-10-20
Romero, Ingrid; Urban, Michael A.; Punyasena, Surangi
(2020)
This dataset includes a total of 501 images of 42 fossil specimens of Striatopollis and 459 specimens of 45 extant species of the tribe Amherstieae-Fabaceae. These images were taken using Airyscan confocal superresolution microscopy at 630X magnification (63x/NA 1.4 oil DIC). The images are in the CZI file format. They can be opened using Zeiss proprietary software (Zen, Zen lite) or in ImageJ. More information on how to open CZI files can be found here: [https://www.zeiss.com/microscopy/us/products/microscope-software/zen/czi.html#microscope---image-data].
keywords:
Striatopollis catatumbus; superresolution microscopy; Cenozoic; tropics; Zeiss; CZI; striate pollen.
published:
2019-10-18
Supporting secondary data used in a manuscript currently in submission regarding the invasion dynamics of the Asian tiger mosquito, Aedes albopictus, in the state of Illinois.
keywords:
albopictus;mosquito
published:
2025-01-15
Suski, Cory; Hay, Allison
(2025)
Data were generated from acoustic transmitters implanted in tournament-caught and non-angled control largemouth bass across multiple seasons. These data were used to quantify post-release movement, behavior, and mortality in response to angling tournaments at different times of year and varying water temperatures.
published:
2025-07-30
Skorupa, A. J.; Bried, J. T.
(2025)
This dataset includes three data files for linking species' climate sensitivity, trait combinations, and listing status. It contains species occurrence data within Hydrologic Unit Code 12 (HUC12) watersheds, along with trait information and Rarity and Climate Sensitivity (RCS) index scores for lotic caddisflies, stoneflies, mussels, dragonflies, and crayfish across all Midwest Climate Adaptation Science Center states: Minnesota, Iowa, Missouri, Wisconsin, Illinois, Indiana, Michigan, and Ohio. For mussels, the geographic scope is expanded to include all Midwest Regional Species of Greatest Conservation Need (RSGCN) states—North Dakota, South Dakota, Nebraska, Kansas, and Kentucky. However, occurrence data for mussels is not included due to data-sharing agreements. Metadata are included with each data file. Please refer to the associated manuscript for original data sources, trait references, and details on the RCS index calculation.
keywords:
climate sensitivity; conservation status; traits; aquatic invertebrates; Midwest
published:
2020-10-15
Khanna, Madhu; Wang, Weiwei; Wang, Michael
(2020)
This dataset consists of various input data that are used in the GAMS model. All the data are in the .inc format, which can be read within GAMS or Notepad. Main data sources include: acreage data (acre), crop budget data ($/acre), crop yield data (e.g. bushel/acre), and soil carbon sequestration data (KgCO2/ha/yr). Model details can be found in the "Assessing the Additional Carbon Savings with Biofuel" and GAMS model package.
## File Description
(1) GAMS Model.zip: This includes all the input files and scripts for running the model
(2) Table*.csv: These files include the data from the tables in the manuscript
(3) Figure2_3_4.csv: This contains the data used to create the figures in the manuscript
(4) BaselineResults.csv: This includes a summary of the model results.
(5) SensitivityResults_*.csv: Model results from the various sensitivity analyses performed
(6) LUC_emission.csv: land use change emissions by crop reporting district for changes of pasturelands to annual crops.
keywords:
Biogenic carbon intensity; Corn ethanol; Economic model; Dynamic optimization; Anticipated baseline approach; Life cycle carbon intensity
published:
2019-07-08
Krichels, Alexander
(2019)
These files contain the data presented in the manuscript entitled "Iron redox reactions can drive microtopographic variation in upland soil carbon dioxide and nitrous oxide emissions".
keywords:
Iron; redox; carbon dioxide; nitrous oxide; chemodenitrification; Feammox; dissimilatory iron reduction; upland soils; flooding; global change
published:
2024-03-25
Xia, Yushu; Kwon, Hoyoung; Wander, Michelle
(2024)
The accompanying study is published under the title "Estimating soil N2O emissions induced by organic and inorganic fertilizer inputs using a Tier-2, regression-based meta-analytic approach for U.S. agricultural lands" in Science of the Total Environment. The study is authored by Dr. Yushu Xia, Dr. Hoyoung Kwon, and Dr. Michelle Wander. The DOI for this study is https://doi.org/10.1016/j.scitotenv.2024.171930.
keywords:
soil; nitrous oxide; agriculture; fertilizers; meta-analysis
published:
2019-05-20
Lao, Yuyang; Schiffer, Peter
(2019)
This is the experimental data of tetris artificial spin ice. The islands are made of Permalloy, with a size of 170 nm by 470 nm by 2.5 nm. The systems are measured at a temperature around room temperature, where the islands are fluctuating. The data is recorded as photoemission electron microscopy intensity. More details about the dataset can be found in the files Note.txt and Tetris_data_list.xlsx.
Note:
Two files, named bl11_teris600_033 and bl11_tetris600_2_135, are not recorded in the Excel sheet because they were corrupted during the measurement. Any data not recorded in the Excel sheet are either corrupted or of low quality.
For files *_028 to *_049, the name is spelled "tetris" (with the "t"), whereas in the raw data folder it is spelled "teris" (without the "t"). This is a typo. Throughout the dataset, "tetris" and "teris" have the same meaning.
keywords:
artificial spin ice
published:
2022-02-11
Hoang, Khanh Linh; Schneider, Jodi; Kansara, Yogeshwar
(2022)
The data contains a list of articles given low scores by the RCT Tagger and an error analysis of them, which were used in a project associated with the manuscript "Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews".
The change made in this V3 is that the data is divided into two parts:
- Error Analysis of 44 Low Scoring Articles with MEDLINE RCT Publication Type.
- Error Analysis of 244 Low Scoring Articles without MEDLINE RCT Publication Type.
keywords:
Cochrane reviews; automation; randomized controlled trial; RCT; systematic reviews
published:
2019-08-05
Skinner, Rachel; Dietrich, Christopher; Walden, Kimberly; Gordon, Eric; Sweet, Andrew; Podsiadlowski, Lars; Petersen, Malte; Simon, Chris; Takiya, Daniela; Johnson, Kevin
(2019)
The data in this directory corresponds to:
Skinner, R.K., Dietrich, C.H., Walden, K.K.O., Gordon, E., Sweet, A.D., Podsiadlowski, L., Petersen, M., Simon, C., Takiya, D.M., and Johnson, K.P.
Phylogenomics of Auchenorrhyncha (Insecta: Hemiptera) using Transcriptomes: Examining Controversial Relationships via Degeneracy Coding and Interrogation of Gene Conflict.
Systematic Entomology.
Correspondence should be directed to: Rachel K. Skinner, rskinn2@illinois.edu
If you use these data, please cite our paper in Systematic Entomology.
The following files can be found in this dataset:
Amino_acid_concatenated_alignment.phy: the amino acid alignment used in this analysis in phylip format.
Amino_acid_raxml_partitions.txt (for reference only): the partitions for the amino acid alignment, but a partitioned amino acid analysis was not performed in this study.
Amino_acid_concatenated_tree.newick: the best maximum likelihood tree with bootstrap values in newick format.
ASTRAL_input_gene_trees.tre: the concatenated gene tree input file for ASTRAL
README_pie_charts.md: explains the scripts and data needed to recreate the pie charts figure from our paper. There is also another
Corresponds to the following files:
ASTRAL_species_tree_EN_only.newick: the species tree with only effective number (EN) annotation
ASTRAL_species_tree_pp1_only.newick: the species tree with only the posterior probability 1 (main topology) annotation
ASTRAL_species_tree_q1_only.newick: the species tree with only the quartet scores for the main topology (q1)
ASTRAL_species_tree_q2_only.newick: the species tree with only the quartet scores for the first alternative topology (q2)
ASTRAL_species_tree_q3_only.newick: the species tree with only the quartet scores for the second alternative topology (q3)
print_node_key_files.py: script needed to create the following files:
node_keys.key: text file with node IDs and topologies
complete_q_scores.key: text file with node IDs and multiplied q scores
EN_node_vals.key: text file with node IDs and EN values
create_pie_charts_tree.py: script needed to visualize the tree with pie charts, pp1, and EN values plotted at nodes
ASTRAL_species_tree_full_annotation.newick: the species tree with full annotation from the ASTRAL analysis.
NOTE: It may be more useful to examine individual value files if you want to visualize the tree,
e.g., in figtree, since the full annotations are extensive and can make viewing difficult.
Complete_NT_concatenated_alignment.phy: the nucleotide alignment that includes unmodified third codon positions. The alignment is in phylip format.
Complete_NT_raxml_partitions.txt: the raxml-style partition file of the nucleotide partitions
Complete_NT_concatenated_tree.newick: the best maximum likelihood tree from the concatenated complete NT analysis with bootstrap values in newick format
Complete_NT_partitioned_tree.newick: the best maximum likelihood tree from the partitioned complete NT analysis with bootstrap values in newick format
Degeneracy_coded_nt_concatenated_alignment.phy: the degeneracy coded nucleotide alignment in phylip format
Degeneracy_coded_nt_raxml_partitions.txt: the raxml-style partition file for the degeneracy coded nucleotide alignment
Degeneracy_coded_nt_concatenated_tree.newick: the best maximum likelihood tree from the degeneracy-coded concatenated analysis with bootstrap values in newick format
Degeneracy_coded_nt_partitioned_tree.newick: the best maximum likelihood tree from the degeneracy-coded partitioned analysis with bootstrap values in newick format
count_ingroup_taxa.py: script that counts the number of ingroup and/or outgroup taxa present in an alignment
keywords:
Auchenorrhyncha; Hemiptera; alignment; trees
published:
2025-10-10
Dong, Chang; Shi, Zhuwei; Huang, Lei; Zhao, Huimin; Xu, Zhinan; Lian, Jiazhang
(2025)
The mitochondrion is generally considered the most promising subcellular organelle for compartmentalization engineering. Much progress has been made in reconstituting whole metabolic pathways in the mitochondria of yeast to harness the precursor pools (i.e., pyruvate and acetyl-CoA), bypass competing pathways, and minimize transportation limitations. However, only a few mitochondrial targeting sequences (MTSs) have been characterized (i.e., MTS of COX4), limiting the application of compartmentalization engineering for multigene biosynthetic pathways in the mitochondria of yeast. In the present study, based on the mitochondrial proteome, a total of 20 MTSs were cloned, and the efficiency of these MTSs in targeting heterologous proteins, including the Escherichia coli FabI and enhanced green fluorescence protein (EGFP), into the mitochondria was evaluated by growth complementation and confocal microscopy. After systematic characterization, six of the well-performing MTSs were chosen for the colocalization of complete biosynthetic pathways into the mitochondria. As proof of concept, the full α-santalene biosynthetic pathway consisting of 10 expression cassettes capable of converting acetyl-CoA to α-santalene was compartmentalized into the mitochondria, leading to a 3.7-fold improvement in the production of α-santalene. The newly characterized MTSs should contribute to the expanded metabolic engineering and synthetic biology toolbox for yeast mitochondrial compartmentalization engineering.
keywords:
Conversion;Metabolic Engineering