Illinois Data Bank Dataset Search Results
published:
2018-10-17
Price, Edward; Spyreas, Greg; Matthews, Jeffrey
(2018)
This is the dataset used in the Ecological Applications publication of the same name. This dataset consists of the following files:
Internal.Community.Data.txt
Regional.Community.Data.txt
Site.Attributes.txt
Year.Of.Final.Bio.Monitoring.txt
Internal.Community.Data.txt is a site and plot by species matrix. Column labeled SITE consists of site IDs. Column labeled Plot consists of Plot numbers. All other columns represent species relative abundances per plot.
Regional.Community.Data.txt is a site by species matrix of relative abundances. Column labeled site consists of site IDs. All other columns represent species relative abundances per site.
Site.Attributes.txt is a matrix of site attributes. Column labeled SITE consists of site IDs. Column labeled Long represents longitude in decimal degrees. Column labeled Lat represents latitude in decimal degrees. Column labeled Richness represents species richness of sites calculated from Regional Community Data. Column labeled NAT_COMP_REST represents designation as a randomly selected natural wetland (NAT), compensation wetland (COMP) or reference quality natural wetland (REF).
Column labeled HQ_LQ_COMP represents designation as high quality (HQ), low quality (LQ) or compensation wetland (COMP). Column labeled SAMPLING_YEAR_INTERNAL represents year data used for analysis of internal β-diversity was gathered. Column labeled SAMPLING_YEAR_REGIONAL represents year data used for analysis of regional β-diversity was gathered. Column labeled TRANSECT_LENGTH represents length in meters of initial sampling transect. INAI_GRADE represents Illinois Natural Areas Inventory grades assigned to each site. Grades range from A for highest quality natural areas to E for lowest quality natural areas.
Year.Of.Final.Bio.Monitoring.txt is a table representing years of final monitoring of compensation wetlands as mandated by the US Army Corps of Engineers. Column labeled Site consists of site IDs. Column labeled YR_FIN_BIO_MON consists of years of final monitoring. Entries of N/A represent dates that were unable to be located.
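The Richness column is described as being calculated from the regional site by species matrix. A minimal sketch of that calculation follows, assuming a tab-delimited file with a header row and counting a species as present when its relative abundance is greater than zero. The delimiter, the site IDs, and the species names in the sample are assumptions for illustration and are not taken from the deposited files.

```python
import csv
import io

# Hypothetical excerpt in the layout described for Regional.Community.Data.txt:
# a "site" column followed by one relative-abundance column per species.
# The tab delimiter, site IDs, and species names are assumptions.
SAMPLE = (
    "site\tCarex_sp\tTypha_sp\tPhalaris_sp\n"
    "S01\t0.50\t0.50\t0.00\n"
    "S02\t0.20\t0.30\t0.50\n"
)


def site_richness(text, id_column="site"):
    """Return {site ID: number of species with relative abundance > 0}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    richness = {}
    for row in reader:
        site = row.pop(id_column)  # remaining keys are species columns
        richness[site] = sum(1 for value in row.values() if float(value) > 0)
    return richness


richness = site_richness(SAMPLE)
print(richness)
```

For the internal matrix, the same idea would apply per site and plot, with both the SITE and Plot columns removed before counting.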
More information about this dataset: Interested parties can request data from the Critical Trends Assessment Program, which was the source for data on naturally occurring wetlands in this study. More information on the program and data requests can be obtained by visiting the program webpage. Critical Trends Assessment Program, Illinois Natural History Survey. http://wwx.inhs.illinois.edu/research/ctap/
keywords:
biodiversity; wetlands; wetland mitigation; biotic homogenization; beta diversity
published:
2018-12-04
Wang, Yang; Dietrich, Christopher; Zhang, Yalin
(2018)
The text file contains the original data used in the phylogenetic analyses of Wang et al. (2017: Scientific Reports 7:45387). The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The first six lines of the file identify the file as NEXUS, indicate that the file contains data for 81 taxa (species) and 2905 characters, indicate that the first 2805 characters are DNA sequence and the last 100 are morphological, that the data may be interleaved (with data for one species on multiple rows), that gaps inserted into the DNA sequence alignment are indicated by a dash, and that missing data are indicated by a question mark. The file contains aligned nucleotide sequence data for 5 gene regions and 100 morphological characters. The identity and positions of data partitions are indicated in the mrbayes block of commands for the phylogenetic program MrBayes at the end of the file. The mrbayes block also contains instructions for MrBayes on various non-default settings for that program. These are explained in the original publication. Descriptions of the morphological characters and more details on the species and specimens included in the dataset are provided in the supplementary document included as a separate pdf. The original raw DNA sequence data are available from NCBI GenBank under the accession numbers indicated in the supplementary file.
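The header layout described above can be checked programmatically before running an analysis. The sketch below parses an illustrative header that was reconstructed from the description (81 taxa, 2905 characters, DNA in positions 1 to 2805, morphology in the last 100, dash for gaps, question mark for missing data). The exact wording of the deposited file may differ, and the `mixed(...)` partition syntax shown is an assumption based on the MrBayes convention.

```python
import re

# Illustrative header only: reconstructed from the description above, not
# copied from the deposited file. Exact wording in the real file may differ.
HEADER = """#NEXUS
begin data;
dimensions ntax=81 nchar=2905;
format datatype=mixed(DNA:1-2805,Standard:2806-2905) interleave=yes gap=- missing=?;
matrix
"""


def read_header(text):
    """Extract taxon/character counts, symbols, and partitions from a header."""
    info = {}
    dims = re.search(r"ntax\s*=\s*(\d+)\s+nchar\s*=\s*(\d+)", text, re.IGNORECASE)
    info["ntax"], info["nchar"] = int(dims.group(1)), int(dims.group(2))
    info["gap"] = re.search(r"gap\s*=\s*(\S)", text, re.IGNORECASE).group(1)
    info["missing"] = re.search(r"missing\s*=\s*(\S)", text, re.IGNORECASE).group(1)
    # Each partition is written as TYPE:first-last inside mixed(...).
    info["partitions"] = [
        (name, int(first), int(last))
        for name, first, last in re.findall(r"(\w+):(\d+)-(\d+)", text)
    ]
    return info


header = read_header(HEADER)
print(header)
```

A useful sanity check is that the partition lengths add up to the declared number of characters.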
keywords:
phylogeny; DNA sequence; morphology; Insecta; Hemiptera; Cicadellidae; leafhopper; evolution; 28S rDNA; wingless; histone H3; cytochrome oxidase I; bayesian analysis
published:
2025-10-30
Cao, Dang Viet; Luo, Guangbin; Korynta, Shelby; Liu, Hui; Liang, Yuanxue; Shanklin, John; Altpeter, Fredy
(2025)
Metabolic engineering for hyperaccumulation of lipids in vegetative tissues is a novel strategy for enhancing energy density and biofuel production from biomass crops. Energycane is a prime feedstock for this approach due to its high biomass production and resilience under marginal conditions. DIACYLGLYCEROL ACYLTRANSFERASE (DGAT) catalyzes the last and only committed step in the biosynthesis of triacylglycerol (TAG) and can be a rate-limiting enzyme for the production of TAG. In this study, we explored the effect of intron-mediated enhancement (IME) on the expression of DGAT1 and resulting accumulation of TAG and total fatty acid (TFA) in leaf and stem tissues of energycane. To maximize lipid accumulation these evaluations were carried out by co-expressing the lipogenic transcription factor WRINKLED1 (WRI1) and the TAG protect factor oleosin (OLE1). Including an intron in the codon-optimized TmDGAT1 elevated the accumulation of its transcript in leaves by seven times on average based on 5 transgenic lines for each construct. Plants with WRI1 (W), DGAT1 with intron (Di), and OLE1 (O) expression (WDiO) accumulated TAG up to 3.85% of leaf dry weight (DW), a 192-fold increase compared to non-modified energycane (WT) and a 3.8-fold increase compared to the highest accumulation under the intron-less gene combination (WDO). This corresponded to TFA accumulation of up to 8.4% of leaf dry weight, a 2.8-fold or 6.1-fold increase compared to WDO or WT, respectively. Co-expression of WDiO resulted in stem accumulations of TAG up to 1.14% of DW or TFA up to 2.08% of DW that exceeded WT by 57-fold or 12-fold and WDO more than twofold, respectively. Constitutive expression of these lipogenic “push pull and protect” factors correlated with biomass reduction. Intron-mediated enhancement (IME) of the expression of DGAT resulted in a step change in lipid accumulation of energycane and confirmed that under our experimental conditions it is rate limiting for lipid accumulation.
IME should be applied to other lipogenic factors and metabolic engineering strategies. The findings from this study may be valuable in developing a high biomass feedstock for commercial production of lipids and advanced biofuels.
keywords:
Feedstock Production;Lipidomics;Metabolomics
published:
2018-04-23
Provides links to Author-ity 2009, including records from principal investigators (on NIH and NSF grants), inventors on USPTO patents, and students/advisors on ProQuest dissertations.
Note that NIH and NSF differ in the type of fields they record and standards used (e.g., institution names). Typically an NSF grant spanning multiple years is associated with one record, while an NIH grant occurs in multiple records, for each fiscal year, sub-projects/supplements, possibly with different principal investigators.
The prior probability of match (i.e., that the author exists in Author-ity 2009) varies dramatically across NIH grants, NSF grants, and USPTO patents. The great majority of NIH principal investigators have one or more papers in PubMed but a minority of NSF principal investigators (except in biology) have papers in PubMed, and even fewer USPTO inventors do. This prior probability has been built into the calculation of match probabilities.
The NIH data were downloaded from NIH exporter and the older NIH CRISP files. The dataset has 2,353,387 records, only includes ones with match probability > 0.5, and has the following 12 fields:
1 app_id
2 nih_full_proj_nbr
3 nih_subproj_nbr
4 fiscal_year
5 pi_position
6 nih_pi_names
7 org_name
8 org_city_name
9 org_bodypolitic_code
10 age: number of years since their first paper
11 prob: the match probability to au_id
12 au_id: Author-ity 2009 author ID
The NSF dataset has 262,452 records, only includes ones with match probability > 0.5, and the following 10 fields:
1 AwardId
2 fiscal_year
3 pi_position
4 PrincipalInvestigators
5 Institution
6 InstitutionCity
7 InstitutionState
8 age: number of years since their first paper
9 prob: the match probability to au_id
10 au_id: Author-ity 2009 author ID
There are two files for USPTO because here we linked disambiguated authors in PubMed (from Author-ity 2009) with disambiguated inventors.
The USPTO linking dataset has 309,720 records, only includes ones with match probability > 0.5, and has the following 3 fields:
1 au_id: Author-ity 2009 author ID
2 inv_id: USPTO inventor ID
3 prob: the match probability of au_id vs inv_id
The disambiguated inventors file (uiuc_uspto.tsv) has 2,736,306 records, and has the following 7 fields:
1 inv_id: USPTO inventor ID
2 is_lower
3 is_upper
4 fullnames
5 patents: patent IDs separated by '|'
6 first_app_yr
7 last_app_yr
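The two USPTO files are meant to be used together: the linking file maps au_id to inv_id with a match probability, and the inventors file lists each inventor's patents separated by '|'. A minimal sketch of that join follows, assuming tab-separated rows in the listed field order with no header line. The rows, IDs, and the stricter 0.9 threshold are invented for illustration, and the is_lower and is_upper values are placeholders because their meaning is not documented above.

```python
import csv
import io

# Hypothetical rows in the field orders listed above; IDs and values are
# invented, and tab-separated rows without a header line are an assumption.
LINKS = "au_1\tinv_9\t0.97\nau_2\tinv_7\t0.61\n"
INVENTORS = (
    "inv_9\tdoe\tDOE\tDoe, Jane\tP100|P200\t1999\t2004\n"
    "inv_7\troe\tROE\tRoe, Sam\tP300\t2001\t2001\n"
)


def patents_by_author(links_text, inventors_text, min_prob=0.9):
    """Map au_id to patent IDs for links at or above min_prob."""
    patents = {}
    for row in csv.reader(io.StringIO(inventors_text), delimiter="\t"):
        inv_id, patent_field = row[0], row[4]
        patents[inv_id] = patent_field.split("|")  # patents are '|'-separated
    result = {}
    for au_id, inv_id, prob in csv.reader(io.StringIO(links_text), delimiter="\t"):
        if float(prob) >= min_prob and inv_id in patents:
            result.setdefault(au_id, []).extend(patents[inv_id])
    return result


linked = patents_by_author(LINKS, INVENTORS)
print(linked)
```

Because the released files already exclude matches with probability of 0.5 or below, a threshold only matters when a stricter cutoff is wanted.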
keywords:
PubMed; USPTO; Principal investigator; Name disambiguation
published:
2021-05-10
Zheng, Zhonghua; Zhao, Lei; Oleson, Keith
(2021)
This dataset contains the emulated global multi-model urban daily temperature projections under the RCP 8.5 scenario. The dataset is derived from the study "Large model structural uncertainty in global projections of urban heat waves" (XXXX). Details about this dataset and the local urban climate emulator are described in the article. This dataset documents the global urban daily temperatures of 17 CMIP5 Earth system models for 2006-2015 and 2061-2070. This dataset may be useful for multiple communities regarding urban climate change, heat waves, impacts, vulnerability, risks, and adaptation applications.
keywords:
Urban heat waves; CMIP; urban warming; heat stress; urban climate change
published:
2025-10-10
Wu, Zong-Yen; Sun, Wan; Shen, Yihui; Pratas, Jimmy; Suthers, Patrick F.; Hsieh, Ping-Hung; Dwaraknath, Sudharsan; Rabinowitz, Joshua D.; Maranas, Costas D.; Shao, Zengyi; Yoshikuni, Yasuo
(2025)
Methyl methacrylate (MMA) is an important petrochemical with many applications. However, its manufacture has a large environmental footprint. Combined biological and chemical synthesis (semisynthesis) may be a promising alternative to reduce both cost and environmental impact, but strains that can produce the MMA precursor (citramalate) at low pH are required. A non-conventional yeast, Issatchenkia orientalis, may prove ideal, as it can survive extremely low pH. Here, we demonstrate the engineering of I. orientalis for citramalate production. Using sequence similarity network analysis and subsequent DNA synthesis, we selected a more active citramalate synthase gene (cimA) variant for expression in I. orientalis. We then adapted a piggyBac transposon system for I. orientalis that allowed us to simultaneously explore the effects of different cimA gene copy numbers and integration locations. A batch fermentation showed the genome-integrated-cimA strains produced 2.0 g/L citramalate in 48 h and a yield of up to 7% mol citramalate/mol consumed glucose. These results demonstrate the potential of I. orientalis as a chassis for citramalate production.
keywords:
Conversion;Metabolomics
published:
2025-09-18
Kurambhatti, Chinmay V.; Kumar, Deepak; Singh, Vijay
(2025)
Use of corn fractionation techniques in the dry grind process increases the number of coproducts, enhances their quality and value, generates feedstock for cellulosic ethanol production and potentially increases profitability of the dry grind process. The aim of this study is to develop process simulation models for eight different wet and dry corn fractionation techniques recovering germ, pericarp fiber and/or endosperm fiber, and evaluate their techno-economic feasibility at the commercial scale. Ethanol yields for plants processing 1113.11 MT corn/day were 37.2 to 40 million gal for wet fractionation and 37.3 to 31.3 million gal for dry fractionation, compared to 40.2 million gal for the conventional dry grind process. Capital costs were higher for wet fractionation processes ($92.85 to $97.38 million) in comparison to conventional ($83.95 million) and dry fractionation ($83.35 to $84.91 million) processes. Due to the high value of coproducts, ethanol production costs in most fractionation processes ($1.29 to $1.35/gal) were lower than in the conventional ($1.36/gal) process. Internal rate of return for most of the wet (6.88 to 8.58%) and dry fractionation (6.45 to 7.04%) processes was higher than for the conventional (6.39%) process. The wet fractionation process designed for germ and pericarp fiber recovery was the most profitable among the processes.
keywords:
Conversion;Feedstock Bioprocessing;Modeling
published:
2025-10-17
Mou, Quanbing; Xue, Xueyi; Ma, Yuan; Banik, Mandira; Garcia, Valeria; Guo, Weijie; Wang, Jiang; Song, Tingjie; Chen, Li-Qing; Lu, Yi
(2025)
DNA aptamers have been widely used as biosensors for detecting a variety of targets. Despite decades of success, they have not been applied to monitor any targets in plants, even though plants are a major platform for providing oxygen, food, and sustainable products ranging from energy fuels to chemicals, and high-value products such as pharmaceuticals. A major barrier to progress is a lack of efficient methods to deliver DNA into plant cells. We herein report a thiol-mediated uptake method that more efficiently delivers DNA into Arabidopsis and tobacco leaf cells than another state-of-the-art method, DNA nanostructures. Such a method allowed efficient delivery of a glucose DNA aptamer sensor into Arabidopsis for sensing glucose. This demonstration opens a new avenue to apply DNA aptamer sensors for functional studies of various targets, including metabolites, plant hormones, metal ions, and proteins in plants for a better understanding of the biodistribution and regulation of these species and their functions.
keywords:
Conversion;Feedstock Production;Genomics
published:
2018-12-06
Krishnankutty, Sindhu; Dietrich, Christopher; Dai, Wu; Siddappaji, Madhura
(2018)
The text file contains the original DNA sequence data used in the phylogenetic analyses of Krishnankutty et al. (2016: Systematic Entomology 41: 580–595). The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The file contains five separate data blocks, one for each character partition (28S, histone H3, 12S, indels, and morphology) for 53 taxa (species). Gaps inserted into the DNA sequence alignment are indicated by a dash, and missing data are indicated by a question mark. The separate "indels1" block includes 40 indels (insertions/deletions) from the 28S sequence alignment re-coded using the modified complex indel coding scheme, as described in the "Materials and methods" of the original publication. The DIMENSIONS statements near the beginning of each block indicate the numbers of taxa (NTax) and characters (NChar). The file contains aligned nucleotide sequence data for 3 gene regions and 40 morphological characters. The file is configured for use with the maximum likelihood-based phylogenetic program GARLI but can also be parsed by any other bioinformatics software that supports the NEXUS format. Descriptions of the morphological characters and more details on the species and specimens included in the dataset are provided in the supplementary document included as a separate pdf. The original raw DNA sequence data are available from NCBI GenBank under the accession numbers indicated in the supporting pdf file. More details on individual analyses are provided in the original publication.
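Because this file holds several data blocks, each with its own DIMENSIONS statement, a quick consistency check is to list the NTax and NChar values per block and confirm that every block covers the same number of taxa. The sketch below runs on an illustrative skeleton: the block layout and the 1200-character count are placeholders, while the 53 taxa and the 40 recoded indels come from the description above.

```python
import re

# Illustrative skeleton only: block layout and the 1200-character count are
# placeholders; ntax=53 and the 40 recoded indels come from the description.
SKELETON = """#NEXUS
begin characters;
  dimensions ntax=53 nchar=1200;
end;
begin characters;
  title indels1;
  dimensions ntax=53 nchar=40;
end;
"""


def block_dimensions(text):
    """Return one (ntax, nchar) pair per DIMENSIONS statement, in file order."""
    pattern = re.compile(
        r"dimensions\s+ntax\s*=\s*(\d+)\s+nchar\s*=\s*(\d+)", re.IGNORECASE
    )
    return [(int(ntax), int(nchar)) for ntax, nchar in pattern.findall(text)]


def same_taxa_everywhere(dimensions):
    """True when every block declares the same number of taxa."""
    return len({ntax for ntax, _ in dimensions}) == 1


dims = block_dimensions(SKELETON)
print(dims, same_taxa_everywhere(dims))
```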
keywords:
phylogeny; DNA sequence; morphology; Insecta; Hemiptera; Cicadellidae; leafhopper; evolution; 28S rDNA; histone H3; 12S mtDNA; maximum likelihood
published:
2022-09-07
Long, Stephen P.; Wang, Yu; Stutz, Samantha S.
(2022)
We developed a new application of isotopic gas exchange which couples a tunable diode laser absorption spectroscope (TDL) with a leaf gas exchange system, analyzing leakiness through induction of C4 photosynthesis on dark to high-light transitions. The youngest fully expanded leaf was measured on 40-45 day-old maize (B73) and sorghum (Tx430).
Detailed definitions of each variable in the raw Li-6400XT and Li-6800 data (in "Original_data_AND_Data_processing_code.zip") are summarized at: https://www.licor.com/env/support/LI-6800/topics/symbols.html#const
keywords:
leakiness; bundle sheath leakage; C4 photosynthesis; photosynthetic induction; non-steady-state photosynthesis; carbon isotope discrimination; photosynthetic efficiency; corn
published:
2022-11-07
Sweedler, Jonathan; Castro, Daniel
(2022)
The dataset contains the data and code for Single-cell and Subcellular Analysis of freshly isolated cultured, uncultured P1 cells and uncultured Old cells. The .csv file named 'MagLab20220721' contains the sample and intensity information with the columns referring to the m/z values and the rows being the samples. The 'MagLabNameINdex.csv' file contains all the index information. The file named '20220721_MagLab.spydata' contains the loaded data of both previous files in Spyder. The .mat file contains the aligned data for the three groups.
keywords:
Single-cell; Subcellular; Mass Spectrometry; MALDI; Lipidomics; FTICR; 21 T
published:
2025-11-26
Maitra, Shraddha; Singh, Vijay
(2025)
5-hydroxymethyl furfural (HMF) and furfurals are DOE-listed platform chemicals that can be derived from the renewable carbon in the lignocellulosic biomasses and have the potential to replace petroleum-derived alternatives. High substrate cost and use of expensive solvents limit the economic feasibility of bio-based HMF production on an industrially relevant scale. The study presents an experimentally optimized condition that maximizes the chemical-free production of HMF and furfurals without lowering the yield of total fermentable sugars from Saccharum bagasse. Hydrothermal pretreatment at 210 °C for 15 min yielded approximately 10%, 12%, and 46% of HMF, furfurals, and fermentable sugars per gram of dry biomass, respectively. Additionally, the study proposes a consolidated bioprocess model to produce and recover four high-value bioproducts, i.e., HMF, furfurals, ethanol, and acetic acid, based on the experimental results and evaluates its technoeconomic feasibility considering HMF as the main product. The minimum selling price (MSP) of HMF was estimated to be 930.6 USD/t, which is competitive with its petroleum-derived precursor alternative p-xylene (1,113 USD/t). The sensitivity analysis performed for the process parameters suggests that pretreatment cost and revenues from coproducts immensely influence the MSP of HMF. The preliminary technoeconomic analysis performed on the consolidated bioprocess design indicates that additional revenue streams from diversified coproducts in biorefineries aid in lowering the MSP of high-value bioproducts.
keywords:
Conversion;Economics
published:
2021-01-04
Zhao, Lei; Oleson, Keith; Bou-Zeid, Elie; Krayenhoff, Eric Scott; Bray, Andrew; Zhu, Qing; Zheng, Zhonghua; Chen, Chen; Oppenheimer, Michael
(2021)
This dataset contains the emulated global multi-model urban climate projections under RCP 8.5 and RCP 4.5 used in the article "Global multi-model projections of local urban climates" (https://www.nature.com/articles/s41558-020-00958-8). Details about this dataset and the local urban climate emulator are described in the article. This dataset documents the monthly mean projections of urban temperatures and urban relative humidity of 26 CMIP5 Earth system models (ESMs) from 2006 to 2100 across the globe. This dataset may be useful for multiple communities regarding urban climate change, impacts, vulnerability, risks, and adaptation applications.
keywords:
Urban climate; multi-model climate projections; CMIP; urban warming; heat stress
published:
2022-10-10
Varela, Sebastian; Leakey, Andrew; Sacks, Erik
(2022)
Aerial imagery used as input in the manuscript "Deep convolutional neural networks exploit high spatial and temporal resolution aerial imagery to predict key traits in miscanthus". Data were collected over M. sacchariflorus and M. sinensis breeding trials at the Energy Farm, UIUC, in 2020. Flights were performed using a DJI M600 mounted with a Micasense Rededge multispectral sensor at 20 m altitude around solar noon. Imagery is available as tif files by field trial and date (10). The post-processing of raw images into orthophotos was performed in Agisoft Metashape software. Each crop surface model and multispectral orthophoto was stacked into a unique raster stack by date and uploaded here. Each raster stack includes 6 layers in the following order: Layer 1 = crop surface model, Layer 2 = Blue, Layer 3 = Green, Layer 4 = Red, Layer 5 = Rededge, and Layer 6 = NIR multispectral bands. Msa raster stacks were resampled to 1.67 cm spatial resolution and Msi raster stacks were resampled to 1.41 cm spatial resolution to ease their integration into further analysis. 'MMDDYYYY' is the date of data collection, 'MSA' is the M. sacchariflorus trial, 'MSI' is the M. sinensis trial, 'CSM' is the crop surface model layer, and 'MULTSP' are the five multispectral bands.
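The stated layer order can be encoded once so that downstream code refers to layers by name rather than by position. The sketch below is a plain-Python illustration of that mapping for a single pixel. The helper names and the pixel values are hypothetical, and reading the tif files themselves (which would need a raster library) is left out.

```python
# Layer order as stated in the description (Layer 1 first).
LAYERS = ("crop surface model", "Blue", "Green", "Red", "Rededge", "NIR")


def layer_index(name):
    """Return the 0-based array index for a named layer of a raster stack."""
    return LAYERS.index(name)


def split_stack(pixel):
    """Split one 6-value pixel into its CSM value and five multispectral bands."""
    if len(pixel) != len(LAYERS):
        raise ValueError("expected 6 layers per pixel")
    csm = pixel[0]
    bands = dict(zip(LAYERS[1:], pixel[1:]))
    return csm, bands


# Hypothetical pixel values, for illustration only.
csm, bands = split_stack([1.82, 0.03, 0.06, 0.04, 0.21, 0.45])
print(csm, bands["NIR"])
```

Note that the description numbers layers from 1, while most array libraries index from 0, so Layer 6 (NIR) sits at index 5.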
keywords:
convolutional neural networks; miscanthus; perennial grasses; bioenergy; field phenotyping; remote sensing; UAV
published:
2023-12-13
Corbicula spp. are one of the most prolific aquatic invasive species in the world and can have negative effects on aquatic ecosystems. We performed qualitative field surveys, examined literature accounts and natural history museum holdings, and accessed citizen science data sources to document the distribution of Corbicula in Mexico and shared drainages. Through 26 publications (N = 127 records), 312 museum holdings, and 446 iNaturalist records, we documented 885 records pertaining to Corbicula in Mexico and shared drainages. The first record of the species in Mexico was in 1969, and it has since been reported from 26 of the 32 Mexican states and most of the major river basins throughout the country. However, we suggest Corbicula is more prevalent in Mexico than we report in this work as it is often undersampled and underreported.
keywords:
Corbicula; exotic species; invasive species; Asian Clams; Bivalvia; freshwater systems
published:
2025-09-29
Guo, Zhihui; Xu, Meilan; Nagano, Hironori; Clark, Lindsay; Sacks, Erik; Yamada, Toshihiko
(2025)
The optimal flowering time for the bioenergy crop miscanthus is essential for environmental adaptability and biomass accumulation. However, little is known about how genes controlling flowering in other grasses contribute to flowering regulation in miscanthus. Here, we report on the sequence characterization and gene expression of Miscanthus sinensis Ghd8, a transcription factor encoding a HAP3/NF-YB DNA-binding domain, which has been identified as a major quantitative trait locus in rice, with pleiotropic effects on grain yield, heading date and plant height. In M. sinensis, we identified two homoeologous loci, MsiGhd8A located on chromosome 13 and MsiGhd8B on chromosome 7, with one on each of this paleo-allotetraploid species’ subgenomes. A total of 46 alleles and 28 predicted protein sequence types were identified in 12 wild-collected accessions. Several variants of MsiGhd8 showed a geographic and latitudinal distribution. Quantitative real-time PCR revealed that MsiGhd8 was expressed under both long days and short days, and MsiGhd8B showed a significantly higher expression than MsiGhd8A. The comparison between flowering time and gene expression indicated that MsiGhd8B affected flowering time in response to day length for some accessions. This study provides insight into the conserved function of Ghd8 in the Poaceae, and is an important initial step in elucidating the flowering regulatory network of Miscanthus.
keywords:
Feedstock Production;Genomics
published:
2025-10-13
Namoi, Nictor; Jang, Chunhwa; Robins, Zachary; Lin, Cheng-Hsien; Lim, Soo-Hyun; Voigt, Thomas; Lee, DoKyoung
(2025)
Miscanthus × giganteus (Miscanthus) is a warm-season perennial grass grown for bioenergy feedstock production. Nitrogen (N) fertilizer management is crucial for the sustainability of Miscanthus production. In our two-year study (2018 and 2019), we investigated the role of vegetation indices (VIs) in evaluating N fertilization (0 N, 56 N, 112 N, and 168 N kg ha−1) impacts on Miscanthus biomass yield and stand health. The flight campaigns were conducted early, middle, and late during the summer growing season. Among the VIs, mid-summer growing season NDRE provided the best prediction of fresh biomass (R2 = 0.87 and 0.97) and dry biomass (R2 = 0.89 and 0.97) in 2018 and 2019, respectively. The VIs generally showed that it was possible to distinguish between 0 N and 168 N treatments, but neither 0 N and 56 N kg ha−1 nor 112 N and 168 N kg ha−1 could be separated. The results from this study highlight the importance of moderate application of N (112 kg N ha−1) in improving and maintaining the stand health and biomass yield of Miscanthus over time and suggest that mid-summer growing season VIs, NDRE in particular, can be useful for assessment of Miscanthus stand health and biomass yield.
keywords:
Feedstock Production;Biomass Analytics;Field Data
published:
2025-05-29
Ruess, P.J.; Hanley, Jackie; Konar, Megan
(2025)
These data support Ruess et al. (2025) "Drought impacts to water footprints and virtual water transfers of counties of the United States", Water Resources Research, 61, e2024WR037715, https://doi.org/10.1029/2024WR037715.
The dataset contains estimates for Virtual Water Content (VWC) and Virtual Water Trade (VWT) for nine unique combinations of three crop categories (cereal grains, produce, and animal feed) and three water sources (surface water withdrawals, groundwater withdrawals, and groundwater depletion) for the years 2012 and 2017 within the Continental United States. The VWC is calculated by dividing irrigation withdrawal estimates (m3) by the production (tons) at the county resolution. The VWT is calculated by multiplying the VWC by the estimated county-level food flows (tons) from Karakoc et al. (2022). All VWC estimates are provided at the county resolution according to county GEOID and are given in units of m3/ton. All VWT estimates are given in pairs of origin and destination GEOIDs and provided in units of m3.
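The two definitions above amount to one division and one multiplication per county and flow. A minimal sketch with invented numbers follows. The function names are hypothetical, and applying the origin county's VWC to each outgoing flow is an assumption about how the pairing works, not something stated in the description.

```python
def virtual_water_content(withdrawal_m3, production_tons):
    """VWC in m3/ton: irrigation withdrawal divided by production."""
    if production_tons <= 0:
        raise ValueError("production must be positive")
    return withdrawal_m3 / production_tons


def virtual_water_trade(vwc_m3_per_ton, flow_tons):
    """VWT in m3: VWC times the food flow (origin-county VWC is assumed)."""
    return vwc_m3_per_ton * flow_tons


# Hypothetical numbers for one origin county and one destination flow.
vwc = virtual_water_content(withdrawal_m3=1_000_000, production_tons=5_000)
vwt = virtual_water_trade(vwc, flow_tons=250)
print(vwc, vwt)
```

The units carry through as expected: m3 divided by tons gives m3/ton, and multiplying by a flow in tons returns m3.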
When using, please cite as:
Ruess, P.J., Hanley, J., and Konar, M. (2025) "Drought impacts to water footprints and virtual water transfers of counties of the United States", Water Resources Research, 61, e2024WR037715, doi: 10.1029/2024WR037715.
keywords:
irrigation; water footprints; supply chains
published:
2025-10-17
Cao, Mingfeng; Tran, Vinh G.; Qin, Jiansong; Olson, Andrew; Mishra, Shekhar; Schultz, J. Carl; Huang, Chunshuai; Xie, Dongming; Zhao, Huimin
(2025)
The plant-sourced polyketide triacetic acid lactone (TAL) has been recognized as a promising platform chemical for the biorefinery industry. However, its practical application was rather limited due to low natural abundance and inefficient cell factories for biosynthesis. Here, we report the metabolic engineering of oleaginous yeast Rhodotorula toruloides for TAL overproduction. We first introduced a 2-pyrone synthase gene from Gerbera hybrida (GhPS) into R. toruloides and investigated the effects of different carbon sources on TAL production. We then systematically employed a variety of metabolic engineering strategies to increase the flux of acetyl-CoA by enhancing its biosynthetic pathways and disrupting its competing pathways. We found that overexpression of ATP-citrate lyase (ACL1) improved TAL production by 45% compared to the GhPS overexpressing strain, and additional overexpression of acetyl-CoA carboxylase (ACC1) further increased TAL production by 29%. Finally, we characterized the resulting strain I12-ACL1-ACC1 using fed-batch bioreactor fermentation in glucose or oilcane juice medium with acetate supplementation and achieved a titer of 28 or 23 g/L TAL, respectively. This study demonstrates that R. toruloides is a promising host for the production of TAL and other acetyl-CoA-derived polyketides from low-cost carbon sources.
keywords:
Conversion;Metabolic Engineering
published:
2019-03-25
Clark, Lindsay V.; Dwiyanti, Maria Stefanie; Anzoua, Kossonou G.; Brummer, Joe E.; Ghimire, Bimal Kumar; Głowacka, Katarzyna; Hall, Megan; Heo, Kweon; Jin, Xiaoli; Lipka, Alexander E.; Peng, Junhua; Yamada, Toshihiko; Yoo, Ji Hye; Yu, Chang Yeon; Zhao, Hua; Long, Stephen P.; Sacks, Erik J.
(2019)
This dataset contains genotypic and phenotypic data, R scripts, and the results of analysis pertaining to a multi-location field trial of Miscanthus sinensis. Genome-wide association and genomic prediction were performed for biomass yield and 14 yield-component traits across six field trial locations in Asia and North America, using 46,177 single-nucleotide polymorphism (SNP) markers mined from restriction site-associated DNA sequencing (RAD-seq) and 568 M. sinensis accessions. Genomic regions and candidate genes were identified that can be used for breeding improved varieties of M. sinensis, which in turn will be used to generate new M. × giganteus clones for biomass.
keywords:
miscanthus; genotyping-by-sequencing (GBS); genome-wide association studies (GWAS); genomic selection
published:
2024-08-16
Halligan, Susannah; Schummer, Michael; Fournier, Auriel; Musni, Vergie; Davis, J. Brian; Downs, Cynthia; Lavretsky, Philip
(2024)
Dataset used for the paper entitled "Morphological differences between wild and game-farm Mallards in North America".
Large-scale releases of domesticated, game-farm Mallards to supplement wild populations have resulted in wide-spread introgressive hybridization that changed the genetic constitution of wild populations in eastern North America. The resulting gene flow is well-documented between game-farm and wild Mallards, but the mechanistic consequences from such interactions remain unknown in North America. We provide the first study to characterize and investigate potential differences in morphology between genetically known, wild and game-farm Mallards in North America. We used nine morphological measurements to discriminate between wild and game-farm Mallards with 96% accuracy. Compared to their wild counterparts, game-farm Mallards had longer bodies and tarsi, shorter heads and wings, and shorter, wider, and taller bills. The nail on the end of the bill of game-farm Mallards was longer, and game-farm Mallard bills had a greater lamellae:bill length ratio than wild Mallards. Differences in body morphologies between wild and game-farm Mallards are consistent with an artificial, terrestrial life whereby game-farm Mallards are fed pelleted foods resulting in artificial selection for a more “goose-like” bill. We posit that 1) game-farm Mallards have diverged from their wild ancestral traits of flying and filter feeding towards becoming optimized to run and peck for food; 2) game-farm morphological traits optimized over the last 400 years in domestic environments are likely to be maladaptive in the wild; and 3) the introgression of such traits into wild populations is likely to reduce fitness. Understanding effects of game-farm Mallard introgression requires analysis of various game-farm × wild hybrid generations to determine how domestically-derived traits persist or diminish with each generation.
keywords:
Mallard; Game Farm; Morphology; Waterfowl; Duck
published:
2025-09-29
Li, Shuai; Moller, Christopher; Mitchell, Noah G.; Lee, DoKyoung; Ainsworth, Elizabeth
(2025)
Elevated tropospheric ozone concentration (O3) significantly reduces photosynthesis and productivity in several C4 crops including maize, switchgrass and sugarcane. However, it is unknown how O3 affects plant growth, development and productivity in sorghum (Sorghum bicolor L.), an emerging C4 bioenergy crop. Here, we investigated the effects of elevated O3 on photosynthesis, biomass and nutrient composition of a number of sorghum genotypes over two seasons in the field using free-air concentration enrichment (FACE), and in growth chambers. We also tested if elevated O3 altered the relationship between stomatal conductance and environmental conditions using two common stomatal conductance models. Sorghum genotypes showed significant variability in plant functional traits, including photosynthetic capacity, leaf N content and specific leaf area, but responded similarly to O3. At the FACE experiment, elevated O3 did not alter net CO2 assimilation (A), stomatal conductance (gs), stomatal sensitivity to the environment, chlorophyll fluorescence or plant biomass, but led to reductions in the maximum carboxylation capacity of phosphoenolpyruvate and increased stomatal limitation to A in both years. These findings suggest that bioenergy sorghum is tolerant to O3 and could be used to enhance biomass productivity in O3-polluted regions.
keywords:
Feedstock Production; Sustainability; Field Data
published:
2024-01-01
Supplementary data tables for the dissertation "Hybridization dynamics and population genomics of a Manacus hybrid zone." This work focuses on the dynamics of hybridization over time in two species of tropical birds, the golden-collared manakin (Manacus vitellinus) and white-collared manakin (Manacus candei), comparing data from historical museum samples and contemporary wild-caught birds. Table A1 contains the sample metadata for the Manacus restriction site-associated DNA sequencing dataset used in the dissertation, with associated NCBI Biosample Accession numbers, Smithsonian Museum of Natural History number (where applicable), sample IDs, sampling site locations, and sample information (year the sample was taken, age, and sex). Table A6 contains phenotypic measurements of male plumage traits of manakins used in cline analyses to assess hybrid zone movement over time in historical and contemporary datasets, including beard length (mm), epaulet width (mm), tail length (mm), collar color (nm), and belly color (nm). Table A7 contains a summary of male plumage measurements across the hybrid zone. Table C1 contains a list of annotated protein coding genes in candidate regions of interest in Manacus genomes, identified using outlier regions of genomic divergence, linkage disequilibrium, and enrichment of parental private alleles.
keywords:
csv; manacus; manakin; genomics; dissertation
published:
2025-04-05
Meem, Tasneem Haq; Rhoads, Bruce; Lewis, Quinn; Umar, Muhammad; Sukhodolov, Alex
(2025)
This data set includes information on mixing metric values and distances to determine the average length scale, rates and variability of mixing downstream of 43 river confluences for 150 mixing events. The file "pmx_all data.csv" contains confluence names, the number of events per confluence site, and Pmx values measured at various actual and dimensionless downstream distances. The file "pmx_binned data.csv" provides mean Pmx values within 0.5-unit dimensionless distance bins.
keywords:
river; mixing; confluences; remote sensing
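For the river confluence mixing entry above, the 0.5-unit binning behind "pmx_binned data.csv" can be illustrated with a minimal sketch. This is not the authors' procedure: the record layout (pairs of dimensionless distance and Pmx), the half-open bin convention [lower, upper), and the sample values are all assumptions made only for illustration.

```python
import math
from collections import defaultdict


def bin_mean_pmx(records, bin_width=0.5):
    """Average Pmx within fixed-width dimensionless-distance bins.

    records: iterable of (dimensionless_distance, pmx) pairs
             (hypothetical layout, not the dataset's actual columns).
    Returns a dict mapping each bin's lower edge to the mean Pmx in that bin.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, pmx in records:
        # Assumed convention: a value falls in the bin [k*w, (k+1)*w).
        index = math.floor(distance / bin_width)
        sums[index] += pmx
        counts[index] += 1
    return {index * bin_width: sums[index] / counts[index]
            for index in sorted(sums)}


# Made-up values, for illustration only.
sample = [(0.1, 0.2), (0.4, 0.4), (0.6, 0.5), (1.2, 0.9)]
binned = bin_mean_pmx(sample)
for lower_edge, mean_pmx in binned.items():
    print(f"{lower_edge:.1f}-{lower_edge + 0.5:.1f}: {mean_pmx:.2f}")
```

Keying the result by each bin's lower edge keeps the output easy to compare against a binned table, but the dataset itself may label bins differently (for example by midpoint).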
published:
2025-10-27
Jindra, Michael A.; Choe, Kisurb; Chowdhury, Ratul; Kong, Ryan; Ghaffari, Soodabeh; Sweedler, Jonathan; Pfleger, Brian
(2025)
The dominant strategy for tailoring the chain-length distribution of free fatty acids (FFA) synthesized by heterologous hosts is expression of a selective acyl-acyl carrier protein (ACP) thioesterase. However, few of these enzymes can generate a precise (greater than 90% of a desired chain-length) product distribution when expressed in a microbial or plant host. The presence of alternative chain-lengths can complicate purification in situations where blends of fatty acids are not desired. We report the assessment of several strategies for improving the dodecanoyl-ACP thioesterase from the California bay laurel to exhibit more selective production of medium-chain free fatty acids to near exclusivity. We demonstrated that matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-ToF MS) was an effective library screening technique for identification of thioesterase variants with favorable shifts in chain-length specificity. This strategy proved to be a more effective screening technique than several rational approaches discussed herein. With these data, we isolated four thioesterase variants which exhibited a more selective FFA distribution over wildtype when expressed in the fatty acid accumulating E. coli strain, RL08. We then combined mutations from the MALDI isolates to generate BTE-MMD19, a thioesterase variant capable of producing free fatty acids consisting of 90% C12 products. Of the four mutations which conferred a specificity shift, we noted that three affected the shape of the binding pocket, while one occurred on the positively charged acyl carrier protein landing pad. Finally, we fused the maltose binding protein (MBP) from E. coli to the N-terminus of BTE-MMD19 to improve enzyme solubility and achieve a titer of 1.9 g per L of twelve-carbon fatty acids in a shake flask.
keywords:
Conversion; Genomics