Illinois Data Bank Dataset Search Results
published:
2025-10-01
Wang, Yajie; Huang, Xiaoqiang; Hui, Jingshu; Vo, Lam Tung; Zhao, Huimin
(2025)
There is a growing interest in developing cooperative chemoenzymatic reactions to harness the reactivity of chemical catalysts and the selectivity of enzymes for the synthesis of nonracemic chiral compounds. However, existing chemoenzymatic systems with more than one chemical reaction and one enzymatic reaction working cooperatively are rare. Moreover, the application of oxidoreductases in cooperative chemoenzymatic reactions is limited by the necessity of using expensive and unstable redox equivalents such as nicotinamide cofactors. Here, we report a light-driven cooperative chemoenzymatic system comprised of a photoinduced electron transfer reaction (PET) and a photosensitized energy transfer reaction (PEnT) with an enzymatic reduction in one-pot to synthesize chiral building blocks of bioactive compounds. As a proof of concept, ene-reductase was directly regenerated by PET in the absence of external cofactors. Meanwhile, enzymatic reduction worked cooperatively with photocatalyst-catalyzed energy transfer that continuously replenished the reactive isomer from the less reactive one. The whole system stereoconvergently reduced E/Z mixtures of alkenes to the enantiopure products. Additionally, enantioselective enzymatic reduction worked competitively with photocatalyst-catalyzed racemic background reaction and side reactions to channel the overall electron flow to the single enantiopure product. Such a light-driven cooperative chemoenzymatic system holds great potential for asymmetric synthesis using inexpensive petroleum or biomass-derived alkenes.
keywords:
Conversion;Catalysis
published:
2025-09-29
Wang, Sheng; Guan, Kaiyu; Wang, Zhihui; Ainsworth, Elizabeth; Zheng, Ting; Townsend, Philip; Li, Kaiyuan; Moller, Christopher; Wu, Genghong; Jiang, Chongya
(2025)
The photosynthetic capacity or the CO2-saturated photosynthetic rate (Vmax), chlorophyll, and nitrogen are closely linked leaf traits that determine C4 crop photosynthesis and yield. Accurate, timely, rapid, and non-destructive approaches to predict leaf photosynthetic traits from hyperspectral reflectance are urgently needed for high-throughput crop monitoring to ensure food and bioenergy security. Therefore, this study thoroughly evaluated the state-of-the-art physically based radiative transfer models (RTMs), data-driven partial least squares regression (PLSR), and generalized PLSR (gPLSR) models to estimate leaf traits from leaf-clip hyperspectral reflectance, which was collected from maize (Zea mays L.) bioenergy plots with diverse genotypes, growth stages, treatments with nitrogen fertilizers, and ozone stresses in three growing seasons. The results show that leaf RTMs considering bidirectional effects can give accurate estimates of chlorophyll content (Pearson correlation r=0.95), while gPLSR enabled retrieval of leaf nitrogen concentration (r=0.85). Using PLSR with field measurements for training, the cross-validation indicates that Vmax can be well predicted from spectra (r=0.81). The integration of chlorophyll content (strongly related to visible spectra) and nitrogen concentration (linked to shortwave infrared signals) can provide better predictions of Vmax (r=0.71) than only using either chlorophyll or nitrogen individually. This study highlights that leaf chlorophyll content and nitrogen concentration have key and unique contributions to Vmax prediction.
keywords:
Feedstock Production;Sustainability;Biomass Analytics;Modeling
published:
2025-09-18
Jagtap, Sujit; Bedekar, Ashwini; Liu, Jing-Jing; Jin, Yong-Su; Rao, Christopher V.
(2025)
Sugar alcohols are commonly used as low-calorie sweeteners and can serve as potential building blocks for bio-based chemicals. Previous work has shown that the oleaginous yeast Rhodosporidium toruloides IFO0880 can natively produce arabitol from xylose at relatively high titers, suggesting that it may be a useful host for sugar alcohol production. In this work, we explored whether R. toruloides can produce additional sugar alcohols. Rhodosporidium toruloides is able to produce galactitol from galactose. During growth in nitrogen-rich medium, R. toruloides produced 3.2 ± 0.6 g/L and 8.4 ± 0.8 g/L galactitol from 20 and 40 g/L galactose, respectively. In addition, R. toruloides was able to produce galactitol from galactose at reduced titers during growth in nitrogen-poor medium, which also induces lipid production. These results suggest that R. toruloides can potentially be used for the co-production of lipids and galactitol from galactose. We further characterized the mechanism for galactitol production, including identifying and biochemically characterizing the critical aldose reductase. Intracellular metabolite analysis was also performed to further understand galactose metabolism. Rhodosporidium toruloides has traditionally been used for the production of lipids and lipid-based chemicals. Our work demonstrates that R. toruloides can also produce galactitol, which can be used to produce polymers with applications in medicine and as a precursor for anti-cancer drugs. Collectively, our results further establish that R. toruloides can produce multiple value-added chemicals from a wide range of sugars.
keywords:
Conversion;Genomics;Metabolomics
published:
2025-12-15
Xiao, Tianxia; Khan, Artem; Shen, Yihui; Chen, Li; Rabinowitz, Joshua
(2025)
Ethanol and lactate are typical waste products of glucose fermentation. In mammals, glucose is catabolized by glycolysis into circulating lactate, which is broadly used throughout the body as a carbohydrate fuel. Individual cells can both uptake and excrete lactate, uncoupling glycolysis from glucose oxidation. Here we show that similar uncoupling occurs in budding yeast batch cultures of Saccharomyces cerevisiae and Issatchenkia orientalis. Even in fermenting S. cerevisiae that is net releasing ethanol, media 13C-ethanol rapidly enters and is oxidized to acetaldehyde and acetyl-CoA. This is evident in exogenous ethanol being a major source of both cytosolic and mitochondrial acetyl units. 2H-tracing reveals that ethanol is also a major source of both NADH and NADPH high-energy electrons, and this role is augmented under oxidative stress conditions. Thus, uncoupling of glycolysis from the oxidation of glucose-derived carbon via rapidly reversible reactions is a conserved feature of eukaryotic metabolism.
keywords:
Conversion;Metabolomics
published:
2025-10-01
Dai, Tao; Ellebracht, Nathan; Hunter Sellars, Elwin; Aui, Alvina; Hanna, Goldstein; Li, Wenqin; Hellwinckel, Chad; Price, Lydia; Wong, Andrew; Nico, Peter; Basso, Bruno; Robertson, G Philip; Pett-Ridge, Jennifer; Langholtz, Matthew; Baker, Sarah; Pang, Simon; Scown, Corinne
(2025)
Gigatonne-scale atmospheric carbon dioxide removal (CDR), alongside deep emission cuts, is critical to stabilizing the climate. However, some of the most scalable CDR technologies are also the most land intensive. Here, we examine whether adequate land resources exist in the contiguous United States to meet CDR targets when prioritizing grid emissions reduction, food production, and the protection of sensitive ecosystems. We focus on biomass carbon removal and storage (BiCRS) and direct air capture and storage (DACS) and show that suitable lands exceed the expected needs: 37.6 million hectares of land are available for BiCRS, resulting in 0.26 GtCO2 of CDR/year, and 34 million hectares are suitable for wind- and solar-powered DACS, resulting in 4.8 GtCO2 of CDR/year if facilities are co-located with geologic CO2 storage. We identify biomass and energy supply hotspots to meet CDR targets while ensuring land protection and minimizing land competition.
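As a back-of-envelope reading of the figures above, the following Python sketch converts the quoted land areas and removal totals into an average removal intensity per hectare. It assumes the quoted removal totals apply to the full quoted land areas, which the abstract does not state explicitly; the function name `tonnes_per_hectare` is hypothetical.

```python
# Figures quoted in the abstract above.
bicrs_land_mha = 37.6   # million hectares available for BiCRS
bicrs_cdr_gt = 0.26     # GtCO2 removed per year via BiCRS
dacs_land_mha = 34.0    # million hectares suitable for wind- and solar-powered DACS
dacs_cdr_gt = 4.8       # GtCO2 removed per year via DACS


def tonnes_per_hectare(cdr_gt, land_mha):
    """Convert GtCO2/year spread over million hectares to tCO2 per hectare per year.

    Assumption: the removal total is spread evenly over the whole area.
    """
    return (cdr_gt * 1e9) / (land_mha * 1e6)


bicrs_intensity = tonnes_per_hectare(bicrs_cdr_gt, bicrs_land_mha)
dacs_intensity = tonnes_per_hectare(dacs_cdr_gt, dacs_land_mha)

print(f"BiCRS: {bicrs_intensity:.1f} tCO2/ha/yr")
print(f"DACS:  {dacs_intensity:.1f} tCO2/ha/yr")
```

Under that even-spread assumption, DACS comes out roughly twenty times more removal-dense per hectare than BiCRS, which is one way to read the contrast between the two land estimates.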
keywords:
carbon; geospatial
published:
2021-05-07
The dataset is based on a snapshot of PubMed taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018), and for ORCIDs, primarily, the 2019 ORCID Public Data File https://orcid.org/.
Matching an ORCID to an individual author name on a PMID is a non-trivial process. Anyone can create an ORCID and claim to have contributed to any published work. Many records claim too many articles and most claim too few. Even though ORCID records are (most?) often populated by author name searches in popular bibliographic databases, there is no confirmation that the person's name is listed on the article. This dataset is the product of mapping ORCIDs to individual author names on PMIDs, even when the ORCID name does not match any author name on the PMID, and when there are multiple (good) candidate author names. The algorithm avoids assigning the ORCID to an article when there are no good candidates and when there are multiple equally good matches. For some ORCIDs that clearly claim too much, it triggers a very strict matching procedure (for ORCIDs that claim too much but the majority appear correct, e.g., 0000-0002-2788-5457), and sometimes deletes ORCIDs altogether when all (or nearly all) of its claimed PMIDs appear incorrect. When an individual clearly has multiple ORCIDs it deletes the least complete of them (e.g., 0000-0002-1651-2428 vs 0000-0001-6258-4628). It should be noted that the ORCIDs that claim too much are not necessarily due to nefarious or trolling intentions, even though a few appear so. Certainly many are due to laziness, such as claiming everything with a particular last name. Some cases appear to be due to test engineers (e.g., 0000-0001-7243-8157; 0000-0002-1595-6203), or librarians assisting faculty (e.g., ; 0000-0003-3289-5681), or group/laboratory IDs (0000-0003-4234-1746), or having contributed to an article in capacities other than authorship such as an Investigator, an Editor, or part of a Collective (e.g., 0000-0003-2125-4256 as part of the FlyBase Consortium on PMID 22127867), or as a "Reply To" in which case the identity of the article and authors might be conflated.
The NLM has, in the past, limited the total number of authors indexed too. The dataset certainly has errors but I have taken great care to fix some glaring ones (individuals who claim too much), while still capturing authors who have published under multiple names and not explicitly listed them in their ORCID profile. The final dataset provides a "matchscore" that could be used for further clean-up.
Four files:
person.tsv: 7,194,692 rows, including header
1. orcid
2. lastname
3. firstname
4. creditname
5. othernames
6. otherids
7. emails
employment.tsv: 2,884,981 rows, including header
1. orcid
2. putcode
3. role
4. start-date
5. end-date
6. id
7. source
8. dept
9. name
10. city
11. region
12. country
13. affiliation
education.tsv: 3,202,253 rows, including header
1. orcid
2. putcode
3. role
4. start-date
5. end-date
6. id
7. source
8. dept
9. name
10. city
11. region
12. country
13. affiliation
pubmed2orcid.tsv: 13,133,065 rows, including header
1. PMID
2. au_order (author name position on the article)
3. orcid
4. matchscore (see below)
5. source: orcid (2019 ORCID Public Data File https://orcid.org/), pubmed (NLM's distributed XML files), or patci (an earlier version of ORCID with citations processed through the Patci tool)
12,037,375 from orcid; 1,065,892 from PubMed XML; 29,797 from Patci
matchscore:
000: lastname, firstname and middle init match (e.g., Eric T MacKenzie vs
00: lastname, firstname match (e.g., Keith Ward)
0: lastname, firstname reversed match (e.g., Conde Santiago vs Santiago Conde)
1: lastname, first and middle init match (e.g., L. F. Panchenko)
11: lastname and partial firstname match (e.g., Mike Boland vs Michael Boland or Mel Ziman vs Melanie Ziman)
12: lastname and first init match
15: 3 part lastname and firstname match (David Grahame Hardie vs D Grahame Hardie)
2: lastname match and multipart firstname initial match (e.g., Maria Dolores Suarez Ortega vs M. D. Suarez)
22: partial lastname match and firstname match (e.g., Erika Friedmann vs Erika Friedman)
23: e.g., Antonio Garcia Garcia vs A G Garcia
25: Allan Downie vs J A Downie
26: Oliver Racz vs Oliver Bacz
27: Rita Ostrovskaya vs R U Ostrovskaia
29: Andrew Staehelin vs L A Staehlin
3: M Tronko vs N D Tron'ko
4: Sharon Dent (Also known as Sharon Y.R. Dent; Sharon Y Roth; Sharon Yoder) vs Sharon Yoder
45: Okulov Aleksei vs A B Okulov
48: Maria Del Rosario Garcia De Vicuna Pinedo vs R Garcia-Vicuna
49: Anatoliy Ivashchenko vs A Ivashenko
5: lastname match only (weak match but sometimes captures alternative first name for better subsequent matches); e.g., Bill Hieb vs W F Hieb
6: first name match only (weak match but sometimes captures alternative first name for better subsequent matches); e.g., Maria Borawska vs Maria Koscielak
7: last or first name match on "other names"; e.g., Hromokovska Tetiana (Also known as Gromokovskaia, T. S., Громоковська Тетяна) vs T Gromokovskaia
77: Siva Subramanian vs Kolinjavadi N. Sivasubramanian
88: no name in orcid but match caught by uniqueness of name across paper (at least 90% and 2 more than next most common name)
prefix:
C = ambiguity reduced (possibly eliminated) using city match (e.g., H Yang on PMID 24972200)
I = ambiguity eliminated by excluding investigators (i.e., one author and one or more investigators with that name)
T = ambiguity eliminated using PubMed pos (T for tie-breaker)
W = ambiguity resolved by authority2018
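As a rough illustration of how the matchscore column might be used for further clean-up, the Python sketch below splits a matchscore into an optional prefix letter and its code, keeps only rows with a chosen set of codes, and checks the per-source counts against the stated row total. It assumes that a prefix (C, I, T, or W) is prepended directly to the code (e.g., `C12`) and that the header names in pubmed2orcid.tsv match the column list above; neither is stated explicitly in the description. The names `split_matchscore`, `strong_rows`, and `STRONG_CODES` are hypothetical, and the sample rows are made up.

```python
import csv
import io

PREFIXES = {"C", "I", "T", "W"}  # prefix letters listed in the description


def split_matchscore(raw):
    """Split a matchscore into (prefix, code).

    Assumption: a prefix letter, when present, is the first character.
    The code stays a string because "000", "00", and "0" are distinct
    categories that would collapse into one if parsed as integers.
    """
    raw = raw.strip()
    if raw and raw[0] in PREFIXES:
        return raw[0], raw[1:]
    return "", raw


# Hypothetical choice: treat only the full-name match codes as "strong".
STRONG_CODES = {"000", "00", "0", "1"}


def strong_rows(tsv_text):
    """Yield rows of pubmed2orcid-style TSV text whose code is in STRONG_CODES."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        _, code = split_matchscore(row["matchscore"])
        if code in STRONG_CODES:
            yield row


# Made-up sample rows, only to exercise the code (not taken from the dataset).
sample = (
    "PMID\tau_order\torcid\tmatchscore\tsource\n"
    "1\t1\t0000-0000-0000-0001\t000\torcid\n"
    "2\t3\t0000-0000-0000-0002\tC12\tpubmed\n"
    "3\t2\t0000-0000-0000-0003\tT00\torcid\n"
)
kept = [row["PMID"] for row in strong_rows(sample)]

# Consistency check on the stated counts (reading the PubMed XML count as
# 1,065,892): the three sources should sum to the row total minus the header.
by_source = {"orcid": 12_037_375, "pubmed": 1_065_892, "patci": 29_797}
total_data_rows = 13_133_065 - 1
counts_agree = sum(by_source.values()) == total_data_rows
```

When loading the real file with a dataframe library, the same caution applies: the matchscore column should be read as text rather than as a number.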
published:
2025-10-01
Lyu, Mingkuan; Kong, Linggen; Yang, Zhenglin; Wu, Yuting; McGhee, Claire E.; Lu, Yi
(2025)
DNAzymes have been widely used in many sensing and imaging applications but have rarely been used for genetic engineering since their discovery in 1994, because their substrate scope is mostly limited to single-stranded DNA or RNA, whereas genetic information is stored mostly in double-stranded DNA (dsDNA). To overcome this major limitation, we herein report peptide nucleic acid (PNA)-assisted double-stranded DNA nicking by DNAzymes (PANDA) as the first example to expand DNAzyme activity toward dsDNA. We show that PANDA is programmable in efficiently nicking or causing double strand breaks on target dsDNA, which mimics protein nucleases and can act as restriction enzymes in molecular cloning. In addition to being much smaller than protein enzymes, PANDA has a higher sequence fidelity compared with CRISPR/Cas under the condition we tested, demonstrating its potential as a novel alternative tool for genetic engineering and other biochemical applications.
keywords:
Conversion;Genomics;Genome Engineering
published:
2025-12-01
Mori, Jameson; Zilinger, Amber; Neumann, Julia; Pentrak, Martin; Paton, Tim; Novakofski, Jan; Mateus-Pinilla, Nohra
(2025)
This dataset contains measurements for the following soil components from soil samples collected in northern Illinois between 2023 and 2024. Two file formats containing the same data are offered (Excel spreadsheet and CSV):
1. Soil clay minerals (illite, kaolinite, chlorite, and smectite)
2. pH
3. Other soil minerals: aluminum (Al), arsenic (As), barium (Ba), boron aluminide (Bal), calcium (Ca), cadmium (Cd), chloride (Cl), cobalt (Co), chromium (Cr), copper (Cu), iron (Fe), magnesium (Mg), manganese (Mn), mercury (Hg), molybdenum (Mo), niobium (Nb), nickel (Ni), potassium (K), phosphorus (P), lead (Pb), palladium (Pd), rubidium (Rb), silver (Ag), sulfur (S), thorium (Th), titanium (Ti), uranium (U), vanadium (V), yttrium (Y), zinc (Zn), and zirconium (Zr)
Samples were collected on the side of public roads within the right of way. X-ray diffraction was used to quantify soil clay components, while other soil minerals were measured using a Niton XL5 Plus Analyzer. pH was measured using a Yinmik YK-S01 Digital Soil pH Tester. Samples were collected as part of a project funded by the United States Department of Agriculture Animal and Plant Health Inspection Service (USDA-APHIS) to examine the role of soil characteristics in chronic wasting disease (CWD) persistence in northern Illinois, USA.
keywords:
CWD; chronic wasting disease; soil; clay; pH; mineral; environmental transmission; X-ray diffraction
published:
2025-12-08
Maitra, Shraddha; Viswanathan, Mothi Bharath; Park, Kiyoul; Kannan, Baskaran; Cano Alfanar, Sofia; McCoy, Scott M.; Cahoon, Edgar; Altpeter, Fredy; Leakey, Andrew; Singh, Vijay
(2025)
Plant oils are increasingly in demand as renewable feedstocks for biodiesel and biochemicals. Currently, oilseeds are the primary source of plant oils. Although the vegetative tissues of plants express lipid metabolism pathways, they do not hyper-accumulate lipids. Elevated synthesis, storage, and accumulation of lipids in vegetative tissues have been achieved by metabolic engineering of sugarcane to produce “oilcane.” This study evaluates the potential of oilcane as a renewable feedstock for the co-production of lipids and fermentable sugars. Oilcane was grown under favorable climatic and field conditions in Florida (FLOC) as well as during an abbreviated growing season, outside its typical growing region, in Illinois (ILOC). The potential lipid yield of 0.35 tons/ha was projected from the hyperaccumulation of fatty acids in the stored vegetative biomass of FLOC, which is approaching the lipid yield of soybean (0.44 tons/ha). Processing of the vegetative tissues of oilcane recovered 0.20 tons/ha, which represents the recovery of 55% of the total lipids from FLOC. Chemical-free hydrothermal bioprocessing of ILOC and FLOC bagasse and leaves at 180 °C for 10 min prevented the degeneration of in situ plant lipids. This allowed the recovery of lipids at the end of the bioprocess with a major fraction of lipids remaining in the biomass residues after pretreatment and saccharification. Improvements through refined biomass processing, crop management, and metabolic engineering are expected to boost lipid yields and make oilcane a prime feedstock for the production of biodiesel.
keywords:
Conversion;Feedstock Production;Feedstock Bioprocessing;Lipidomics;Metabolomics
published:
2022-01-27
Li, Shuai; Moller, Christopher A.; Mitchell, Noah G.; Lee, DoKyoung; Sacks, Erik J.; Ainsworth, Elizabeth A.
(2022)
Twenty-two genotypes of C4 species grown under ambient and elevated O3 concentrations were studied at the SoyFACE facility (40°02’N, 88°14’W) in 2019. This dataset contains leaf morphology, photosynthesis, and nutrient contents measured at three time points. The results of CO2 response curves are also included.
keywords:
C4, O3, photosynthesis
published:
2025-10-17
Cai, Yingqi; Zhai, Zhiyang; Blanford, Jantana; Liu, Hui; Shi, Hai; Schwender, Jorg; Xu, Changcheng; Shanklin, John
(2025)
Storage lipids (mostly triacylglycerols, TAGs) serve as an important energy and carbon reserve in plants, and hyperaccumulation of TAG in vegetative tissues can have negative effects on plant growth. Purple acid phosphatase2 (PAP2) was previously shown to affect carbon metabolism and boost plant growth. However, the effects of PAP2 on lipid metabolism remain unknown. Here, we demonstrated that PAP2 can stimulate a futile cycle of fatty acid (FA) synthesis and degradation, and mitigate negative growth effects associated with high accumulation of TAG in vegetative tissues. Constitutive expression of PAP2 in Arabidopsis thaliana enhanced both lipid synthesis and degradation in leaves and led to a substantial increase in seed oil yield. Suppressing lipid degradation in a PAP2-overexpressing line by disrupting sugar-dependent1 (SDP1), a predominant TAG lipase, significantly elevated vegetative TAG content and improved plant growth. Diverting FAs from membrane lipids to TAGs in PAP2-overexpressing plants by constitutively expressing phospholipid:diacylglycerol acyltransferase1 (PDAT1) greatly increased TAG content in vegetative tissues without compromising biomass yield. These results highlight the potential of combining PAP2 with TAG-promoting factors to enhance carbon assimilation, FA synthesis and allocation to TAGs for optimized plant growth and storage lipid accumulation in vegetative tissues.
keywords:
Feedstock Production;Biomass Analytics;Lipidomics
published:
2025-10-17
Deewan, Anshu; Liu, Jing-Jing; Jagtap, Sujit Sadashiv; Yun, Eun Ju; Walukiewicz, Hanna E.; Jin, Yong-Su; Rao, Christopher V.
(2025)
Oleaginous yeasts have received significant attention due to their substantial lipid storage capability. The accumulated lipids can be utilized directly or processed into various bioproducts and biofuels. Lipomyces starkeyi is an oleaginous yeast capable of using multiple plant-based sugars, such as glucose, xylose, and cellobiose. It is, however, a relatively unexplored yeast due to limited knowledge about its physiology. In this study, we have evaluated the growth of L. starkeyi on different sugars and performed transcriptomic and metabolomic analyses to understand the underlying mechanisms of sugar metabolism. Principal component analysis showed clear differences resulting from growth on different sugars. We have further reported various metabolic pathways activated during growth on these sugars. We also observed non-specific regulation in L. starkeyi and have updated the gene annotations for the NRRL Y-11557 strain. This analysis provides a foundation for understanding the metabolism of these plant-based sugars and potentially valuable information to guide the metabolic engineering of L. starkeyi to produce bioproducts and biofuels.
keywords:
Conversion;Metabolomics;Transcriptomics
published:
2019-08-29
Nardulli, Peter; Peyton, Buddy; Bajjalieh, Joseph; Singh, Ajay; Martin, Michael; Shalmon, Dan; Althaus, Scott
(2019)
This is part of the Cline Center’s ongoing Social, Political and Economic Event Database (SPEED) project. Each observation represents an event involving civil unrest, repression, or political violence in Sierra Leone, Liberia, and the Philippines (1979-2009). These data were produced in an effort to describe the relationship between exploitation of natural resources and civil conflict, and to identify policy interventions that might address resource-related grievances and mitigate civil strife.
This work is the result of a collaboration between the US Army Corps of Engineers’ Construction Engineering Research Laboratory (ERDC-CERL), the Swedish Defence Research Agency (FOI) and the Cline Center for Advanced Social Research (CCASR). The project team selected case studies focused on nations with a long history of civil conflict, as well as lucrative natural resources.
The Cline Center extracted these events from country-specific articles published in English by the British Broadcasting Corporation (BBC) Summary of World Broadcasts (SWB) from 1979-2008 and the CIA’s Foreign Broadcast Information Service (FBIS) 1999-2004. Articles were selected if they mentioned a country of interest, and were tagged as relevant by a Cline Center-built machine learning-based classification algorithm. Trained analysts extracted nearly 10,000 events from nearly 5,000 documents. The codebook—available in PDF form below—describes the data and production process in greater detail.
keywords:
Cline Center for Advanced Social Research; civil unrest; Social Political Economic Event Dataset (SPEED); political; event data; war; conflict; protest; violence; social; SPEED; Cline Center; Political Science
published:
2022-01-20
This dataset provides a 50-state (and DC) survey of state-level tax credits modeled after the federal New Markets Tax Credit program, including summaries of the tax credit amount and credit periods, key definitions, eligibility criteria, application process, and degree of conformity to federal law.
keywords:
New Markets Tax Credits; NMTC; tax incentives; state law
published:
2025-09-18
Saifuddin, Mustafa; Bhatnagar, Jennifer; Segrè, Daniel; Finzi, Adrien C.
(2025)
Respiration by soil bacteria and fungi is one of the largest fluxes of carbon (C) from the land surface. Although this flux is a direct product of microbial metabolism, controls over metabolism and their responses to global change are a major uncertainty in the global C cycle. Here, we explore an in silico approach to predict bacterial C-use efficiency (CUE) for over 200 species using genome-specific constraint-based metabolic modeling. We find that potential CUE averages 0.62 ± 0.17 with a range of 0.22 to 0.98 across taxa and phylogenetic structuring at the subphylum levels. Potential CUE is negatively correlated with genome size, while taxa with larger genomes are able to access a wider variety of C substrates. Incorporating the range of CUE values reported here into a next-generation model of soil biogeochemistry suggests that these differences in physiology across microbial taxa can feed back on soil-C cycling.
keywords:
Sustainability;Metabolomics;Modeling
published:
2025-08-21
Lu, Yi; Sweedler, Jonathan; Zhou, Shuaizhen; Zhou, Yu
(2025)
Engineering efficient biocatalysts is essential for metabolic engineering to produce valuable bioproducts from renewable resources. However, due to the complexity of cellular metabolic networks, it is challenging to translate success in vitro into high performance in cells. To meet such a challenge, an accurate and efficient quantification method is necessary to screen a large set of mutants from complex cell culture and a careful correlation between the catalysis parameters in vitro and performance in cells is required. In this study, we employed a mass-spectrometry based high-throughput quantitative method to screen new mutants of 2-pyrone synthase (2PS) for triacetic acid lactone (TAL) biosynthesis through directed evolution in E. coli. From the process, we discovered two mutants with the highest improvement (46 fold) in titer and the fastest kcat (44 fold) over the wild type 2PS, respectively, among those reported in the literature. A careful examination of the correlation between intracellular substrate concentration, Michaelis-Menten parameters and TAL titer for these two mutants reveals that a fast reaction rate under limiting intracellular substrate concentrations is important for in-cell biocatalysis. Such properties can be tuned by protein engineering and synthetic biology to adopt these engineered proteins for the maximum activities in different intracellular environments.
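The last point can be illustrated with the standard Michaelis-Menten rate law, v = kcat·[E]·[S] / (Km + [S]). The Python sketch below compares two hypothetical enzyme variants at a limiting and a near-saturating substrate concentration. All parameter values are invented for illustration and are not measurements of 2PS or its mutants; the function name `mm_rate` is likewise hypothetical.

```python
def mm_rate(kcat, km, enzyme, substrate):
    """Michaelis-Menten rate: v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


# Hypothetical variants (illustrative numbers, arbitrary units).
variants = {
    "A": {"kcat": 10.0, "km": 100.0},  # fast turnover, weak substrate binding
    "B": {"kcat": 2.0, "km": 2.0},     # slower turnover, tight substrate binding
}

enzyme = 1.0                  # same enzyme concentration for both variants
low_s, high_s = 1.0, 1000.0   # limiting vs. near-saturating substrate

# For each variant: (rate at limiting substrate, rate near saturation).
rates = {
    name: (
        mm_rate(p["kcat"], p["km"], enzyme, low_s),
        mm_rate(p["kcat"], p["km"], enzyme, high_s),
    )
    for name, p in variants.items()
}

for name, (v_low, v_high) in rates.items():
    print(f"variant {name}: v(low S) = {v_low:.3f}, v(high S) = {v_high:.3f}")
```

With these made-up numbers, variant A is faster near saturation because of its higher kcat, while variant B is faster when substrate is limiting because of its higher kcat/Km. This is the kind of trade-off that can separate in vitro rankings from in-cell performance.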
keywords:
catalysis; mass spectrometry; metabolic engineering
published:
2025-10-13
Schultz, J. Carl; Mishra, Shekhar; Gaither, Emily; Mejia, Andrea; Dinh, Hoang V.; Maranas, Costas D.; Zhao, Huimin
(2025)
The oleaginous, carotenogenic yeast Rhodotorula toruloides has been increasingly explored as a platform organism for the production of terpenoids and fatty acid derivatives. Fatty alcohols, a fatty acid derivative widely used in the production of detergents and surfactants, can be produced microbially with the expression of a heterologous fatty acyl-CoA reductase. Due to its high lipid production, R. toruloides has high potential for fatty alcohol production, and in this study several metabolic engineering approaches were investigated to improve the titer of this product. Fatty acyl-CoA reductase from Marinobacter aqueolei was co-expressed with SpCas9 in R. toruloides IFO0880 and a panel of gene overexpressions and Cas9-mediated gene deletions were explored to increase the fatty alcohol production. Two overexpression targets (ACL1 and ACC1, improving cytosolic acetyl-CoA and malonyl-CoA production, respectively) and two deletion targets (the acyltransferases DGA1 and LRO1) resulted in significant (1.8 to 4.4-fold) increases to the fatty alcohol titer in culture tubes. Combinatorial exploration of these modifications in bioreactor fermentation culminated in a 3.7 g/L fatty alcohol titer in the LRO1Δ mutant. As LRO1 deletion was not found to be beneficial for fatty alcohol production in other yeasts, a lipidomic comparison of the DGA1 and LRO1 knockout mutants was performed, finding that DGA1 is the primary acyltransferase responsible for triacylglyceride production in R. toruloides, while LRO1 disruption simultaneously improved fatty alcohol production, increased diacylglyceride and triacylglyceride production, and increased glucose consumption. The fatty alcohol titer of fatty acyl-CoA reductase-expressing R. toruloides was significantly improved through the deletion of LRO1, or the deletion of DGA1 combined with overexpression of ACC1 and ACL1. 
Disruption of LRO1 surprisingly increased both lipid and fatty alcohol production, creating a possible avenue for future study of the lipid metabolism of this yeast.
keywords:
Conversion;Genome Engineering;Genomics
published:
2025-11-07
Lee, Ye-Gi; Kang, Nam Kyu; Kim, Chanwoo; Tran, Vinh; Cao, Mingfeng; Yoshikuni, Yasuo; Zhao, Huimin; Jin, Yong-Su
(2025)
This study presents a cost-effective strategy for producing organic acids from glucose and xylose using the acid-tolerant yeast, Issatchenkia orientalis. I. orientalis was engineered to produce lactic acid from xylose, and the resulting strain, SD108XL, successfully converted sorghum hydrolysates into lactic acid. In order to enable low-pH fermentation, a self-buffering strategy, where the lactic acid generated by the SD108XL strain during fermentation served as a buffer, was developed. As a result, the SD108 strain produced 67 g/L of lactic acid from 73 g/L of glucose and 40 g/L of xylose, simulating a sugar composition of sorghum biomass hydrolysates. Moreover, techno-economic analysis underscored the efficiency of the self-buffering strategy in streamlining the downstream process, thereby reducing production costs. These results demonstrate the potential of I. orientalis as a platform strain for the cost-effective production of organic acids from cellulosic hydrolysates.
keywords:
Conversion;Gene Editing;Hydrolysate;Metabolic Engineering
published:
2022-06-22
Kang, Jeon-Young; Farkhad, Bita Fayaz; Chan, Man-pui Sally; Michels, Alexander; Albarracin, Dolores; Wang, Shaowen
(2022)
This dataset helps to investigate the Spatial Accessibility to HIV Testing, Treatment, and Prevention Services in Illinois and Chicago, USA.
The main components are: population data, healthcare data, GTFS feeds, and road network data. The core components are:
1) `GTFS` contains GTFS (General Transit Feed Specification, https://gtfs.org/) data provided by the Chicago Transit Authority (CTA) from Google's GTFS feeds (https://developers.google.com/transit/gtfs). The documentation at https://developers.google.com/transit/gtfs/reference?csw=1 defines the format and structure of the files that comprise a GTFS dataset.
2) `HealthCare` contains shapefiles describing HIV healthcare providers in Chicago and Illinois, respectively. The services come from Locator.HIV.gov (https://locator.hiv.gov/).
3) `PopData` contains population data for Chicago and Illinois, respectively. Data come from the American Community Survey and AIDSVu (https://map.aidsvu.org/map). AIDSVu provides data on PLWH in Chicago at the census tract level for the year 2017 and in the State of Illinois at the county level for the year 2016. The American Community Survey (ACS) provided the number of people aged 15 to 64 at the census tract level for the year 2017 and at the county level for the year 2016. The ACS provides annually updated information on demographic and socioeconomic characteristics of people and housing in the U.S.
4) `RoadNetwork` contains the road networks for Chicago and Illinois, respectively, obtained from OpenStreetMap (https://www.openstreetmap.org/copyright) using the Python osmnx package (https://osmnx.readthedocs.io/en/stable/).
The abstract for our paper is:
Accomplishing the goals outlined in “Ending the HIV (Human Immunodeficiency Virus) Epidemic: A Plan for America Initiative” will require properly estimating and increasing access to HIV testing, treatment, and prevention services. In this research, a computational spatial method for estimating access was applied to measure distance to services from all points of a city or state while considering the size of the population in need for services as well as both driving and public transportation. Specifically, this study employed the enhanced two-step floating catchment area (E2SFCA) method to measure spatial accessibility to HIV testing, treatment (i.e., Ryan White HIV/AIDS program), and prevention (i.e., Pre-Exposure Prophylaxis [PrEP]) services. The method considered the spatial location of MSM (Men Who have Sex with Men), PLWH (People Living with HIV), and the general adult population 15-64 depending on what HIV services the U.S. Centers for Disease Control (CDC) recommends for each group. The study delineated service- and population-specific accessibility maps, demonstrating the method’s utility by analyzing data corresponding to the city of Chicago and the state of Illinois. Findings indicated health disparities in the south and the northwest of Chicago and particular areas in Illinois, as well as unique health disparities for public transportation compared to driving. The methodology details and computer code are shared for use in research and public policy.
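For readers unfamiliar with the E2SFCA method named in the abstract, the following Python sketch shows its two steps on a toy example: first a provider-to-population ratio for each service site, then an accessibility score for each population location, both weighted by travel-time zone. It is a minimal sketch and not the authors' shared code; the zone boundaries, weights, supply values, populations, and travel times are all hypothetical, and a single travel-time matrix stands in for the separate driving and public transportation analyses.

```python
def zone_weight(travel_min, zones):
    """Return the weight of the first zone whose upper bound covers travel_min.

    zones is a list of (max_minutes, weight) in increasing order; beyond the
    last zone the weight is 0, i.e. the location is outside the catchment.
    """
    for max_minutes, weight in zones:
        if travel_min <= max_minutes:
            return weight
    return 0.0


def e2sfca(supply, population, travel, zones):
    """Enhanced two-step floating catchment area accessibility scores.

    supply[j]     : capacity of provider j
    population[i] : population in need at location i
    travel[i][j]  : travel time in minutes from location i to provider j
    """
    n_loc, n_prov = len(population), len(supply)
    w = [[zone_weight(travel[i][j], zones) for j in range(n_prov)]
         for i in range(n_loc)]

    # Step 1: provider-to-population ratio within each provider's catchment.
    ratios = []
    for j in range(n_prov):
        demand = sum(population[i] * w[i][j] for i in range(n_loc))
        ratios.append(supply[j] / demand if demand > 0 else 0.0)

    # Step 2: sum the weighted ratios of providers reachable from each location.
    return [sum(ratios[j] * w[i][j] for j in range(n_prov))
            for i in range(n_loc)]


# Hypothetical inputs: three travel-time zones, two providers, three locations.
zones = [(10, 1.0), (20, 0.5), (30, 0.2)]
supply = [10.0, 5.0]
population = [100.0, 200.0, 50.0]
travel = [[5, 25], [15, 8], [40, 12]]

access = e2sfca(supply, population, travel, zones)
for i, score in enumerate(access):
    print(f"location {i}: accessibility = {score:.4f}")
```

A useful sanity check on any implementation is that the population-weighted sum of accessibility scores equals the total supply of all providers that have at least some demand in their catchment.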
keywords:
HIV;spatial accessibility;spatial analysis;public transportation;GIS
published:
2025-10-02
Jin, Yong-Su; Rao, Christopher; Ye, Quanhui; Oh, Hyunjoon; Tohidifar, Payman; Koh, Hyun Gi
(2025)
For economic and sustainable biomanufacturing, the oleaginous yeast Rhodotorula toruloides has emerged as a promising platform for producing biofuels, pharmaceuticals, and other valuable chemicals. However, genetic manipulation of R. toruloides has been limited by its high GC content and the lack of a replicating plasmid, necessitating gene integration into the genome of the yeast. To address these challenges, we developed the RT-EZ (R. toruloides Efficient Zipper) toolkit, a versatile tool based on Golden Gate assembly, designed to streamline R. toruloides engineering with improved efficiency and flexibility. The RT-EZ toolkit simplifies vector construction by incorporating new features such as bidirectional promoters and 2A peptides, color-based screening using RFP, and sequences optimized for both Agrobacterium tumefaciens-mediated transformation (ATMT) and easy linearization, enabling straightforward selection and transformation. Notably, the RT-EZ kit can be used to construct an expression cassette with four different genes in one assembly reaction, significantly improving vector construction speed and efficiency. The utility of the RT-EZ toolkit was demonstrated through the successful synthesis of arachidonic acid in R. toruloides by coexpressing fatty acid elongases and desaturases. This result underscores the potential of the RT-EZ toolkit to advance synthetic biology in R. toruloides, providing a streamlined method for addressing genetic engineering challenges in the yeast.
keywords:
gene editing; genome engineering
published:
2020-11-20
Jaikumar, Nikhil; Clemente, Tom; Long, Steve; Ge, Zhengxiang; Changa, Timothy
(2020)
This data set explores the effect of the cyanobacterial gene ictB on photosynthesis in sorghum, under both normal greenhouse growing temperatures (32 °C / 25 °C) and during and after an 8 day chilling stress (10 °C / 5 °C). IctB is a cyanobacterial gene of unknown function, which was initially thought to be involved in inorganic carbon transport into cells. While ictB is known now not to be an independently active carbon transporter in its own right, it may play a role in passive diffusion of metabolites. This transgene was introduced into sorghum by the lab of Thomas Clemente, through Agrobacterium-mediated transformation, alone and in combination with the tomato sedoheptulose-1,7-bisphosphatase (SBPase) gene. Eleven events (six double construct and five single construct ictB) were involved in this study. SBPase was included because some previous experiments in C3 species and some previous modeling work, as well as its position at a metabolic branch point, indicate that it plays a role as a control point for photosynthesis. A chilling treatment was included because chilling is one of the most serious ecological factors limiting the range of C4 species.
Data includes gene expression, metabolomics (at normal growing temperature), SBPase enzyme activity, biomass and photosynthetic traits at both warm temperature and during and after chilling stress.
-----------------
EXPLANATORY NOTES FOR ICTB/SBPASE SORGHUM MANUSCRIPT
Data are organized into 10 worksheets, representing an expected 10 tables that will serve a supplementary role in the final publication. These include data on gene expression, metabolomics (at normal growing temperature), SBPase enzyme activity, biomass and photosynthetic traits at both warm temperature and during and after chilling stress.
<i><b>Tables are as follows:</b></i>
1. Event_Code: for Table S1. Event codes for events and constructs. Two constructs were generated for this study, and numerous transgenic “events” (i.e. independent transformations) were carried out for each construct. A construct represents the actual vector which was introduced into the plants (complete with promoter, gene of interest, marker gene, etc.) while an event represents a single successful introduction of the transgene. Events are uniquely labeled with letter and number strings but also with a four-digit number for ease of reference; this table explains which event corresponds to each four-digit number.
2. Photosynthetic_Data: for Table S2. Photosynthetic data at greenhouse growing temperature, for ictB single construct, ictB/SBPase double construct, and wild type lines. Five ictB and six ictB/SBPase events were included. Greenhouse growing temperature was approximately 32 °C day / 25 °C night. Photosynthetic parameters were measured using a Licor 6400-XT, and included parameters related to carbon dioxide uptake, water loss, and chlorophyll fluorescence.
3. Chilling_Treatment: for Table S3. Photosynthetic response to chilling treatment, for ictB single construct and wild type lines. Four ictB events were included. Chilling treatment lasted approximately 8 days and began either 3.5 or 5.5 weeks after transplanting the plants (chilling was done in two batches). Chilling treatment involved temperatures of 10 °C day / 7 °C night in growth chambers. Photosynthetic parameters were measured at several time points during and after the chilling treatment using a Licor 6400-XT, and included parameters related to carbon dioxide uptake, water loss, and chlorophyll fluorescence.
4. SBPase_Activity: for Table S4. SBPase activity in double construct plants. These data measure in vitro substrate-saturated activity of SBPase in desalted extracts from leaf tissues, at 25 °C. Units are micromoles of SBP processed per second per m2 of leaf tissue. Five ictB/SBPase events were included.
5. 2014_gene_exp: for Table S5. Gene expression in 2014 experiment (units of cycle times). These data measure cycle times to threshold, relative to reference genes, for expression of ictB and SBPase. Six ictB single construct events and five ictB/SBPase double construct events were included. Cycle times to threshold relative to reference genes (ΔCT) are inversely related to number of transcripts relative to reference genes, as follows:
ΔCT = -log<sub>2</sub>([N<sub>ictB</sub>]/[N<sub>reference</sub>]) / [1 + log<sub>2</sub>b], where b = efficiency of replication.
6. 2016_gene_exp: for Table S6. Gene expression in 2016 experiment (units of cycle times). These data measure cycle times to threshold, relative to reference genes, for expression of ictB and SBPase. Six ictB single construct events and five ictB/SBPase double construct events were included. Cycle times to threshold relative to reference genes (ΔCT) are inversely related to number of transcripts relative to reference genes, as follows:
ΔCT = -log<sub>2</sub>([N<sub>ictB</sub>]/[N<sub>reference</sub>]) / [1 + log<sub>2</sub>b], where b = efficiency of replication.
7. Metabolites: for Table S7. Levels of 267 metabolites in leaf tissue. Four ictB single construct events and four ictB/SBPase double construct events were included in these analyses. Metabolites were measured in methanol-extracted samples, either by liquid chromatography / mass spectrometry or by gas chromatography / mass spectrometry, and were compared between events on a relative basis. As quantification was relative to wild type rather than on an absolute basis, no units are included.
8. Metabolite_F_values: for Table S8. F values for effects of ictB, SBPase (in cases where the model was better with an SBPase effect) and event. These analyses are done for each metabolite included in Table S7, and show effects of the explanatory variables ictB, SBPase, and individual event.
9. Biomass_2020: for Table S9. Biomass and grain yield at harvest, for ictB, ictB/SBPase and wild type sorghum plants in spring 2020. Four ictB/SBPase double construct and four ictB single construct events were included.
10. Biomass_2017: for Table S10. Biomass and grain yield at harvest, in chilled and non-chilled sorghum plants containing the ictB transgene (along with wild type controls) in fall 2017. Four ictB single construct events were included. Chilling treatment involved temperatures of 10 °C day / 7 °C night in growth chambers.
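Taking the ΔCT relation given with the gene expression worksheets exactly as written, a ΔCT value can be converted back to a transcript ratio relative to the reference gene. The sketch below is only an illustration of that inversion; the function names and the default b = 1 are assumptions, not part of the dataset.

```python
import math


def delta_ct(ratio, b=1.0):
    """ΔCT for a transcript ratio N_gene / N_reference, per the relation in the notes.

    Assumes 1 + log2(b) is nonzero (b = 0.5 would divide by zero).
    """
    return -math.log2(ratio) / (1.0 + math.log2(b))


def relative_transcripts(dct, b=1.0):
    """Invert the relation: N_gene / N_reference for a given ΔCT."""
    return 2.0 ** (-dct * (1.0 + math.log2(b)))


# With b = 1 the relation reduces to ratio = 2 ** (-ΔCT),
# so a larger ΔCT means fewer transcripts relative to the reference gene.
example_ratio = relative_transcripts(3.0)
```

Because the relation is inverse, samples should be ranked by ascending ΔCT when looking for the most strongly expressed events.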
<i><b>All the variables in the file are explained as below:</b></i>
o Type (IctB-SBPase and IctB). This refers to whether a plant is wild type, single construct (contains only the ictB transgene) or double construct (contains both the ictB and SBPase transgenes).
o Code: these codes are shorter labels to refer to each transgene event for the sake of convenience.
o Alternate_Code: these codes are shorter labels to refer to each transgene event for the sake of convenience.
o Event Number: these are unique labels for each transgenic event.
o Construct Number: these are labels for each transgenic construct (either the ictB single construct or the ictB/SBPase double construct).
o year (i): this refers to the year in which the study was conducted (2014, 2016, 2017, or 2020)
o transgene or Transgenic: whether the transgene was present
o construct or Type: whether the ictB or the ictB/SBPase construct was present (double, single, wildtype)
o temp: leaf temperature during the measurement
o A: carbon assimilation rate, in μmol m-2 s-1
o gs: stomatal conductance, in mol m-2 s-1
o CI: intercellular carbon dioxide concentration, in parts per million or μL L-1
o fvfm: FV’/FM’ (maximal potential photosystem II quantum yield under light adapted conditions), dimensionless ratio
o phipsill: ΦPSII (actual, or operating, photosystem II quantum yield under light adapted conditions), dimensionless ratio
o qP: photochemical quenching, i.e. ratio of ΦPSII to FV’/FM’ , dimensionless ratio
o iwue: intrinsic water use efficiency, i.e. ratio of carbon assimilation rate to stomatal conductance, in units of μmol mol-1
o event: individual transgenic / transformation event
o Vmax: substrate-saturated in vitro activity of the SBPase enzyme, in μmol m-2 s-1
o ID: identification number of sample
o ΔCT1: difference in cycle times to threshold during gene expression (quantitative PCR) assay, between ictB and the reference gene GAPDH, in units of cycles
o ΔCT2: difference in cycle times to threshold during gene expression (quantitative PCR) assay, between SBPase and the reference gene GAPDH, in units of cycles
o GAPDH: cycle times to threshold for the reference gene GAPDH (glyceraldehyde phosphate dehydrogenase)
o IctB: cycle times to threshold for the gene of interest ictB
o SBPase: cycle times to threshold for the gene of interest SBPase
o v1 to v267 represent individual metabolites (see the heading immediately above the labels v1, v2, etc.). Variables v268-v272 refer to total (summed) metabolite levels for particular pathways of interest.
o leaf: Leaf and stem dry biomass (in grams)
o seed: Seedhead dry biomass (in grams)
o biomass: Total (leaf, stem + seed head) dry biomass (in grams)
o harvind: ratio of seed head dry biomass to total dry biomass
o treatment (chilled and nonchilled): “Chilled” plants were grown under warm greenhouse conditions (32 °C day / 25 °C night) for 6 or 8 weeks, then switched to chilling temperatures under growth chamber conditions (10 °C day / 7 °C night) for 8 days, and were then returned to greenhouse growing conditions.
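Several variables in the list above are ratios of other listed variables, so they can be recomputed as a consistency check. The following is a minimal sketch under that reading; the function names and the example numbers are hypothetical and are not taken from the worksheets.

```python
def intrinsic_wue(a, gs):
    """iwue: carbon assimilation rate A (μmol m-2 s-1) over stomatal conductance gs (mol m-2 s-1)."""
    return a / gs  # μmol mol-1


def photochemical_quenching(phi_psii, fv_fm):
    """qP: ratio of ΦPSII to FV'/FM' (both dimensionless)."""
    return phi_psii / fv_fm


def harvest_index(seed, leaf):
    """harvind: seed head dry biomass over total (leaf, stem + seed head) dry biomass, in grams."""
    return seed / (leaf + seed)


# Hypothetical values, for illustration only.
iwue = intrinsic_wue(a=30.0, gs=0.25)
qp = photochemical_quenching(phi_psii=0.3, fv_fm=0.5)
harvind = harvest_index(seed=40.0, leaf=60.0)
```

Small differences between recomputed and tabulated values would be expected from rounding in the worksheets.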
-----------------
keywords:
ictB; SBPase; photosynthesis; sorghum; chilling
published:
2024-04-15
Lyu, Zhiheng; Lehan, Yao; Zhisheng, Wang; Chang, Qian; Zuochen, Wang; Jiahui, Li; Yufeng, Wang; Qian, Chen
(2024)
The dataset contains trajectories of Pt nanoparticles in 1.98 mM NaBH4 and NaCl, tracked under liquid-phase TEM. The coordinates (x, y) of nanoparticles are provided, together with the conversion factor that translates pixel size to actual distance. In the file, ∆t denotes the time interval and NaN indicates the absence of a value when the nanoparticle has not emerged or been tracked. The labeling of nanoparticles in the paper is also noted in the second row of the file.
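A short sketch of how the coordinates, the conversion factor, and the NaN entries might be combined when computing frame-to-frame displacements. The function name, the example coordinates, and the scale of 0.5 distance units per pixel are assumptions for illustration; the actual conversion factor is the one supplied in the file.

```python
import math


def step_lengths(xs, ys, scale):
    """Distance moved between consecutive frames, in real units.

    xs, ys: pixel coordinates per frame (NaN where the particle is untracked).
    scale: conversion factor from pixels to actual distance.
    Returns None for any step that touches an untracked frame.
    """
    steps = []
    for i in range(1, len(xs)):
        points = (xs[i - 1], ys[i - 1], xs[i], ys[i])
        if any(math.isnan(v) for v in points):
            steps.append(None)  # particle absent in one of the two frames
        else:
            steps.append(scale * math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1]))
    return steps


# Hypothetical trajectory: untracked in the first and fourth frames.
nan = float("nan")
xs = [nan, 0.0, 3.0, nan, 10.0]
ys = [nan, 0.0, 4.0, nan, 10.0]
steps = step_lengths(xs, ys, scale=0.5)
```

Dividing each valid step by ∆t gives an instantaneous speed; steps marked None should be excluded rather than treated as zero displacement.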
keywords:
nanomotor; liquid-phase TEM
published:
2021-05-10
Varela Quintela, Sebastian; Leakey, Andrew
(2021)
This dataset contains UAV-based high-resolution multispectral time-series orthophotos used to understand the relation between growth dynamics, imagery temporal resolution, and end-of-season biomass productivity of biomass sorghum as a bioenergy crop. The sensor used is a RedEdge Micasense flown at 40 meters above ground level at the Energy Farm, UIUC, in 2019.
keywords:
Unmanned aerial vehicles; High throughput phenotyping; Machine learning; Bioenergy crops
published:
2022-10-14
Dietrich, Christopher; Dmitriev, Dmitry; Takiya, Daniela; Thomas, Michael; Webb, Michael D; Zahniser, James; Zhang, Yalin
(2022)
The Membracoidea_morph_data_Final.nex text file contains the original data used in the phylogenetic analyses of Dietrich et al. (Insect Systematics and Diversity, in review). The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The complete taxon names corresponding to the 131 genus names listed under “BEGIN TAXA” are listed in Table 1 in the included PDF file “Taxa_and_characters”; the 229 morphological characters (names abbreviated under “BEGIN CHARACTERS”) are fully explained in the list of character descriptions following Table 1 in the same PDF. The data matrix follows “MATRIX” and gives the numerical values of characters for each taxon. Question marks represent missing data. The lists of characters and taxa and details on the methods used for phylogenetic analysis are included in the submitted manuscript.
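For readers who want to inspect the matrix without phylogenetic software, the layout described above can be read with a few lines of code. This is a rough sketch that assumes a simple, non-interleaved matrix with one-word taxon names and no bracketed comments inside the matrix; the sample text and taxon names below are invented for illustration, and a dedicated NEXUS parser is preferable for the real file.

```python
import re


def read_nexus_matrix(text):
    """Return {taxon: [state, ...]} from the MATRIX block of a NEXUS string."""
    match = re.search(r"\bMATRIX\b(.*?);", text, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("no MATRIX block found")
    matrix = {}
    for line in match.group(1).splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank line, or a name with no data
        taxon, states = parts
        # A state is one symbol, or a (..) / {..} group for multi-state codings.
        matrix[taxon] = re.findall(r"\([^)]*\)|\{[^}]*\}|\S", states)
    return matrix


# Invented sample in NEXUS layout (not taken from the dataset).
sample = """#NEXUS
BEGIN TAXA;
  DIMENSIONS NTAX=2;
  TAXLABELS Taxon_A Taxon_B;
END;
BEGIN CHARACTERS;
  DIMENSIONS NCHAR=4;
  FORMAT DATATYPE=STANDARD MISSING=? GAP=-;
  MATRIX
  Taxon_A 01?2
  Taxon_B 1(01)0?
  ;
END;
"""
matrix = read_nexus_matrix(sample)
missing = {taxon: states.count("?") for taxon, states in matrix.items()}
```

Counting question marks per taxon, as in the last line, gives a quick view of how much missing data each row carries.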
keywords:
leafhopper; treehopper; evolution; Cretaceous; Eocene
published:
2021-01-25
Zenzal, T. J.; Ward, Michael; Diehl, Rob; Buler, Jeffrey; Smolinsky, Jaclyn; Deppe, Jill; Bolus, Rachel; Celis-Murillo, Antonio; Moore, Frank
(2021)
Dataset associated with Zenzal et al. Oikos submission: Retreat, detour, or advance? Understanding the movements of birds confronting the Gulf of Mexico. https://doi.org/10.1111/oik.07834
Four CSV files were used for analysis and are related to the following subsections under the “Statistics” heading in the “Materials and Methods” section of the journal article:
1. Departing the Edge = “AIC Analysis.csv”
2. Comparing Retreating to Advancing = “Advance and Retreat Analysis.csv” and “Wind Data at Departure.csv”
3. Food Abundance = “Fruit Data.csv” and “Arthropod Data.csv”
<b>Description of variables:</b>
Year: the year in which data were collected.
Departure: the direction in which an individual departed the Bon Secour National Wildlife Refuge. “North” indicates an individual that departed ≥315° or <45°; “Circum” indicates an individual that departed east (45 – 134°) or west (225 – 314°); “Trans” indicates an individual that departed south (135 – 224°).
Age: the age of an individual at capture. Individuals were aged as hatch year (HY) or after hatch year (AHY) according to Pyle (1997; see related article for full citation).
Fat: the fat score of an individual at capture. Individuals were scored on a 6-point scale ranging from 0-5 following Helms and Drury (1960; see related article for full citation).
Species: the standardized four letter alphabetic code used as an abbreviation for English common names of North American Birds. SWTH: Catharus ustulatus; REVI: Vireo olivaceus; INBU: Passerina cyanea; WOTH: Hylocichla mustelina; RTHU: Archilochus colubris.
FTM_SD: stopover duration or number of days between first capture and departure from automated radio telemetry system coverage at the Bon Secour National Wildlife Refuge.
TMB_SD: stopover duration or number of days between first and last detection from automated radio telemetry systems north of Mobile Bay, AL, USA.
Mean speed north (km/hr): the northbound travel speed of individuals retreating from the Bon Secour National Wildlife Refuge, calculated by determining the time when the signal strength indicated the bird was directly east or west of each automated telemetry system and dividing the assumed straight-path distance between the Refuge systems and those north of Mobile Bay, AL, USA, by the amount of time it took an individual to move between them.
Mean speed south (km/hr): the southbound travel speed of individuals advancing from north of Mobile Bay, AL, USA, calculated by determining the time when the signal strength indicated the bird was directly east or west of each automated telemetry system and dividing the assumed straight-path distance between the systems north of Mobile Bay, AL, USA, and the Refuge systems by the amount of time it took an individual to move between them.
LN_FTM_DEP_TIME: the natural log of departure time from the Bon Secour National Wildlife Refuge. Departure time is defined as the number of hours before or after civil twilight.
LN_TMB_DEP_TIME: the natural log of departure time from north of Mobile Bay, AL, USA. Departure time is defined as the number of hours before or after civil twilight.
Paired_FTM_DEP_TIME: the departure time or number of hours before or after civil twilight from Bon Secour National Wildlife Refuge.
Paired_TMB_DEP_TIME: the departure time or number of hours before or after civil twilight from north of Mobile Bay, AL, USA.
Wind Direction: the direction from which the wind originated at the Bon Secour National Wildlife Refuge on nights when individuals were departing. “N” indicates winds from the north (≥315° or <45°); “E” indicates winds from the east (45 – 134°); “W” indicates winds from the west (225 – 314°); “S” indicates winds from the south (135 – 224°).
Wind Speed (m/s): the wind speed on nights when individuals were departing the Bon Secour National Wildlife Refuge.
Group: the direction the bird was traveling under specific wind conditions. Northbound individuals traveled north from Bon Secour National Wildlife Refuge. Southbound individuals traveled south from habitats north of Mobile Bay, AL, USA.
Fruit: weekly mean number of ripe fruit per meter.
Site: the site from which the data were collected. FTM is located within the Bon Secour National Wildlife Refuge. TMB is located within the Jacinto Port Wildlife Management Area.
DOY: number indicating day of year (i.e., 1 January = 001….31 December = 365).
Arthropod Biomass: estimated mean arthropod biomass from each sampling period.
<b>Note:</b> Empty cells indicate unavailable data where applicable.
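The angular bins used for the Departure and Wind Direction variables can be expressed as a small lookup. This sketch assumes half-open bins (for example, 45° up to but not including 135°) so that fractional bearings are also covered; the function names are illustrative and are not part of the dataset.

```python
def classify_departure(bearing):
    """Map a departure bearing in degrees to the Departure categories."""
    b = bearing % 360
    if b >= 315 or b < 45:
        return "North"
    if 135 <= b < 225:
        return "Trans"   # departed south
    return "Circum"      # departed east (45-134°) or west (225-314°)


def classify_wind(bearing):
    """Map the direction the wind came from, in degrees, to N, E, S, or W."""
    b = bearing % 360
    if b >= 315 or b < 45:
        return "N"
    if b < 135:
        return "E"
    if b < 225:
        return "S"
    return "W"


# Bearings at the bin edges listed in the variable descriptions.
edges = [0, 44, 45, 134, 135, 224, 225, 314, 315]
departures = [classify_departure(e) for e in edges]
winds = [classify_wind(e) for e in edges]
```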
keywords:
migratory birds; migration; automated telemetry; Gulf of Mexico