Illinois Data Bank Dataset Search Results
published:
2024-07-29
Caetano Machado Lopes, Lorran; Chacko, George
(2024)
This dataset consists of a citation graph. It was constructed by downloading and parsing the Works section of the Open Alex catalog of the global research system. Open Alex (see citation below) contains detailed information about scholarly research, including articles, authors, journals, institutions, and their relationships. The data were downloaded on 2024-07-15.
The dataset comprises two compressed (.xz) files.
1) filename: openalexID_integer_id_hasDOI.parquet.xz. The tabular data within contains three columns: openalex_id, integer_id, and hasDOI. Each row represents a record with the following data types:
• openalex_id: A unique identifier from the Open Alex catalog.
• integer_id: An integer representing the new identifier (assigned by the authors).
• hasDOI: An integer (0 or 1) indicating whether the record has a DOI (0 for no, 1 for yes).
2) filename: citation_table.tsv.xz
This edgelist of citations has two columns (no header) of integer values that represent citing and cited integer_id, respectively.
Summary Features
• Total Nodes (Documents): 256,997,006
• Total Edges (citations): 2,148,871,058
• Documents with DOIs: 163,495,446
• Edges between documents with DOIs: 1,936,722,541 [corrected to 2,148,788,148 edges Nov 13, 2025]
• Count of unique nodes in edgelist: 111,453,719 [updated Nov 13, 2025]
Note: Nov 13, 2025. An improved curation process will be applied to a future version of this dataset
Note: Nov 13, 2025. The code used to generate these files can be found here: https://github.com/illinois-or-research-analytics/lorran_openalex/
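As a rough illustration of how the edge list described above could be checked against the summary features, the sketch below streams citation_table.tsv.xz without unpacking it and counts edges and unique nodes. It is not taken from the authors' repository; the function name `edge_list_stats` is hypothetical, and the code assumes the file is tab-separated with exactly two integer columns and no header, as the description states.

```python
import lzma


def edge_list_stats(path):
    """Stream an xz-compressed, two-column TSV edge list and summarize it.

    Returns (edge_count, unique_node_count). Each line is assumed to hold a
    citing integer_id and a cited integer_id separated by a tab, no header.
    """
    edges = 0
    nodes = set()
    # lzma.open in text mode decompresses on the fly, so the archive never
    # has to be unpacked to disk.
    with lzma.open(path, "rt", encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            citing, cited = line.split("\t")
            nodes.add(int(citing))
            nodes.add(int(cited))
            edges += 1
    return edges, len(nodes)
```

Calling `edge_list_stats("citation_table.tsv.xz")` would return two numbers that can be compared with the edge and unique-node counts listed under Summary Features. Holding every node identifier in a Python set is memory-hungry at that scale (over 100 million unique nodes), so a real run would likely need ample RAM or a more compact data structure.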
keywords:
citation networks; Open Alex
published:
2021-05-17
Wuebbles, D; Angel, J; Petersen, K; Lemke, A.M.
(2021)
Please cite as: Wuebbles, D., J. Angel, K. Petersen, and A.M. Lemke, (Eds.), 2021: An Assessment of the Impacts of Climate Change in Illinois. The Nature Conservancy, Illinois, USA. https://doi.org/10.13012/B2IDB-1260194_V1
Climate change is a major environmental challenge that is likely to affect many aspects of life in Illinois, ranging from human and environmental health to the economy. Illinois is already experiencing impacts from the changing climate and, as climate change progresses and temperatures continue to rise, these impacts are expected to increase over time. This assessment takes an in-depth look at how the climate is changing now in Illinois, and how it is projected to change in the future, to provide greater clarity on how climate change could affect urban and rural communities in the state. Beyond providing an overview of anticipated climate changes, the report explores predicted effects on hydrology, agriculture, human health, and native ecosystems.
keywords:
Climate change; Illinois; Public health; Agriculture; Environment; Water; Hydrology; Ecosystems
published:
2026-02-25
Bayer, Hugo; Binette, Annalise; Sweck, Samantha; Juliano, Vitor; Plas, Samantha; Ferst, Lara; Hassell Jr, James; Maren, Stephen
(2026)
Raw data from the article "Locus Coeruleus-Amygdala Circuit Disrupts Prefrontal Control to Impair Fear Extinction", which is accepted for publication in PNAS.
keywords:
Basolateral Amygdala; Fear conditioning; Infralimbic cortex; Learning and Memory; Norepinephrine
published:
2026-02-10
Ejiogu, Emmanuel; Peters, Baron
(2026)
This dataset contains the Jupyter notebook and Microsoft Excel data used to reproduce the results from the eponymous paper.
1. "pourahmady data.xlsx" contains NMR data for triad and dyad sequences in a PVC/Polyethylene copolymer.
V is a vinyl chloride segment (-CH2CHCl-) and E is an ethylene segment (-CH2CH2-)
VE is the dyad -CH2CHCl-CH2CH2-
VC_frac_1 = fraction of vinyl chloride segments obtained from 13C-NMR
VC_frac_2 = fraction of vinyl chloride segments obtained from elemental analysis
2. "Triad_Kinetics.ipynb" contains code that fits the data from "pourahmady data.xlsx"
published:
2026-02-20
Emran, Shah-Al; Petersen, Bryan M; Roney, Heather Elizabeth; Masters, Michael David; Varela, Sebastian; Hedrick, Travis; Leakey, Andrew D.B.; VanLoocke, Andy; Heaton, Emily A.
(2026)
This dataset contains biomass yield measurements and associated vegetation index data collected from commercial Miscanthus × giganteus fields in eastern Iowa during the 2022–2023 growing seasons.
The data support the analyses presented in the article:
“Yield From Iowa's First Commercial Miscanthus Fields: Implications of Spatial Variability for Productivity and Sustainability Beyond Research Plots.”
We collected 105 ground-truth biomass samples from four mature commercial fields (>4 years old) covering 92.81 ha.
Samples were taken from 3 m² quadrats that were hand-harvested in alignment with commercial harvest timing. Stem biomass (excluding leaves) was weighed, moisture-corrected, and converted to dry-matter yield expressed in Mg DM ha⁻¹.
Sampling locations were selected to capture spatial variability visible in aerial imagery and were recorded using RTK GPS.
Each biomass observation was paired with vegetation indices derived from high-resolution PlanetScope satellite imagery (3 m resolution).
Images were acquired throughout the growing season, and indices were calculated to evaluate their ability to predict end-of-season biomass yield.
Statistical and machine learning approaches were used to identify key predictors, and a linear regression model based on end-of-July Green Normalized Difference Vegetation Index (GNDVI) was developed and evaluated.
This repository includes the data used in that modeling workflow. Management practices, economic data, full imagery time series, and additional methodological details are described in the associated publication and are not included here.
The dataset consists of three comma-separated value (CSV) files:
1. Combine_Groundtruth_Yield_VI_22_23.csv
This file contains ground-truth biomass yield measurements and associated key vegetation index values collected during the 2022 and 2023 growing seasons.
Rows: 105 observations
Columns:
Year – Year of observation (2022 or 2023)
Field – Field location identifier
Sample_number – Unique sample identifier
GNDVI_End_Jul – Green Normalized Difference Vegetation Index calculated at end of July
GNDVI_End_Aug – Green Normalized Difference Vegetation Index calculated at end of August
NDRE_End_Aug – Normalized Difference Red Edge index calculated at end of August
Biomass_Stem_Yield_MgDM/ha – Measured stem biomass yield (megagrams dry matter per hectare)
2. trainData_GNDVI.csv
This file contains the subset of observations used to train the predictive relationship between July GNDVI and biomass yield.
Rows: 76 observations
Columns:
Unnamed: 0 – Row index retained from the original data processing workflow
GNDVI_End_Jul – GNDVI at end of July
Stem_Yield_MgDM/ha – Observed stem biomass yield (Mg DM ha⁻¹)
3. testData_GNDVI.csv
This file contains the test dataset used to evaluate model performance.
Rows: 29 observations
Columns:
Unnamed: 0 – Row index retained from the original data processing workflow
GNDVI_End_Jul – GNDVI at end of July
Predicted_Yield_MgDM/ha – Model-predicted stem biomass yield (Mg DM ha⁻¹)
Observed_Yield_MgDM/ha – Measured stem biomass yield (Mg DM ha⁻¹)
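The train/test split described above lends itself to a short illustration. The following is a minimal sketch, not the authors' code: it assumes the CSV files use exactly the column names listed, and the helper names `read_columns`, `fit_line`, and `rmse` are invented for this example. It fits an ordinary least-squares line of yield on end-of-July GNDVI and scores predictions with root-mean-square error.

```python
import csv
import math


def read_columns(path, x_name, y_name):
    """Read two numeric columns from a CSV file that has a header row."""
    xs, ys = [], []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            xs.append(float(row[x_name]))
            ys.append(float(row[y_name]))
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))
```

For instance, `fit_line(*read_columns("trainData_GNDVI.csv", "GNDVI_End_Jul", "Stem_Yield_MgDM/ha"))` would return a slope and intercept for the training subset, and `rmse(*read_columns("testData_GNDVI.csv", "Predicted_Yield_MgDM/ha", "Observed_Yield_MgDM/ha"))` would score the predictions already stored in the test file. Whether these figures match the published model depends on details given only in the associated article.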
keywords:
Potential yield, yield gap, in-field management, yield prediction, remote sensing, spatial variability, profitability, Miscanthus × giganteus, M×g
published:
2026-02-17
Peyton, Buddy; Bajjalieh, Joseph; Martin, Michael; Gerald, Andrea
(2026)
Coups d'État are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019, Chin, Carter and Wright 2021). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup d’État Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e., realized, unrealized, or conspiracy), the type of actor(s) who initiated the coup (i.e., military, rebels, etc.), as well as the fate of the deposed leader.
Version 2.2.2 corrects an error in version 2.2.1 in which the “conspiracy” designation was mistakenly assigned to coup_id: 40411262025. Version 2.2.2 resolves this issue by removing the incorrect designation.
Version 2.2.1 adds 67 additional coup events. 47 of these came from examining the Colpus dataset (Chin, Carter, and Wright 2021), and 20 of these events were added to the data set in the normal annual review of potential new coup events. This version also updates the coding to events in Mali in 2012, Serbia in 2000 and Chad in 1979.
Version 2.2.0 adds 94 additional coup events. 66 of these came from examining Powell and Thyne’s “discarded” events and 28 of these events were added to the data set in the normal annual review of potential new coup events. This version also updates the coding to events in Brazil in 1945 and the Congo in 1968.
Version 2.1.3 adds 19 additional coup events to the data set, corrects the date of a coup in Tunisia, and reclassifies an attempted coup in Brazil in December 2022 as a conspiracy.
Version 2.1.2 added 6 additional coup events that occurred in 2022 and updated the coding of an attempted coup event in Kazakhstan in January 2022.
Version 2.1.1 corrected a mistake in version 2.1.0, where the designation of “dissident coup” had been dropped in error for coup_id: 00201062021. Version 2.1.1 fixed this omission by marking the case as both a dissident coup and an auto-coup.
Version 2.1.0 added 36 cases to the data set and removed two cases from the v2.0.0 data set. This update also added actor coding for 46 coup events and added executive outcomes to 18 events from version 2.0.0. A few other changes were made to correct inconsistencies in the coup ID variable and the date of the event.
Version 2.0.0 improved several aspects of the previous version (v1.0.0) and incorporated additional source material to include:
• Reconciling missing event data
• Removing events with irreconcilable event dates
• Removing events with insufficient sourcing (each event needs at least two sources)
• Removing events that were inaccurately coded as coup events
• Removing variables that fell below the threshold of inter-coder reliability required by the project
• Removing the spreadsheet “CoupInventory.xls” because of inadequate attribution and citations in the event summaries
• Extending the period covered from 1945-2005 to 1945-2019
• Adding events from Powell and Thyne’s Coup Data (Powell and Thyne, 2011)
Version 1.0.0 was released in 2013. This version consolidated coup data taken from the following sources:
• The Center for Systemic Peace (Marshall and Marshall, 2007)
• The World Handbook of Political and Social Indicators (Taylor and Jodice, 1983)
• Coup d’État: A Practical Handbook (Luttwak, 1979)
• The Cline Center’s Social, Political and Economic Event Database (SPEED) Project (Nardulli, Althaus and Hayes, 2015)
• Government Change in Authoritarian Regimes – 2010 Update (Svolik and Akcinaroglu, 2006)
<br>
<b>Items in this Dataset</b>
1. <i>Cline Center Coup d'État Codebook v.2.2.2 Codebook.pdf</i> - This 18-page document describes the Cline Center Coup d’État Project dataset. The first section of this codebook provides a summary of the different versions of the data. The second section provides a succinct definition of a coup d’état used by the Coup d'État Project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The third section describes the methodology used to produce the data. <i>Revised February 2026</i>
2. <i>Coup Data 2.2.2.csv</i> - This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup d’État Project. It contains 29 variables and 1,161 observations. <i>Revised February 2026</i>
3. <i>Source Document v2.2.2.pdf</i> - This 365-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify that particular event. <i>Revised February 2026</i>
4. <i>README.md</i> - This file contains useful information for the user about the dataset. It is a text file written in Markdown language. <i>Revised February 2026</i>
<br>
<b> Citation Guidelines</b>
1. To cite the codebook (or any other documentation associated with the Cline Center Coup d’État Project Dataset) please use the following citation:
Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Scott Althaus. 2026. “Cline Center Coup d’État Project Dataset Codebook”. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.2.2. February 17. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V10
2. To cite data from the Cline Center Coup d’État Project Dataset please use the following citation (filling in the correct date of access):
Peyton, Buddy, Joseph Bajjalieh, Michael Martin, and Andrea Gerald. 2026. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.2.2. February 17. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V10
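Because the coup_id values quoted in the version notes (for example 00201062021) begin with zeros, looking an event up in the CSV file is safest when the identifier is kept as text. The sketch below shows one way to do that; the function name `find_coup_events` is hypothetical, only the coup_id column name is taken from the description above, and the file is assumed to be UTF-8 encoded.

```python
import csv


def find_coup_events(path, coup_id):
    """Return all rows of the coup CSV whose coup_id matches exactly."""
    # csv.DictReader yields every field as a string, which preserves leading
    # zeros in identifiers such as "00201062021". The "utf-8-sig" codec also
    # accepts plain UTF-8 and strips a byte-order mark if one is present.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return [row for row in csv.DictReader(handle) if row["coup_id"] == coup_id]
```

A call such as `find_coup_events("Coup Data 2.2.2.csv", "00201062021")` would return the matching rows as dictionaries. Spreadsheet tools and numeric parsers may silently drop the leading zeros, which would break the match against the source document.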
published:
2026-02-11
Hanley, David; Lee, Jongwon; Choi, Su Yeon; Bretl, Timothy
(2026)
If you use this dataset, please cite both the dataset and the associated data paper (bibtex is below).
@ARTICLE{11386847,
author={Hanley, David and Lee, Jongwon and Choi, Su Yeon and Bretl, Timothy},
journal={IEEE Transactions on Instrumentation and Measurement},
title={The MagPIE2 Dataset for Mapping, Localization, and Simultaneous Localization and Mapping Using Magnetic Fields},
year={2026},
volume={},
number={},
pages={1-1},
keywords={Magnetometers;Magnetic field measurement;Magnetic fields;Pedestrians;Location awareness;Buildings;Simultaneous localization and mapping;Measurement errors;Hardware;Calibration;Localization;mapping;SLAM;dataset;benchmark;magnetometer;magnetic field},
doi={10.1109/TIM.2026.3662919}}
We present a dataset for the evaluation of magnetic field-based robotic and pedestrian localization, mapping, and SLAM methods. This dataset contains magnetometer and inertial measurement unit data collected from inside three buildings by both a pedestrian and a ground robot. Data were collected at different heights simultaneously, both with and without changes in the placement of objects that may affect magnetometer measurements. In total, approximately 689 square meters of floor space was covered by this dataset.
This dataset is archivally stored. We provide a GitHub site which is meant to serve as a forum to post issues with the dataset, share code using the dataset, and to resolve problems: <a href="https://github.com/hanley6/MagPIE2Forum">https://github.com/hanley6/MagPIE2Forum</a>
Note that while the dataset is meant to be permanently stored, this forum is not meant to guarantee perennial support and its existence will be dependent on the policies of GitHub.
<b>How is the dataset organized?</b> The data is divided into the following parts at a high level and more detailed information can be found in the Readme:
1. The walking portion of the dataset: CSL_WLK.zip, DCL_WLK.zip, Talbot_WLK.zip, and WLK_Misc.zip.
2. The robot portion of the dataset: Robot_Dataset.zip.
3. Motor interference tests: Motor_Interference_Test.zip.
4. Ground truth evaluation: Ground_Truth_Evaluation.zip.
5. Quick start results: Quick_Start_Results.zip.
<b>How is data recorded and stored?</b> Data is generally collected in the form of ROS bag files. Each ROS bag has Intel Realsense camera images, magnetometer readings, IMU readings, timestamps, and more as applicable for each file in the dataset. Each bag file has an associated metadata file written as a YAML file. This contains general information about each bag file including the start and stop time, who collected the bag file (during the pedestrian portion of the dataset), and the approximate location where data was collected. In several cases, additional comma-separated (csv) files were included in the dataset, either as a convenient supplement to ROS bag files (e.g., csv files of magnetometer calibration data) or because they serve as human-readable quick start results.
<b>How does one set up and run files on the dataset?</b> The files are stored in ROS bags and are, therefore, meant to be run using the Robot Operating System. Information regarding how to use the Robot Operating System as well as installation instructions are available at: <a href="https://ros.org/">https://ros.org/</a>
keywords:
Localization; mapping; SLAM; dataset; benchmark; magnetometer; magnetic field
published:
2025-12-23
Aly, Abdallah; A. Saif, M. Taher
(2025)
The uploaded data is part of the paper titled: Self-Modifying Percolation Governs Detachment in Soft Suction Wet Adhesion, which shows the detachment mechanism of liquid suction-based adhesion.
published:
2025-05-07
Reves, Olivia; Larson, Eric
(2025)
Data collected at 71 study sites from 2023 to 2024 for Reves, Olivia P. (2025): Using Environmental DNA Metabarcoding to Inform Biodiversity Conservation in Agricultural Landscapes. Master's thesis, University of Illinois Urbana-Champaign. Files include study site information, taxa by site matrices for vertebrates from environmental DNA metabarcoding using multiple mitochondrial DNA primers (COI, 12S), and bird species audibly detected by a phone app at study sites.
keywords:
agricultural conservation; biodiversity; eDNA; environmental DNA; Illinois; metabarcoding; riparian buffers; stream flow; vertebrates
published:
2025-02-07
Wang, Binghui; Kudeki, Erhan
(2025)
Incoherent scatter radar datasets collected during the September 2016 campaign at Arecibo have been deposited in this databank. The lag products of the ISR data are stored as lag profile matrices with 5 minutes of integration time. The data is organized in a Python dictionary format, with each file containing 12 lag profile matrices representing one hour of observation. A sample Python script is provided to illustrate its usage.
published:
2026-01-20
Willson, James; Warnow, Tandy
(2026)
Dataset from "CAMUS: Scalable Phylogenetic Network Estimation." This dataset contains simulated phylogenetic networks, gene trees, and sequence data.
- camus-dataset.tar.xz is the main archive containing all the simulated data. More details about the files and directories it contains can be found in README.md
- scripts.zip contains various scripts used in the simulation study.
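Before unpacking a large archive such as camus-dataset.tar.xz, it can help to peek at its contents. The sketch below is only an illustration using Python's standard tarfile module; the function name `list_archive` and the default limit are arbitrary choices, and the archive's internal layout is documented in its README.md rather than here.

```python
import tarfile


def list_archive(path, limit=20):
    """Return up to `limit` member names from an xz-compressed tar archive."""
    names = []
    with tarfile.open(path, "r:xz") as archive:
        # Iterating over the TarFile reads one member header at a time, so
        # the loop can stop early without scanning the whole archive.
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names
```

For example, `list_archive("camus-dataset.tar.xz")` would return the first twenty member paths.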
keywords:
evolution; computational biology; bioinformatics; phylogenetics
published:
2026-01-21
Suthers, Patrick; Maranas, Costas
(2026)
Growth-coupling product formation can facilitate strain stability by aligning industrial objectives with biological fitness. Organic acids make up many building block chemicals that can be produced from sugars obtainable from renewable biomass. Issatchenkia orientalis is a yeast strain tolerant to acidic conditions and is thus a promising host for industrial production of organic acids. Here, we use constraint-based methods to assess the potential of computationally designing growth-coupled production strains for I. orientalis that produce 22 different organic acids under aerobic or microaerobic conditions. We explore native and engineered pathways using glucose or xylose as the carbon substrates as proxy constituents of hydrolyzed biomass. We identified growth-coupled production strategies for 37 of the substrate-product pairs, with 15 pairs achieving production for any growth rate. We systematically assess the strain design solutions and categorize the underlying principles involved.
keywords:
Bioproducts; Modeling
published:
2026-01-19
Note: The GTAP dataset includes a total of 140 regions, some of which are aggregated regions. For all map-related supplementary files (S11, S12, S13), we assign values to each individual country to enhance visualization. Countries within the same aggregated region are assigned the same regional value to maintain consistency across the map.
<b>Data S1 (separate file): S1.csv</b>- CSV file detailing production-related deaths for the GTAP dataset.
Rows: Each row represents a country where deaths occur as a result of production activities.
Columns: Each column represents a country-sector pair on the production side.
Values: The values indicate the number of deaths caused by production activities in the country-sector listed in each column and occurring in the country listed in each row.
<b>Data S2 (separate file): S2.csv</b>- CSV file detailing production-related deaths for the EORA dataset.
Structure: The file has the same structure as S1.csv.
<b>Data S3 (separate file): S3.csv</b>- CSV file detailing consumption-related deaths for the GTAP dataset.
Rows: Each row represents a country where deaths occur as a result of consumption activities.
Columns: Each column represents a consumption country.
Values: The values indicate the number of deaths caused by consumption activities in the country listed in the column and occurring in the country listed in the row.
<b>Data S4 (separate file): S4.csv</b>- CSV file detailing consumption-related deaths for the EORA dataset.
Structure: The file has the same structure as S3.csv.
<b>Data S5 (folder of files): S5.zip</b>- a folder containing 141 CSV files, each named after a country's 3-letter code (e.g., USA.csv, CHN.csv), representing production-related spatial PM₂.₅ concentration patterns for all GTAP countries.
Rows: Each row corresponds to a grid cell.
Columns: Each column represents an industrial sector. The final column, "geometry," contains the spatial coordinates (latitude and longitude) for each grid cell.
Values: Each value indicates the PM₂.₅ concentration level (in µg/m³) attributable to emissions from the specified sector in the given country, as they occur in each grid cell.
<b>Data S6 (folder of files): S6.zip</b>- a folder containing 188 CSV files, each named after a country's 3-letter code, representing production-related spatial PM₂.₅ concentration patterns for all EORA countries.
Structure: Each file follows the same format as those in S5.zip, with rows representing grid cells and columns representing industrial sectors, plus a "geometry" column containing spatial coordinates.
<b>Data S7 (separate file): S7.csv</b>- CSV file containing consumption-related spatial PM₂.₅ concentration patterns for all GTAP countries.
Rows: Each row represents a grid cell.
Columns: Apart from the last column ("geometry"), which contains spatial information for each grid cell in latitude-longitude coordinates, each column represents a consumption country.
Values: Each value indicates the PM₂.₅ concentration level caused by each country’s consumption process and occurring in each grid cell, measured in µg/m³.
<b>Data S8 (separate file): S8.csv</b>- CSV file containing consumption-related spatial PM₂.₅ concentration patterns for all EORA countries.
Structure: The file has the same structure as S7.csv.
<b>Data S9 (separate file): S9.csv</b>- CSV file listing the total net bidirectional export of deaths for all countries in GTAP, displaying only positive values.
Columns:
"from": The country that exports more consumption-related deaths.
"to": The country that imports more consumption-related deaths.
"values": The net export of deaths between these two countries, calculated as the difference between the deaths flowing from "from" to "to" and those from "to" to "from."
<b>Data S10 (separate file): S10.csv</b>- CSV file listing the total net bidirectional export of deaths for all countries in EORA, displaying only positive values.
Structure: The file has the same structure as S9.csv.
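The net bidirectional export described for S9.csv and S10.csv can be sketched as follows. This is a hedged illustration of the stated definition (the flow from "from" to "to" minus the reverse flow, keeping only positive values); the input mapping `flows` and the function name `net_positive_flows` are hypothetical, and how the gross flows are derived from the other files is not specified here.

```python
def net_positive_flows(flows):
    """Collapse directed flows into net flows, keeping only positive values.

    `flows` maps (origin, destination) pairs to a gross number of deaths.
    The result is a list of (from, to, value) tuples in which `value` is
    flow[from -> to] minus flow[to -> from] and is strictly positive.
    """
    result = []
    seen = set()
    for a, b in flows:
        if a == b or (b, a) in seen:
            continue  # skip self-flows and pairs already handled in reverse
        seen.add((a, b))
        net = flows.get((a, b), 0.0) - flows.get((b, a), 0.0)
        if net > 0:
            result.append((a, b, net))
        elif net < 0:
            result.append((b, a, -net))
        # pairs whose flows cancel exactly are omitted
    return result
```

Each country pair appears at most once in the output, oriented so that the first country is the net exporter, which mirrors the "from", "to", and "values" columns.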
<b>Data S11 (separate file): S11.csv</b>- CSV file listing the Value of Statistical Lives (VSLs), and consumption-related externalities under three scenarios, Business as Usual (BAU), Global Community (GC), and Fair Trade in Deaths (FTD), along with externalities per GDP and their differences for GTAP countries.
Columns:
VSL, BAU_Externality, GC_Externality, FTD_Externality
BAU_Ext_perGDP, GC_Ext_perGDP, FTD_Ext_perGDP
Diff_GC_BAU, Diff_FTD_BAU, Diff_FTD_GC
<b>Data S12 (separate file): S12.csv</b>- Same as S11.csv, but for EORA countries.
Structure: Identical to S11.csv.
<b>Data S13 (separate file): S13.csv</b>- purpose: Includes data used to generate Figures 1, 2, 3, and 5 in the main text.
Columns:
country_code: 3-letter country code
GTAP_region, continent, population, GDP, GDP_capita, VSL
export_of_death, import_of_death, net_export, net_export_capita
allforeign_world, G50foreign_world, G100foreign_world
cause_allforeign_world, cause_L30foreign_world, cause_L50foreign_world
BAU_Externality, GC_Externality, FTD_Externality
BAU_Ext_perGDP, GC_Ext_perGDP, FTD_Ext_perGDP
Diff_GC_BAU, Diff_FTD_BAU, Diff_FTD_GC
geometry (used for visualization)
<b>Data S14 (separate file): S14.xlsx</b>- this Excel file contains six sheets summarizing cross-model Pearson correlation coefficients between sectoral economic activity fractions and transboundary mortality impact metrics, based on both GTAP and EORA datasets.
Sheets:
Output_fraction_GTAP
Direct_demand_fraction_GTAP
Final_demand_fraction_GTAP
Output_fraction_EORA
Direct_demand_fraction_EORA
Final_demand_fraction_EORA
Rows: Each row represents an economic sector.
Columns:
G50foreign_world: Fraction of deaths attributable to final demand from regions where demand per capita is more than 50% higher than in the current country.
cause_L50foreign_world: Fraction of deaths caused by consumption within the current country but occurring in countries with more than 50% lower demand per capita.
Values: Each value represents the Pearson correlation between the sectoral fraction and the corresponding transboundary mortality metric.
<b>Data S15 (separate file): S15.csv</b>- CSV file derived from the GTAP dataset, containing Monte Carlo simulation results (500 draws) for the uncertainty analysis of production-based premature deaths.
Column Producer: The producing country–sector pair responsible for the emissions leading to health impacts.
Column Affected Country: The country where the resulting premature deaths occur.
Column Deaths: The estimated number of deaths corresponding to the one used in the main analysis.
Columns Deaths_median, Deaths_low95, Deaths_high95: The median, 2.5th percentile, and 97.5th percentile values across 500 Monte Carlo draws of the GEMM θ parameter, representing the 95% confidence interval for each producer–affected country pair.
<b>Data S16 (separate file): S16.csv</b>- CSV file derived from the GTAP dataset, containing Monte Carlo simulation results (500 draws) for the uncertainty analysis of consumption-based premature deaths.
Column Consumer: The consuming country whose final demand drives the global production and associated health impacts.
Column Affected Country: The country where the resulting premature deaths occur.
Column Deaths: The estimated number of deaths corresponding to the one used in the main analysis.
Columns Deaths_median, Deaths_low95, Deaths_high95: The median, 2.5th percentile, and 97.5th percentile values across 500 Monte Carlo draws of the GEMM θ parameter, representing the 95% confidence interval for each consumer–affected country combination.
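The three summary columns in S15.csv and S16.csv can be illustrated with a short sketch that reduces a list of Monte Carlo draws to a median and a 2.5th to 97.5th percentile interval. The helper names `percentile` and `summarize_draws` are invented for this example, and the linear-interpolation rule used for percentiles is an assumption; the description does not say which percentile convention the authors applied.

```python
import statistics


def percentile(values, q):
    """Percentile q (0-100), interpolating linearly between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def summarize_draws(draws):
    """Summarize simulated death counts using the column names from the files."""
    return {
        "Deaths_median": statistics.median(draws),
        "Deaths_low95": percentile(draws, 2.5),
        "Deaths_high95": percentile(draws, 97.5),
    }
```

Applied to the 500 draws for one pair of countries, `summarize_draws` would yield one row's worth of interval columns under the stated assumption.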
published:
2025-09-18
Chen, Maosi; Parton, William J.; Hartman, Melannie D.; Del Grosso, Stephen J.; Smith, William K.; Knapp, Alan; Lutz, Susan; Derner, Justin; Tucker, Compton; Ojima, Dennis; Volesky, Jerry; Stephenson, Mitchell B.; Schacht, Walter H.; Gao, Wei
(2025)
Productivity throughout the North American Great Plains grasslands is generally considered to be water limited, with the strength of this limitation increasing as precipitation decreases. We hypothesize that cumulative actual evapotranspiration water loss (AET) from April to July is the precipitation-related variable most correlated to aboveground net primary production (ANPP) in the U.S. Great Plains (GP). We tested this by evaluating the relationship of ANPP to AET, precipitation, and plant transpiration (Tr). We used multi-year ANPP data from five sites ranging from semiarid grasslands in Colorado and Wyoming to mesic grasslands in Nebraska and Kansas, mean annual NRCS ANPP, and satellite-derived normalized difference vegetation index (NDVI) data. Results from the five sites showed that cumulative April-to-July AET, precipitation, and Tr were well correlated (R²: 0.54–0.70) to annual changes in ANPP for all but the wettest site. AET and Tr were better correlated to annual changes in ANPP compared to precipitation for the drier sites, and precipitation in August and September had little impact on productivity in drier sites. April-to-July cumulative precipitation was best correlated (R² = 0.63) with interannual variability in ANPP in the most mesic site, while AET and Tr were poorly correlated with ANPP at this site. Cumulative growing season (May-to-September) NDVI (iNDVI) was strongly correlated with annual ANPP at the five sites (R² = 0.90). Using iNDVI as a surrogate for ANPP, we found that county-level cumulative April-July AET was more strongly correlated to ANPP than precipitation for more than 80% of the GP counties, with precipitation tending to perform better in the eastern more mesic portion of the GP. Including the ratio of AET to potential evapotranspiration (PET) improved the correlation of AET to both iNDVI and mean county-level NRCS ANPP.
Accounting for how different precipitation-related variables control ANPP (AET in drier portion, precipitation in wetter portion) provides opportunity to develop spatially explicit forecasting of ANPP across the GP for enhancing decision-making by land managers and use of grassland ANPP for biofuels.
keywords:
Sustainability;Field Data;Modeling
published:
2026-01-12
Yan, Qiang; Cordell, William; Jindra, Michael; Pfleger, Brian
(2026)
Microbial lipid metabolism is an attractive route for producing oleochemicals. The predominant strategy centers on heterologous thioesterases to synthesize desired chain-length fatty acids. To convert acids to oleochemicals (e.g., fatty alcohols, ketones), the narrowed fatty acid pool needs to be reactivated as coenzyme A thioesters at cost of one ATP per reactivation, an expense that could be saved if the acyl-chain was directly transferred from ACP- to CoA-thioester. Here, we demonstrate such an alternative acyl-transferase strategy by heterologous expression of PhaG, an enzyme first identified in Pseudomonads, that transfers 3-hydroxy acyl-chains between acyl-carrier protein and coenzyme A thioester forms for creating polyhydroxyalkanoate monomers. We use it to create a pool of acyl-CoA’s that can be redirected to oleochemical products. Through bioprospecting, mutagenesis, and metabolic engineering, we develop three strains of Escherichia coli capable of producing over 1 g/L of medium-chain free fatty acids, fatty alcohols, and methyl ketones.
keywords:
Bioproducts; Metabolomics
published:
2025-10-22
Yan, Qiang; Jacobson, Tyler B.; Ye, Zhou; Cortes-Peña, Yoel R.; Bhagwat, Sarang; Hubbard, Susan; Cordell, William T.; Oleniczak, Rebecca E.; Gambacorta, Francesca V.; Rivera-Vasquez, Julio; Shusta, Eric V.; Amador-Noguez, Daniel; Guest, Jeremy; Pfleger, Brian
(2025)
Plants produce many high-value oleochemical molecules. While oil-crop agriculture is performed at industrial scales, suitable land is not available to meet global oleochemical demand. Worse, establishing new oil-crop farms often comes with the environmental cost of tropical deforestation. The field of metabolic engineering offers tools to transplant oleochemical metabolism into tractable hosts while simultaneously providing access to molecules produced by non-agricultural plants. Here, we evaluate strategies for rewiring metabolism in the oleaginous yeast Yarrowia lipolytica to synthesize a foreign lipid, 3-acetyl-1,2-diacyl-sn-glycerol (acTAG). Oils made up of acTAG have a reduced viscosity and melting point relative to traditional triacylglycerol oils making them attractive as low-grade diesels, lubricants, and emulsifiers. This manuscript describes a metabolic engineering study that established acTAG production at g/L scale, exploration of the impact of lipid bodies on acTAG titer, and a techno-economic analysis that establishes the performance benchmarks required for microbial acTAG production to be economically feasible.
keywords:
Conversion;Sustainability;Biomass Analytics;Lipidomics;Metabolomics
published:
2025-11-03
Woodruff, William; Deshavath, Narendra Naik; Susanto, Vionna; Rao, Christopher V.; Singh, Vijay
(2025)
Oleaginous yeasts are a promising candidate for the sustainable conversion of lignocellulosic feedstocks into fuels and chemicals, but their growth on these substrates can be inhibited as a result of upstream pretreatment and enzymatic hydrolysis conditions. Previous studies indicate a high citrate buffer concentration during hydrolysis inhibits downstream cell growth and ethanol fermentation in Saccharomyces cerevisiae. In this study, an engineered Rhodosporidium toruloides strain with enhanced lipid accumulation was grown on sorghum hydrolysate with high and low citrate buffer concentrations. Both hydrolysis conditions resulted in similar sugar recovery rates and concentrations. No significant differences in cell growth, sugar utilization rates, or lipid production rates were observed between the two citrate buffer conditions during batch fermentation of R. toruloides. Under fed-batch growth on low-citrate hydrolysate a lipid titer of 16.7 g/L was obtained. Citrate buffer was not found to inhibit growth or lipid production in this engineered R. toruloides strain, nor did reducing the citrate buffer concentration negatively affect sugar yields in the hydrolysate. As this process is scaled-up, $131 per ton of hydrothermally pretreated biomass can be saved by use of the lower citrate buffer concentration during enzymatic hydrolysis.
keywords:
Conversion;Hydrolysate;Lipidomics
published:
2025-10-15
York, Julia M.; Bhat, Shriram; Kim, Jinmu; Cardenas, Leyla; Cheng, Chi-Hing Christina
(2025)
This repository contains supplementary information, alternate genome assemblies, annotation, and predicted protein datasets for Notothenia coriiceps and Paranotothenia angustata genome assemblies. Primary assemblies, mitochondrial assemblies, RNA-Seq data, and raw read data can be found under NCBI Bioproject PRJNA1310647.
keywords:
notothenioid; Antarctic; fish; genome; DNA
published:
2025-10-16
Maitra, Shraddha; Long, Stephen P.; Singh, Vijay
(2025)
Transgenic bioenergy crops have shown the potential to produce vegetative oil by accumulating energy-rich triacylglyceride molecules that can be converted into biofuels (biodiesel and biojet). These transgenic crops cater to improved biofuel yield by providing lipids along with cellulosic sugars. Efficient bioprocessing technologies are needed to utilize these transgenic plants to their maximum potential. To this end, this study investigates a low- and high-severity chemical-free hydrothermal pretreatment of transgenic oilcane 1566 bagasse with in situ lipids to maximize the recovery of lipids for biodiesel and fermentable sugars for ethanol with minimal inhibitor generation. Hydrothermal pretreatment at 170°C recovered ~25% of total lipids in the pretreatment liquor, leaving the remainder in bagasse residue for hexane recovery post fermentation. The recovery of lipids in pretreatment liquor remained constant beyond 170°C. Along with lipids, ~35% w/w and ~50% w/w fermentable sugars were recovered post saccharification from bagasse pretreated at 170°C and 210°C for 20 min, respectively. Hydrothermal pretreatment at 170°C for 20 min provided the optimum conditions for maximum recovery of lipids and cellulosic sugars that resulted in enhanced biofuel yield per unit biomass. High severity pretreatment increased the generation of inhibitors beyond the tolerance of fermentation microorganisms. In addition, the application of time-domain proton NMR spectroscopy was extended to bioprocessing. NMR technology facilitated the analysis of total lipids, the composition of fatty acids, and the characterization of free and bound lipids in untreated and pretreated oilcane 1566 bagasse subsequent to each step of biomass to biofuel conversion.
keywords:
Conversion;Feedstock Bioprocessing
published:
2025-11-03
Banerjee, Shivali; Dien, Bruce; Eilts, Kristen; Sacks, Erik; Singh, Vijay
(2025)
Chemical-free hydrothermal pretreatment of Miscanthus x giganteus (Mxg) at the lab scale using high liquid-to-solid ratios resulted in the recovery of anthocyanins and enhanced enzymatic digestibility of residual biomass. In this study, the process is scaled up by using a continuous hydrothermal pretreatment reactor operated at a low liquid-to-solid ratio (50 % w/w solids) as an important step towards commercialization. Anthocyanin yield was 70 % w/w at the pilot scale (50 kg of Mxg), compared to the 94 % w/w yield achieved at the lab scale (0.5 g of Mxg). The pretreated biomass was subsequently refined mechanically using a disc mill to increase the accessibility of cellulose by cellulases. Enzymatic saccharification of the pretreated and disc-milled residue yielded 238 g/L sugar concentration by operating in fed-batch mode at 50 % w/v solids content. Two strains of Rhodosporidium toruloides were evaluated for converting the hydrolysate sugars into microbial lipids, and strain Y-6987 had the highest lipid titer (11.0 g/L). Further, the residue left after enzymatic saccharification was determined to be enriched 1.7-fold in the lignin content. This lignin-rich residue has value as a feedstock for the production of sustainable aviation fuel precursors and other high-value lignin-based chemicals. Hence the proposed biorefinery based on Mxg creates an opportunity for generating revenue from multiple high-value products. As the demand for biofuels and biobased products is rising, the biorefinery products from Mxg would create a niche in the industrial sector.
keywords:
Conversion;Feedstock Production;Feedstock Bioprocessing;Hydrolysate;Lipidomics
published:
2025-11-12
Fan, Xinxin; Khanna, Madhu; Lee, Yuanyao; Kent, Jeffrey; Shi, Rui; Guest, Jeremy; Lee, DoKyoung
(2025)
Cellulosic biomass-based sustainable aviation fuels (SAFs) can be produced from various feedstocks. The breakeven price and carbon intensity of these feedstock-to-SAF pathways are likely to differ across feedstocks and across spatial locations due to differences in feedstock attributes, productivity, opportunity costs of land for feedstock production, soil carbon effects, and feedstock composition. We integrate feedstock to fuel supply chain economics and life-cycle carbon accounting using the same system boundary to quantify and compare the spatially varying greenhouse gas (GHG) intensities and costs of GHG abatement with SAFs derived from four feedstocks (switchgrass, miscanthus, energy sorghum, and corn stover) at 4 km resolution across the U.S. rainfed region. We show that the optimal feedstock for each location differs depending on whether the incentive is to lower breakeven price, carbon intensity, or cost of carbon abatement with biomass or to have high biomass production per unit land. The cost of abating GHG emissions with SAF ranges from $181 Mg⁻¹ CO2e to more than $444 Mg⁻¹ CO2e and is lowest with miscanthus in the Midwest, switchgrass in the south, and energy sorghum in a relatively small region in the Great Plains. While corn stover-based SAF has the lowest breakeven price per gallon, it has the highest cost of abatement due to its relatively high GHG intensity. Our findings imply that different types of policies, such as volumetric targets, tax credits, and low carbon fuel standards, will differ in the mix of feedstocks they incentivize and locations where they are produced in the U.S. rainfed region.
Note: Column V in TableS7_DayCentSimulatedYield.csv should be labelled Corn Stover CoSo-NT-50% Max.
keywords:
Sustainability;Geospatial;Modeling
published:
2025-09-30
Yun, Danim; Ayla, E. Zeynep; Bregante, Daniel T.; Flaherty, David W.
(2025)
Oxidative cleavage of carbon–carbon double bonds (C=C) in alkenes and fatty acids produces aldehydes and acids valued as chemical intermediates. Solid tungsten oxide catalysts are low cost, nontoxic, and selective for the oxidative cleavage of C=C bonds with hydrogen peroxide (H2O2) and are, therefore, a promising option for continuous processes. Despite the relevance of these materials, the elementary steps involved and their sensitivity to the form of W sites present on surfaces have not been described. Here, we combine in situ spectroscopy and rate measurements to identify significant steps in the reaction and the reactive species present on the catalysts and examine differences between the kinetics of this reaction on isolated W atoms grafted to alumina and on those exposed on crystalline WO3 nanoparticles. Raman spectroscopy shows that W–peroxo complexes (W–(η2-O2)) formed from H2O2 react with alkenes in a kinetically relevant step to produce epoxides, which undergo hydrolysis at protic surface sites. Subsequently, the CH3CN solvent deprotonates diols to form alpha-hydroxy ketones that react to form aldehydes and water following nucleophilic attack of H2O2. Turnover rates for oxidative cleavage, determined by in situ site titrations, on WOx–Al2O3 are 75% greater than those on WO3 at standard conditions. These differences reflect the activation enthalpies (ΔH‡) for the oxidative cleavage of 4-octene, which are much lower on the isolated WOx sites (36 ± 3 and 60 ± 6 kJ·mol⁻¹ for WOx–Al2O3 and WO3, respectively) and correlate strongly with the difference between the enthalpies of adsorption for epoxyoctane (ΔHads,epox), which resembles the transition state for epoxidation. The WOx–Al2O3 catalysts mediate oxidative cleavage of oleic acid with H2O2 following a mechanism comparable to that for the oxidative cleavage of 4-octene. The WO3 materials, however, form only the epoxide and do not cleave the C=C bond or produce aldehydes and acids.
These differences reflect the distinct site requirements for these reaction pathways and indicate that acid sites required for diol formation are strongly inhibited by oleic acids and epoxides on WO3, whereas the Al2O3 support provides sites competent for this reaction and increases the yield of the oxidative cleavage products.
keywords:
Catalysis;Conversion
published:
2025-11-03
Kim, Min Soo; Choi, Dasol; Ha, Jihyo; Choi, Kyuhyeok; Yu, Jae-Hyuk; Dumesic, James; Huber, George
(2025)
This study shows a new route to produce potassium sorbate (KS) from triacetic acid lactone (TAL), which is a chemical platform that can be biologically synthesized from natural sources. Sorbic acid and its potassium salt (KS) are widely used as preservatives in foods and pharmaceuticals. Three steps are used to produce KS from TAL: 1) hydrogenation of TAL into 4-hydroxy-6-methyltetrahydro-2-pyrone (HMP), 2) dehydration of HMP to parasorbic acid (PSA), and 3) ring-opening and hydrolysis of PSA to KS. TAL can be fully hydrogenated over Ni/SiO2 to give near quantitative yields of HMP. A three-step reaction kinetics model was developed for dehydration of HMP into PSA. This model was used to show that the highest PSA yield occurs at low temperatures. An experimental PSA yield of 84.2% with respect to TAL was obtained, which agreed with the prediction of the reaction kinetics model. KOH was used as a coreactant for the ring-opening hydrolysis of PSA to produce >99.9% yield of KS from PSA. Tetrahydrofuran (THF) was used to purify the TAL-derived KS (TAL-KS). The TAL-KS had a KS purity of 95.5%. The overall yield of TAL-KS with respect to TAL was calculated to be 77.3%. TAL-KS produced in this study had antimicrobial activities similar to those of commercial KS.
keywords:
Conversion;Catalysis;Modeling
published:
2025-11-12
Santiago-Martinez, Leoncio; Li, Mengting; Munoz-Briones, Paola; Vergara Zambrano, Javiera; Avraamidou, Styliani; Dumesic, James; Huber, George
(2025)
Herein we report the production of high-pressure (19.3 bar), carbon-negative hydrogen (H2) from glycerol with a purity of 98.2 mol% H2, 1.8 mol% light hydrocarbons (mainly methane), and 400 ppm of CO. Aqueous phase reforming (APR) of 10 wt% glycerol solution was studied with a series of NiPt bimetallic catalysts supported on alumina. The Ni8Pt1-450 catalyst had the highest hydrogen selectivity (95.6%) and the lowest alkanes selectivity (3.7%) of the tested catalysts. The hydrogen selectivity decreased in the order of Ni8Pt1-450 > Ni8Pt1-260 > Ni1Pt1-260 > Pt-260. The CO2 was sequestered with CaO adsorbent which formed CaCO3. We measured the adsorption capacity of the CaO adsorbent at different temperatures. Life cycle analysis showed that the APR of glycerol coupled with CO2 capture has net negative CO2 equivalent greenhouse gas emissions. The CO2 emissions are −9.9 kg CO2 eq./kg H2 and −50.1 kg CO2 eq./kg H2 when grid electricity and renewable electricity are used, respectively, and the CO2 is allocated respectively to the mass of products produced. The cost of this H2 (denoted as "green-emerald") was estimated to be 2.4 USD per kg H2 when grid electricity is used and 2.7 USD per kg H2 when using renewable electricity. The cost of glycerol has the highest contribution of 1.71 USD per kg H2. Participation in the carbon credit markets can further decrease the price of the produced H2.
keywords:
Conversion;Catalysis
published:
2025-09-08
Si, Luyang; Salami, Malik Oyewale; Schneider, Jodi
(2025)
This work evaluates the consistency and reliability of the title flag, i.e., retraction labeling that appears in the title of retracted publications, using 925 sampled publications indexed as retracted in Crossref only (Lee & Schneider, 2023) that are also indexed in three other sources (Retraction Watch, Scopus, and Web of Science) as of April 2023. We presume the retraction status of an item based on its title flag. For example, the flag "removal notice" indicates a retraction notice, and "retracted article" indicates a retracted paper. We compared the item's likely retraction status from the flag with the item's actual retraction status from the publisher's website.
keywords:
Crossref; Data Quality; Title flag; Retraction flag; Retraction flag assessment; Retraction labeling; Retraction indexing; Retracted papers; Retraction notices; Retraction status; RISRS