Illinois Data Bank Dataset Search Results
published:
2024-07-22
Ferguson, John; Schumuker, Peter; Dmitrieva, Anna; Quach, Truyen; Zhang, Tieling; Ge, Zhengxiang; Nersesian, Natalya; Sato, Shirley; Clemente, Thomas; Leakey, Andrew
(2024)
Raw data for the results presented in Ferguson et al 2024.
keywords:
Sorghum bicolor; stomata; stomatal conductance; C4 photosynthesis; water-use efficiency; drought
published:
2023-07-14
Schneider, Jodi; Das, Susmita; Léveillé, Jacqueline ; Proescholdt, Randi
(2023)
Data for Post-retraction citation: A review of scholarly research on the spread of retracted science
Schneider, Jodi; Das, Susmita; Léveillé, Jacqueline; Proescholdt, Randi
Contact: Jodi Schneider jodi@illinois.edu & jschneider@pobox.com
**********
OVERVIEW
**********
This dataset provides further analysis for an ongoing literature review about post-retraction citation.
This ongoing work extends a poster presented as:
Jodi Schneider, Jacqueline Léveillé, Randi Proescholdt, Susmita Das, and The RISRS Team. Characterization of Publications on Post-Retraction Citation of Retracted Articles. Presented at the Ninth International Congress on Peer Review and Scientific Publication, September 8-10, 2022 hybrid in Chicago. https://hdl.handle.net/2142/114477 (now also in https://peerreviewcongress.org/abstract/characterization-of-publications-on-post-retraction-citation-of-retracted-articles/ )
Items as of the poster version are listed in the bibliography 92-PRC-items.pdf.
Note that following the poster, we made several changes to the dataset (see changes-since-PRC-poster.txt). For both the poster dataset and the current dataset, 5 items have 2 categories (see 5-items-have-2-categories.txt).
Articles were selected from the Empirical Retraction Lit bibliography (https://infoqualitylab.org/projects/risrs2020/bibliography/ and https://doi.org/10.5281/zenodo.5498474 ). The current dataset includes 92 items; 91 items were selected from the 386 total items in Empirical Retraction Lit bibliography version v.2.15.0 (July 2021); 1 item was added because it is the final form publication of a grouping of 2 items from the bibliography: Yang (2022) Do retraction practices work effectively? Evidence from citations of psychological retracted articles http://doi.org/10.1177/01655515221097623
Items were classified into 7 topics; 2 of the 7 topics have been analyzed to date.
**********************
OVERVIEW OF ANALYSIS
**********************
DATA ANALYZED:
2 of the 7 topics have been analyzed to date:
field-based case studies (n = 20)
author-focused case studies of 1 or several authors with many retracted publications (n = 15)
FUTURE DATA TO BE ANALYZED, NOT YET COVERED:
5 of the 7 topics have not yet been analyzed as of this release:
database-focused analyses (n = 33)
paper-focused case studies of 1 to 125 selected papers (n = 15)
studies of retracted publications cited in review literature (n = 8)
geographic case studies (n = 4)
studies selecting retracted publications by method (n = 2)
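As a quick consistency check on the counts above, the seven topic counts sum to more than 92 because 5 items carry 2 categories each. A minimal sketch follows; the dictionary keys are shortened labels chosen here for illustration, not names used in the data files.

```python
# Topic counts as stated in this README (labels shortened for illustration).
analyzed = {
    "field-based case studies": 20,
    "author-focused case studies": 15,
}
not_yet_analyzed = {
    "database-focused analyses": 33,
    "paper-focused case studies": 15,
    "review literature studies": 8,
    "geographic case studies": 4,
    "selection by method": 2,
}

# Each item is counted once per category it belongs to.
total_assignments = sum(analyzed.values()) + sum(not_yet_analyzed.values())

# 5 items have 2 categories, so each of them is counted twice above.
items_with_two_categories = 5
unique_items = total_assignments - items_with_two_categories

print(total_assignments, unique_items)  # 97 92
```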
**************
FILE LISTING
**************
------------------
BIBLIOGRAPHY
------------------
92-PRC-items.pdf
------------------
TEXT FILES
------------------
README.txt
5-items-have-2-categories.txt
changes-since-PRC-poster.txt
------------------
CODEBOOKS
------------------
Codebook for authors.docx
Codebook for authors.pdf
Codebook for field.docx
Codebook for field.pdf
Codebook for KEY.docx
Codebook for KEY.pdf
------------------
SPREADSHEETS
------------------
field.csv
field.xlsx
multipleauthors.csv
multipleauthors.xlsx
multipleauthors-not-named.csv
multipleauthors-not-named.xlsx
singleauthors.csv
singleauthors.xlsx
***************************
DESCRIPTION OF FILE TYPES
***************************
BIBLIOGRAPHY (92-PRC-items.pdf) presents the items, as of the poster version. This has minor differences from the current data set. Consult changes-since-PRC-poster.txt for details on the differences.
TEXT FILES provide notes for additional context. These files end in .txt.
CODEBOOKS describe the data we collected. The same data is provided in both Word (.docx) and PDF format.
There is one general codebook that is referred to in the other codebooks: Codebook for KEY lists fields assigned (e.g., for a journal or conference). Note that this is distinct from the overall analysis in the Empirical Retraction Lit bibliography of fields analyzed; for that analysis see Proescholdt, Randi (2021): RISRS Retraction Review - Field Variation Data. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2070560_V1
Other codebooks document specific information we entered on each column of a spreadsheet.
SPREADSHEETS present the data collected. The same data is provided in both Excel (.xlsx) and CSV format.
Each data row describes a publication or item (e.g., thesis, poster, preprint).
For column header explanations, see the associated codebook.
*****************************
DETAILS ON THE SPREADSHEETS
*****************************
field-based case studies
CODEBOOK: Codebook for field
--REFERS TO: Codebook for KEY
DATA SHEET: field
REFERS TO: Codebook for KEY
--NUMBER OF DATA ROWS: 20 NOTE: Each data row describes a publication/item.
--NUMBER OF PUBLICATION GROUPINGS: 17
--GROUPED PUBLICATIONS: Rubbo (2019) - 2 items, Yang (2022) - 3 items
author-focused case studies of 1 or several authors with many retracted publications
CODEBOOK: Codebook for authors
--REFERS TO: Codebook for KEY
DATA SHEET 1: singleauthors (n = 9)
--NUMBER OF DATA ROWS: 9
--NUMBER OF PUBLICATION GROUPINGS: 9
DATA SHEET 2: multipleauthors (n = 5)
--NUMBER OF DATA ROWS: 5
--NUMBER OF PUBLICATION GROUPINGS: 5
DATA SHEET 3: multipleauthors-not-named (n = 1)
--NUMBER OF DATA ROWS: 1
--NUMBER OF PUBLICATION GROUPINGS: 1
*********************************
CRediT <http://credit.niso.org>
*********************************
Susmita Das: Conceptualization, Data curation, Investigation, Methodology
Jacqueline Léveillé: Data curation, Investigation
Randi Proescholdt: Conceptualization, Data curation, Investigation, Methodology
Jodi Schneider: Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Project administration, Supervision
keywords:
retraction; citation of retracted publications; post-retraction citation; data extraction for scoping reviews; data extraction for literature reviews;
published:
2022-12-11
The data are original electron micrographs from the lab of the late Dr. Burt Endo of the USDA. These data were digitized from photographic prints and glass plate negatives at 600 DPI as 16 bit TIFF files. This fourth version added 6 new ZIP files from the Endo data collection. "Endo folder database.xlsx" is updated to reflect the addition. Information in "Readme_FileNameFormatting.docx" remains the same as in V3.
keywords:
Heterodera glycines; Meloidogyne incognita; Burt Endo; nematode
published:
2022-01-30
Bakken, George; Tillman, Francis; O'Keefe, Joy
(2022)
This dataset contains temperature measurements in four different bat box designs deployed in central Indiana, USA from May to September 2018. Hourly environmental data (temperature, solar radiation, and wind speed) are also included for days and hours sampled. Bat box temperature data were used as inputs in a free program, GNU Octave, to assess design performance with respect to suitability indices for endothermic metabolism and pup development. Scripts are included in the dataset.
keywords:
bats;thermal refuge;reproduction;conservation;bat box;microclimate
published:
2025-06-05
Guan, Yingjun; Fang, Liri
(2025)
There are two files in this dataset.
File1: AffiNorm
AffiNorm contains 1,001 rows, including one header row, randomly sampled from MapAffil 2018 Dataset ([**https://doi.org/10.13012/B2IDB-2556310_V1**](https://databank.illinois.edu/datasets/IDB-2556310)). Each row in the file corresponds to a particular author on a particular PubMed record, and contains the following 26 columns, comma-delimited. All columns are ASCII, except city which contains Latin-1.
COLUMN DESCRIPTION
1. PMID: the PubMed identifier. int.
2. ORDER: the position of the author. int.
3. YEAR - The year of publication. int(4), eg: 1975.
4. affiliation - affiliation string of the author. eg: Department of Pathology, University of Chicago, Illinois 60637.
5. annotation_type: the number of institutions annotated, denoted by S, M, O, or Z, where "S" (single) indicates 1 institution was annotated; "M" (Multiple) indicates more than one institution was annotated; "O" (Out of Vocabulary or None) indicates no institution was annotated, but an institution was apparently mentioned; "Z" indicates no institution was mentioned.
6. Institution: the standard name(s) of the annotated institution(s), according to ROR. if "S" (single institution), it is saved as a string, eg: University of Chicago; if "M", it is saved as a string that looks like a python list, eg: ['Public Health Laboratory Service'; 'Centre for Applied Microbiology and Research']; if "O" or "Z", then blank.
7. inst_type: the type of institution, according to ROR. the potential values are: education, funder, healthcare, company, archive, nonprofit, government, facility, other. An institution may have more than one type, eg: ['Education', 'Funder']
8. type_edu: TRUE if the inst_type contains "Education"; FALSE otherwise.
9. RORid: ROR identifier(s), eg: https://ror.org/05hs6h993. when multiple, the order corresponds to institution (column 6)
10. RORid_label: the standard name(s) of the annotated institution(s) according to ROR. Same as Institution (column 6).
11. GRIDid: GRID identifier(s). eg: grid.170205.1
12. GRIDid_label: the standard name(s) of the annotated institution(s) according to GRID. eg: University of Chicago.
13. WikiDataid: WikiData identifier(s). eg: Q131252
14. WikiDataid_label: the standard name(s) of the annotated institution(s) according to WikiData. eg: University of Chicago
15. synonyms: a comma separated list of variant names from InsVar (file 2) . format of string. eg: University of Chicago, Chicago University, U of C, UChicago, uchicago.edu, U Chicago, ...
16. MapAffil-grid: GRID from the MapAffil 2018 Dataset.
17. MapAffil-grid_label: The standard name of institution from MapAffil 2018 Dataset.
18. judge_mapA: TRUE if GRIDid (column 11) contains MapAffil-grid (column 16); FALSE otherwise.
19. MapAffiltemporal-grid: GRID from the temporal version of MapAffil, http://abel.ischool.illinois.edu/data/MapAffilTempo2018.tsv.gz
20. MapAffiltemporal-grid_label: The standard name of institution from MapAffilTemporal 2018 Dataset.
21. judge_mapT: TRUE if GRIDid (column 11) contains MapAffiltemporal-grid (column 19); FALSE otherwise.
22. RORapi_query_id: ROR from ROR api tool (query endpoint)
23. RORapi_query_id_label: The standard name of institution from ROR api tool (query endpoint). format in string.
24. judge_rorapi_affiliation: TRUE if RORid (column 9) contains RORapi_query_id (column 22); FALSE otherwise.
25. rorapi_affiliation_id: ROR from ROR api tool (affiliation endpoint).
26. judge_rorapi_affiliation: TRUE if RORid (column 9) contains RORapi_affiliation (column 25); FALSE otherwise.
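The way the Institution column (column 6) depends on annotation_type (column 5) can be sketched as follows. This is an illustration only: the function name is hypothetical, and the separator inside the list-like string is not assumed, so the sketch extracts every single-quoted name instead of splitting on a delimiter.

```python
import re


def parse_institutions(annotation_type, institution):
    """Return a list of institution names for one AffiNorm row."""
    if annotation_type == "S":
        # Single institution: stored as a plain string.
        return [institution]
    if annotation_type == "M":
        # Multiple institutions: stored as a string that looks like a
        # Python list. Pull out each single-quoted name, whatever the
        # separator is. Names containing an apostrophe would need a
        # more careful parser.
        return re.findall(r"'([^']*)'", institution)
    # "O" or "Z": the column is blank.
    return []


single = parse_institutions("S", "University of Chicago")
multi = parse_institutions(
    "M",
    "['Public Health Laboratory Service'; "
    "'Centre for Applied Microbiology and Research']",
)
blank = parse_institutions("Z", "")
```

The same helper could be reused for other columns that hold one or several values, if they turn out to follow the same string layout.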
File 2: insVar.json
InsVar is a supplementary dataset for AffiNorm, which includes the institution ID and its redirected aliases from wikidata. The institution ID list is from GRID, the redirected aliases are from wiki api, for example: https://en.wikipedia.org/wiki/Special:WhatLinksHere?target=University+of+Illinois+Urbana-Champaign&namespace=&hidetrans=1&hidelinks=1&limit=100
In InsVar, the data is saved in a Python dictionary format. The key is the GRID identifier, for example: "grid.1001.0" (Australian National University), and the value is a list of redirected alias strings.
{"grid.1001.0": ["ANU", "ANU College", "ANU College of Arts and Social Sciences", "ANU College of Asia and the Pacific", "ANU Union", "ANUSA", "Asia Pacific Week", "Australia National University", "Australian Forestry School", "the Australian National University", ...], "grid.1002.3": ...}
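A minimal sketch of how this structure could be used to look up a GRID identifier from a variant name. The sample string below holds only a few aliases copied from the excerpt above; with the real file, the dictionary would presumably be read with json.load from insVar.json. The lower-casing of aliases is a choice made here, not something the dataset prescribes.

```python
import json

# A tiny stand-in for insVar.json, using aliases from the excerpt above.
sample = (
    '{"grid.1001.0": ["ANU", "ANU College", '
    '"Australia National University", "the Australian National University"]}'
)
ins_var = json.loads(sample)

# Invert the mapping: alias (lower-cased) -> set of GRID identifiers.
alias_to_grid = {}
for grid_id, aliases in ins_var.items():
    for alias in aliases:
        alias_to_grid.setdefault(alias.lower(), set()).add(grid_id)

print(alias_to_grid["anu"])
```

A set is used for the values because the same alias could, in principle, redirect to more than one institution.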
keywords:
PubMed; MEDLINE; Digital Libraries; Bibliographic Databases; Institution Names; Author Affiliations; Institution Name Ambiguity; Authority files
published:
2024-07-31
LaBonte, Nicholas R.; Zerpa-Catanho, Dessiree P.; Liu, Siyao; Xiao, Liang; Dong, Hongxu; Clark, Lindsay V.; Sacks, Erik J.
(2024)
This dataset contains all data and supplementary materials from "Improving precision and accuracy of genetic mapping with genotyping-by-sequencing data in outcrossing species". An Excel file with a list of all QTLs and linkage group length (in cM) obtained with two different SNP-calling methods (Tassel-Uneak and Tassel-GBS), genetic map-construction method (linkage-only and reference order-corrected) and depth filters (12x, 20x, 30x and 40x) for genetic mapping of 18 biomass yield traits in a biparental Miscanthus sinensis population using RAD-Seq SNPs is provided as "Supplementary file 1". A Perl script with the code for filtering VCF and HapMap-formatted data files is provided as “Supplementary file 2”. Phenotype data used for QTL mapping is provided as “Supplementary File 3”. A Perl script with the code for the simulation study is provided as “Supplementary file 4”.
keywords:
HapMapParser; GenotypingSimulator
published:
2025-04-25
Tassitano, Rafael; Chakraborty, Shreyonti
(2025)
This is an Excel file containing data about the physical environments of four Brazilian schools and the average minutes/day of physical activity and sedentary behavior exhibited by schoolchildren during school hours.
The following key describes the basic variables:
Subject IDs and Characteristics
Subject_ID: ID of Subject
total_days: Total number of days subject participated in experiment
Gender : Gender of subject
Age: Age of subject
School IDs and Characteristics
ID_School = ID of School
school1 = 1 if ID_School = 1, else = 0
school2 = 1 if ID_School = 2, else = 0
school3 = 1 if ID_School = 3, else = 0
school4 = 1 if ID_School = 4, else = 0
TotalSiteArea: Total Site Area on School Campus
PatioArea: Area of Patio(s)
CourtyardArea: Area of Courtyard(s)
TotalOpenArea: Total Area of Open Spaces on Campus
Class: Number of Sections in the School
Population: Total Number of Students Enrolled in the School
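The school1 to school4 indicator variables in the key above follow directly from ID_School. A minimal sketch of that rule, with a hypothetical helper name:

```python
def school_dummies(id_school, n_schools=4):
    """Build the school1..schoolN indicators from ID_School.

    schoolK is 1 when ID_School equals K and 0 otherwise.
    """
    return {f"school{k}": int(id_school == k) for k in range(1, n_schools + 1)}


print(school_dummies(3))
# {'school1': 0, 'school2': 0, 'school3': 1, 'school4': 0}
```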
keywords:
school environment; physical activity
published:
2025-09-10
Lu, Yi; Mirts, Evan; Petrik, Igor D.; Hosseinzadeh, Parisa; Nilges, Mark J.
(2025)
Enzymatic reduction of oxyanions such as sulfite (SO₃²⁻) requires the delivery of multiple electrons and protons, a feat accomplished by cofactors tailored for catalysis and electron transport. Replicating this strategy in protein scaffolds may expand the range of enzymes that can be designed de novo. Mirts et al. selected a scaffold protein containing a natural heme cofactor and then engineered a cavity suitable for binding a second cofactor, an iron-sulfur cluster (see the Perspective by Lancaster). The resulting designed enzyme was optimized through rational mutation into a catalyst with spectral characteristics and activity similar to that of natural sulfite reductases.
keywords:
Conversion;Catalysis
published:
2025-10-21
Jia, Yuyao; Maitra, Shraddha; Singh, Vijay
(2025)
Bioenergy crops have potential for being a sustainable and renewable feedstock for biofuels and various value-added bioproducts. The study utilizes recently developed transgenic sugarcane (“oilcane”) bagasse for chemical-free coproduction of high-value bioproducts, i.e., furfurals, HMF, acetic acid, cellulosic sugars, and vegetative lipids. Hydrothermal pretreatment was optimized at 210 °C for 5 min to coproduce 6.91%, 2.67%, 5.07%, 2.42% and 37.82% (w/w) furfurals, HMF, acetic acid, vegetative lipids, and cellulosic sugars, respectively from lignocellulosic biomass. Additionally, nanofiltration system in-series was successfully established to recover sugars, furfurals, HMF, and acetic acid from the pretreatment liquor. 1st nanofiltration with Duracid NF membrane rejected ∼99% sugars. Concentrated sugars with significantly reduced inhibitory products were obtained in retentate for fermentation. 2nd nanofiltration with NF90 membrane used permeate of 1st nanofiltration as feed and rejected ∼ 86% furfurals. The work demonstrates the feasibility of coproducing and recovering multiple biochemicals from lignocellulosic biomass.
keywords:
Conversion;Biomass Analytics;Hydrolysate;Metabolomics
published:
2024-08-29
Li, Shuai; Montes, Christopher; Aspray, Elise; Ainsworth, Elizabeth
(2024)
Over the past 15 years, soybean seed yield response to season-long elevated O3 concentrations [O3] and to year-to-year weather conditions was studied using free-air O3 concentration enrichment (O3-FACE) in the field at the SoyFACE facility in Central Illinois. Elevated [O3] significantly reduced seed yield across cultivars and years. However, our results quantitatively demonstrate that weather conditions, including soil water availability and air temperature, did not alter yield sensitivity to elevated [O3] in soybean.
keywords:
drought, elevated O3, heat, O3-FACE, soybean, yield
published:
2021-05-14
This is the complete dataset for the "Anomalous density fluctuations in a strange metal" Proceedings of the National Academy of Sciences publication (https://doi.org/10.1073/pnas.1721495115). This is an integration of the Zenodo dataset which includes raw M-EELS data.
<b>METHODOLOGICAL INFORMATION</b>
1. Description of methods used for collection/generation of data: Data have been collected with a M-EELS instrument and according to the data acquisition protocol described in the original PNAS publication and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026)
2. Methods for processing the data: Raw data were collected with a channeltron-based M-EELS apparatus described in the reference PNAS publication and analyzed according to the procedure outlined both in the PNAS paper and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026). The raw M-EELS spectra at each momentum have been subject to minor data processing involving:
(a) averaging of different acquisitions at the same conditions,
(b) energy binning,
(c) division by an effective Coulomb matrix element (which yields a structure factor S(q,\omega)),
(d) antisymmetrization (which yields the imaginary chi)
All these procedures are described in the PNAS paper.
3. Instrument- or software-specific information needed to interpret the data: These data are simple .txt or .dat files which can be read with any standard data analysis software, notably Python notebooks, MatLab, Origin, IgorPro, and others. We do not include scripts in order to provide maximum flexibility.
4. Relationship between files, if important: We divided in different folders raw data, structure factors and imaginary chi.
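Step (d) of the processing described above, antisymmetrization, can be illustrated with a short sketch. It computes only the odd-in-energy difference S(q,ω) − S(q,−ω) at fixed momentum; the overall sign and prefactor that relate this difference to the imaginary susceptibility depend on the convention used in the PNAS paper and are deliberately left out. The function name and the assumption that every positive energy has an exactly matching negative energy on the grid are choices made for this example.

```python
def antisymmetrize(energies, s_values):
    """Return (E, S(E) - S(-E)) pairs for each E > 0 with a partner at -E.

    energies: energy-loss values, assumed to include matching +E / -E points.
    s_values: structure factor sampled at those energies (same order).
    """
    lookup = dict(zip(energies, s_values))
    result = []
    for energy in energies:
        if energy > 0 and -energy in lookup:
            result.append((energy, lookup[energy] - lookup[-energy]))
    return result


# Toy spectrum on a symmetric grid (values are made up for illustration).
toy_energies = [-2.0, -1.0, 0.0, 1.0, 2.0]
toy_s = [1.0, 2.0, 5.0, 4.0, 7.0]
odd_part = antisymmetrize(toy_energies, toy_s)
```

Real spectra on a grid that is not symmetric about zero would need interpolation before taking the difference.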
<b>DATA-SPECIFIC INFORMATION</b>
There are 8 folders within the Data_public_deposition_v1.zip. Each folder contains data needed to create the corresponding figure in the publication.
<b>1. Fig1:</b> This folder contains 21 DAT files needed to plot the theory data in panels C and D, following this naming convention:
[chiA]or[chiB]or[Pi]_q_number.dat
where chiA is the imaginary RPA charge susceptibility with a Coulomb interaction of electronically weakly coupled layers
chiB is the imaginary RPA charge susceptibility with the usual 4\pi e^2/q^2 Coulomb interaction.
Pi is the imaginary Lindhard polarizability.
q is momentum in reciprocal lattice units
Number is the numerical momentum value in reciprocal lattice units
<b>2. Fig2:</b> Files needed to plot Fig. 2 of the PNAS paper. Contains 3 folders as listed below. The files in this folder are named following this convention: Bi2212_295K_(1,-1)_50eV_161107_q_number_2.16_avg.dat,
295K is the sample temperature
(1,-1) is the momentum direction in reciprocal lattice units
50 eV is the incident e beam energy
161107 is the start date of the experiment in yymmdd format
Q is the momentum
Number is the momentum in reciprocal lattice units
2.16 is the energy range covered by the data in eV
Avg identifies averaged data
ImChi: is the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
<b>3. Fig3:</b> Files needed to plot Fig. 3 of the PNAS paper. The OP/OD prefix identifies optimally doped or overdoped sample data, respectively.
ImChi: is the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
<b>4. Fig4:</b> Files needed to plot Fig. 4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted according to the fit procedure described in the manuscript and at all momenta.
ImChi: is the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
<b>5. FigS1:</b> Files needed to plot Fig. S1 of the PNAS paper. There are 5 files in this folder. DAT files are M-EELS data following the prior naming convention, while the two .txt files are digitized data from N. Nücker, U. Eckern, J. Fink, and P. Müller, Long-Wavelength Collective Excitations of Charge Carriers in High-Tc Superconductors, Phys. Rev. B 44, 7155(R) (1991), and K. H. G. Schulte, The interplay of Spectroscopy and Correlated Materials, Ph.D. thesis, University of Groningen (2002).
<b>6. FigS2:</b> Files needed to plot Fig. S2 of the PNAS paper.
ImChi: is the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
<b>7. FigS3:</b> Files needed to plot Fig. S3 of the PNAS paper. There are 2 files in this folder:
20K_phi_0_q_0.dat: is a M-EELS raw intensity at zero momentum transfer on Bi2212 at 20 K
295K_phi_0_q_0.dat: is a M-EELS raw intensity at zero momentum transfer on Bi2212 at 295 K
<b>8. FigS4:</b> Files needed to plot Fig. S4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted according to the fit procedure described in the manuscript and at all momenta.
ImChi: is the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: Structure factors derived from the M-EELS spectra
keywords:
Momentum resolved electron energy loss spectroscopy (M-EELS); cuprates; plasmons; strange metal
published:
2025-10-14
Jagtap, Sujit Sadashiv; Deewan, Anshu; Liu, Jing-Jing; Walukiewicz, Hanna E.; Yun, Eun Ju; Jin, Yong-Su; Rao, Christopher V.
(2025)
Rhodosporidium toruloides is an oleaginous yeast capable of producing a variety of biofuels and bioproducts from diverse carbon sources. Despite numerous studies showing its promise as a platform microorganism, little is known about its metabolism and physiology. In this work, we investigated the central carbon metabolism in R. toruloides IFO0880 using transcriptomics and metabolomics during growth on glucose, xylose, acetate, or soybean oil. These substrates were chosen because they can be derived from plants. Significant changes in gene expression and metabolite concentrations were observed during growth on these four substrates. We mapped these changes onto the governing metabolic pathways to better understand how R. toruloides reprograms its metabolism to enable growth on these substrates. One notable finding concerns xylose metabolism, where poor expression of xylulokinase induces a bypass leading to arabitol production. Collectively, these results further our understanding of central carbon metabolism in R. toruloides during growth on different substrates. They may also help guide the metabolic engineering and development of better models of metabolism for R. toruloides.
keywords:
Conversion;Metabolomics;Transcriptomics
published:
2020-12-02
Yang, Pan; Cai, Ximing; Khanna, Madhu
(2020)
The dataset includes the survey results about farmers’ perceptions of marginal land availability and the likelihood of a land pixel being marginal based on a machine learning model trained from the survey.
Two spreadsheet files are the farmer and farm characteristics (marginal_land_survey_data_shared.xlsx), and the existing land use of marginal lands (land_use_info_sharing.xlsx).
<b>Note:</b> the blank cells in these two spreadsheets mean missing values in the survey response.
The GeoTiff file includes two bands, one the marginal land likelihood in the Midwestern states (0-1), the other the dominant reason of land marginality (0-5; 0 for farm size, 1 for growing season precipitation, 2 for root zone soil water capacity, 3 for average slope, 4 for growing season mean temperature, and 5 for growing season diurnal range of temperature). To read the data, please use GIS software such as ArcGIS or QGIS.
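Once pixel values have been read from the two bands, the coding described above could be decoded as in the sketch below. Reading the raster itself is not shown; the function and constant names are hypothetical, and only the code-to-reason mapping comes from the description.

```python
# Band 2 codes, as listed in the dataset description.
MARGINALITY_REASONS = {
    0: "farm size",
    1: "growing season precipitation",
    2: "root zone soil water capacity",
    3: "average slope",
    4: "growing season mean temperature",
    5: "growing season diurnal range of temperature",
}


def describe_pixel(likelihood, reason_code):
    """Combine the two band values of one pixel into a readable record."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("band 1 (likelihood) is expected to lie in 0-1")
    if reason_code not in MARGINALITY_REASONS:
        raise ValueError("band 2 (reason) is expected to be an integer 0-5")
    return {
        "marginal_likelihood": likelihood,
        "dominant_reason": MARGINALITY_REASONS[reason_code],
    }


example_pixel = describe_pixel(0.75, 3)
```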
keywords:
marginal land; survey
published:
2022-10-14
Zhou, Shan; Li, Jiahui; Lu, Jun; Liu, Haihua; Kim, Ji-Young; Kim, Ahyoung; Yao, Lehan; Liu, Chang; Qian, Chang; Hood, Zachary D. ; Lin, Xiaoying; Chen, Wenxiang; Gage, Thomas E. ; Arslan, Ilke; Travesset, Alex; Sun, Kai; Kotov, Nicholas A.; Chen, Qian
(2022)
This dataset is the raw data including SEM, TEM, PINEM images and FDTD simulation as well as pairwise interaction calculation results.
published:
2022-08-25
Souza-Cole, Ian; Ward, Michael; Rebecca, Mau; Jeffrey, Foster; Benson, Thomas
(2022)
Data in this publication were used to analyze the factors that influence the abundance of eastern whip-poor-wills in the Midwest and to describe the diet of this species. These data were collected in Illinois in 2019 and 2020. Procedures were approved by the Illinois Institutional Animal Care and Use Committee (IACUC), protocol no. 19006.
keywords:
eastern whip-poor-will; Antrostomus vociferus; abundance; moths; nightjars; Lepidoptera; metabarcoding
published:
2025-10-14
Jia, Yuyao; Kumar, Deepak; Winkler-Moser, Jill K.; Dien, Bruce S.; Rausch, Kent D.; Tumbleson, M.E.; Singh, Vijay
(2025)
Efforts to engineer high-productivity crops to accumulate oils in their vegetative tissue present the possibility of expanding biodiesel production. However, processing the new crops for lipid recovery and ethanol production from cell wall saccharides is challenging and expensive. In a previous study using corn germ meal as a model substrate, we reported that liquid hot water (LHW) pretreatment enriched the lipid concentration by 2.2 to 4.2 fold. This study investigated combining oil recovery with ethanol production by extracting oil following LHW and simultaneous saccharification and co-fermentation (SSCF) of the biomass. Corn germ meal was again used to model the oil-bearing energy crops. Pretreated germ meal hydrolysate or solids (160 °C and 180 °C for 10 minutes) were fermented, and lipids were extracted from both the spent fermentation whole broth and fermentation solids, which were recovered by centrifugation and convective drying. Lipid contents in spent fermentation solids increased 3.7 to 5.7 fold compared to the beginning germ meal. The highest lipid yield achieved after fermentation was 36.0 mg lipid g−1 raw biomass; the maximum relative amount of triacylglycerol (TAG) was 50.9% of extracted oil. Although the fermentation step increased the lipid concentration of the recovered solids, it did not improve the lipid yields of pretreated biomass and detrimentally affected oil compositions by increasing the relative concentrations of free fatty acids.
keywords:
Conversion;Hydrolysate;Lipidomics
published:
2025-10-03
Kang, Nam Kyu; Lee, Jaewon; Ort, Donald; Jin, Yong-Su
(2025)
L-malic acid is widely used in the food, chemical, and pharmaceutical industries. Here, we report on production of malic acid from xylose, the second most abundant sugar in lignocellulosic hydrolysates, by engineered Saccharomyces cerevisiae. To enable malic acid production in a xylose-assimilating S. cerevisiae, we overexpressed PYC1 and PYC2, coding for pyruvate carboxylases, a truncated MDH3 coding for malate dehydrogenase, and SpMAE1, coding for a Schizosaccharomyces pombe malate transporter. Additionally, both the ethanol- and glycerol-producing pathways were blocked to enhance malic acid production. The resulting strain produced malic acid from both glucose and xylose, but it produced much higher titers of malic acid from xylose than glucose. Interestingly, the engineered strain had higher malic acid yield from lower concentrations (10 g L‒1) of xylose, with no ethanol production, than from higher xylose concentrations (20 and 40 g L‒1). As such, a fed-batch culture maintaining xylose concentrations at low levels was conducted and 61.2 g L‒1 of malic acid was produced, with a productivity of 0.32 g L‒1 h‒1. These results represent successful engineering of S. cerevisiae for the production of malic acid from xylose, confirming that xylose offers the efficient production of various biofuels and chemicals by engineered S. cerevisiae.
keywords:
Conversion;Feedstock Production;Genome Engineering
published:
2025-11-03
von Haden, Adam C.; Eddy, William; Burnham, Mark B.; Brzostek, Edward; Yang, Wendy; DeLucia, Evan H.
(2025)
Root exudation is a key process for plant nutrient acquisition, but the controls on root exudation and its relationship to soil C and N processes in agroecosystems are unclear. We hypothesized that root exudation rates would be related to root morphological traits, N fertilization, and soil moisture. We also anticipated that root exudation would be correlated with bulk soil enzyme activity. Root exudation, root traits, and bulk soil extracellular enzyme activity were assessed in maize (Zea mays L.), soybean (Glycine max (L.) Merr.), biomass sorghum (Sorghum bicolor (L.) Moench), giant miscanthus (Miscanthus × giganteus), and switchgrass (Panicum virgatum L.). Measurements were taken in situ during two growing seasons with contrasting precipitation regimes, and N fertilization rate was varied in sorghum during one year. Specific root exudation (per unit root surface area) was negatively related to root diameter and was generally higher in annuals than perennials. Sorghum N fertilization did not affect root exudation rates, and soil moisture regime had no effect on annual root exudation rates within maize, sorghum, and miscanthus. Specific root exudation was negatively related to bulk soil C- and N-degrading soil enzyme activities. Intrinsic plant characteristics appeared more important than environmental variables in controlling in situ root exudation rates. The relationships between root diameter, root exudation, and soil C and N processes link root morphological traits to soil functions and demonstrate the potential tradeoffs among plant nutrient acquisition strategies in agroecosystems.
keywords:
Sustainability;Biomass Analytics;Field Data
published:
2021-10-27
de Jesús Astacio, Luis Miguel ; Prabhakara, Kaumudi Hassan; Li, Zeqian; Mickalide, Harry; Kuehn , Seppe
(2021)
Shared dataset consists of 16S sequencing data of microbial communities. Each community is composed of heterotrophic bacteria derived from one of two soil samples and the model algae Chlamydomonas reinhardtii. Each community was placed in a materially closed environment with an initial supply of carbon in the media and subjected to light-dark cycles. The closed microbial ecosystems (CES) survived via carbon cycling. Each CES was subjected to rounds of dilution, after which the community was sequenced (data provided here). The shared dataset allowed us to conclude that CES consistently self-assembled to cycle carbon (data not provided) via conserved metabolic capabilities (data not provided) despite differences in taxonomic composition (data provided).
---------------------------
Naming convention:
[soil sample = A or B][CES replicate = 1,2,3, or 4]_[round number = 1,2,3,or 4]_[reverse read = R or forward read = F]_filt.fastq
Example -- A1_r1_F_filt.fastq means soil sample A, CES replicate 1, end of round1, forward read
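A small sketch of how file names following this convention could be split into their parts. The regular expression assumes the round number is written with an "r" prefix, as in the example A1_r1_F_filt.fastq; the function name and the returned field names are chosen here for illustration.

```python
import re

# [soil sample A/B][replicate 1-4]_r[round 1-4]_[F or R]_filt.fastq
FASTQ_NAME = re.compile(r"^([AB])([1-4])_r([1-4])_([FR])_filt\.fastq$")


def parse_fastq_name(filename):
    """Return the components of a CES fastq file name, or None if no match."""
    match = FASTQ_NAME.match(filename)
    if match is None:
        return None
    soil, replicate, round_number, read = match.groups()
    return {
        "soil_sample": soil,
        "ces_replicate": int(replicate),
        "round": int(round_number),
        "read": "forward" if read == "F" else "reverse",
    }


parsed = parse_fastq_name("A1_r1_F_filt.fastq")
```

Returning None for names that do not match makes it easy to skip unrelated files when looping over a directory listing.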
keywords:
16S seq; .fastq; closed microbial ecosystems; carbon cycling
published:
2022-09-16
Zhong, Jia; Khanna, Madhu
(2022)
This dataset contains model code (including input data) to replicate the outcomes for "Assessing the Efficiency Implications of Renewable Fuel Policy Design in the United States".
The model consists of:
(1) The replication code and data for the model. To run the model, use GAMS to run the "Models.gms" file.
keywords:
Renewable Fuel Standard; Nested structure; cellulosic waiver credit; RIN
published:
2022-09-07
Jiang, Chongya; Guan, Kaiyu; Khanna, Madhu; Chen, Luoye; Peng, Jian
(2022)
The availability of economically marginal land for energy crops is identified using the Cropland Data Layer and other soil, wind, and climate data resources. All data are resolved at a 30 m spatial resolution across the continental United States.
keywords:
marginal land; biofuel production; remote sensing; land use change; Cropland Data Layer
published:
2025-11-04
Berardi, Danielle; Hartman, Melannie; Brzostek, Edward; Bernacchi, Carl; DeLucia, Evan H.; von Haden, Adam C.; Kantola, Ilsa B.; Moore, Caitlin; Yang, Wendy; Hudiburg, Tara; Parton, William J.
(2025)
Globally, soils hold approximately half of ecosystem carbon and can serve as a source or sink depending on climate, vegetation, management, and disturbance regimes. Understanding how soil carbon dynamics are influenced by these factors is essential to evaluate proposed natural climate solutions and policy regarding net ecosystem carbon balance. Soil microbes play a key role in both carbon fluxes and stabilization. However, biogeochemical models often do not specifically address microbial-explicit processes. Here, we incorporated microbial-explicit processes into the DayCent biogeochemical model to better represent large perennial grasses and mechanisms of soil carbon formation and stabilization. We also take advantage of recent model improvements to better represent perennial grass structural complexity and life-history traits. Specifically, this study focuses on: 1) a plant sub-model that represents perennial phenology and more refined plant chemistry with downstream implications for soil organic matter (SOM) cycling through litter inputs, 2) live and dead soil microbe pools that influence routing of carbon to physically protected and unprotected pools, 3) Michaelis-Menten kinetics rather than first-order kinetics in the soil decomposition calculations, and 4) feedbacks between decomposition and live microbial pools. We evaluated the performance of the plant sub-model and two SOM cycling sub-models, Michaelis-Menten (MM) and first-order (FO), using observations of net ecosystem production, ecosystem respiration, soil respiration, microbial biomass, and soil carbon from long-term bioenergy research plots in the mid-western United States. The MM sub-model represented seasonal dynamics of soil carbon fluxes better than the FO sub-model, which consistently overestimated winter soil respiration.
While both SOM sub-models were similarly calibrated to total, physically protected, and physically unprotected soil carbon measurements, the models differed in future soil carbon response to disturbance and climate, most notably in the protected pools. Adding microbial-explicit mechanisms of soil processes to ecosystem models will improve model predictions of ecosystem carbon balances but more data and research are necessary to validate disturbance and climate change responses and soil pool allocation.
keywords:
Sustainability;Field Data;Modeling;Plant-Soil Microbiome
published:
2025-10-10
Sun, Liang; Atkinson, Christine A.; Lee, Ye-Gi; Jin, Yong-Su
(2025)
β‐Carotene is a natural pigment and health‐promoting metabolite, and has been widely used in the nutraceutical, feed, and cosmetic industries. Here, we engineered a GRAS yeast Saccharomyces cerevisiae to produce β‐carotene from xylose, the second most abundant and inedible sugar component of lignocellulose biomass. Specifically, a β‐carotene biosynthetic pathway containing crtYB, crtI, and crtE from Xanthophyllomyces dendrorhous was introduced into a xylose‐fermenting S. cerevisiae. The resulting strain produced β‐carotene from xylose at a titer threefold higher than from glucose. Interestingly, overexpression of tHMG1, which has been reported as a critical genetic perturbation to enhance metabolic fluxes in the mevalonate pathway and β‐carotene production in yeast when glucose is used, did not further improve the production of β‐carotene from xylose. Through fermentation profiling, metabolite analysis, and transcriptional studies, we found the advantages of using xylose as a carbon source, instead of glucose, for β‐carotene production to be a more respiratory feature of xylose consumption, a larger cytosolic acetyl‐CoA pool, and an upregulated expression level of rate‐limiting genes in the β‐carotene‐producing pathway, including ACS1 and HMG1. As a result, 772.8 mg/L of β‐carotene was obtained in a fed‐batch bioreactor culture with xylose feeding. Considering the inevitable large‐scale production of xylose when a cellulosic biomass‐based bioeconomy is implemented, our results suggest xylose utilization is a promising strategy for overproduction of carotenoids and other isoprenoids in engineered S. cerevisiae.
keywords:
Conversion;Genome Engineering
published:
2025-11-10
Raj, Tirath; Dien, Bruce; Singh, Vijay
(2025)
Sugarcane is being enhanced as a bioenergy crop by engineering it to accumulate and store lipids along with polymeric sugars in vegetative tissues. However, there is no existing process for this new crop that recovers both lipids and cellulosic sugars from the oilcane bagasse. Therefore, a comprehensive investigation of two pretreatment methods, natural deep eutectic solvents (NADES) and chemical-free hydrothermal pretreatment (HT), was conducted to judge their suitability for recovering fermentable sugars, lipids, and lignin from bagasse. Two NADES, i.e., choline chloride: lactic acid (ChCl:LA) and betaine: lactic acid (BT:LA), were prepared using a 1:2 molar ratio and were evaluated for pretreatment of oilcane bagasse at 10, 20, and 50 % (w/w) solids, followed by enzymatic hydrolysis at 10 % (w/w) solids. Notably, ChCl:LA NADES treatment at 10 % (w/w) solids at 140 °C for 2 h solubilized 78.8 % of lignin and 80.4 % of hemicellulose and allowed 82.7 % enzymatic conversion of glucans to glucose. In contrast, HT pretreatment removed approximately 87.6 % of the hemicellulose and provided an enzymatic glucose yield of 69.7 %. Furthermore, ChCl:LA operated at 50 % solids loading enriched the lipids 2.6-fold (9.2 wt%) in recovered solids compared to HT (6.4 %) and BT:LA (5.1 %) pretreatment processes. NMR-HSQC and GPC analysis showed that ChCl:LA also cleaved the most lignin β–O–4 linkages and yielded lignin of lower molecular weight than HT. This study demonstrates that NADES pretreatment is an effective green processing method for recovering lipids, sugars, and lignin from bioenergy crops at high solid loading (50 % w/w) within the context of an integrated biorefinery.
keywords:
Conversion;Hydrolysate;Lipidomics
published:
2024-12-12
Varela, Sebastian; Leakey, Andrew
(2024)
This dataset supports the implementation described in the manuscript "Breaking the Barrier of Human-Annotated Training Data for Machine-Learning-Aided Biological Research Using Aerial Imagery." It comprises UAV aerial imagery used to execute the code available at https://github.com/pixelvar79/GAN-Flowering-Detection-paper. For detailed information on dataset usage and instructions for implementing the code to reproduce the study, please refer to the GitHub repository.
keywords:
Plant phenotyping; generative and adversarial learning; phenotyping; UAV; UAS; drone