Illinois Data Bank Dataset Search Results
Results
published:
2023-08-04
Zinnen, Jack; Matthews, Jeffrey W.; Zaya, David N.
(2023)
Data are provided that are relevant to the rare plant Phlox pilosa ssp. sangamonensis, or Sangamon phlox, and other members of the genus that occur in its native range. Sangamon phlox is a state-endangered subspecies that is only known to occur in two Illinois counties. Data provided come from all known Sangamon phlox populations, which we estimate as 10 separate populations. Data include genetic data from DNA microsatellite loci (allele sizes and basic summaries), flowering population size estimates, rates of fruit set, and rates of seed set. Additionally, genetic data (from microsatellites) are provided for Phlox divaricata ssp. laphamii (three populations), Phlox pilosa ssp. pilosa (two populations), and Phlox pilosa ssp. fulgida (two populations).
keywords:
Phlox; conservation genetics; microsatellites; endemism; rare plants
published:
2025-09-11
Zhang, Shuyan; Jagtap, Sujit; Deewan, Anshu; Rao, Christopher V.
(2025)
Yarrowia lipolytica has been used to produce both citric acid and lipid-based bioproducts at high titers. In this study, we found that pH differentially affects citric acid and lipid production in Y. lipolytica W29, with citric acid production enhanced at more neutral pH values and lipid production enhanced at more acidic pH values. To determine the mechanism governing this pH-dependent switch between citric acid and lipid production, we profiled gene expression at different pH values and found that the relative expression of multiple transporters is increased at neutral pH. These results suggest that this pH-dependent switch is mediated at the level of citric acid transport rather than changes in the expression of the enzymes involved in citric acid and lipid metabolism. In further support of this mechanism, thermodynamic calculations suggest that citric acid secretion is more energetically favorable at neutral pH values, assuming the fully protonated acid is the substrate for secretion. Collectively, these results provide new insights regarding citric acid and lipid production in Y. lipolytica and may offer new strategies for metabolic engineering and process design.
keywords:
Conversion;RNA Sequencing;Transcriptomics
published:
2025-10-15
Blind-Doskocil, Leanne; Trapp, Robert J.; Nesbitt, Stephen W.
(2025)
This is a collection of 31 quasi-linear convective system (QLCS) mesovortices (MVs) that were first manually identified and analyzed using the lowest elevation scan of the nearest relevant Weather Surveillance Radar–1988 Doppler (WSR-88D) during the two years (springs of 2022 and 2023) of the Propagation, Evolution, and Rotation in Linear Storms (PERiLS) field campaign. This analysis was completed using the Gibson Ridge radar-viewing software (GR2Analyst). Throughout the two years of PERiLS, a total of nine intensive observing periods (IOPs) occurred (see https://catalog.eol.ucar.edu/perils_2022/missions and https://catalog.eol.ucar.edu/perils_2023/missions for exact IOP dates/times). However, only six of these IOPs (specifically, IOPs 2, 3, and 4 from both years) are included in this dataset. The inclusion criteria were based on the presence of strictly QLCS MVs that from a cursory analysis were within the C-band On Wheels (COW) domain, one of the research radars deployed in the field for the PERiLS project. The 31 QLCS MVs identified using WSR-88D data were also examined using data from the COW radar (using Solo3 software). The lowest elevation angle was not always useable in the COW data, and sometimes the second lowest elevation angle was used. Further details on how MVs were identified are provided below, and a very detailed methodology is published in Blind-Doskocil et al. (2025).
Each MV had to be produced by a QLCS, defined as a continuous area of 35 dBZ radar reflectivity over at least 100 km when viewed from the lowest elevation scan. The MVs analyzed also had to pass through/near the COW’s domain at some point during their lifetimes to allow for additional analysis using the COW data. Tornadic (TOR), wind-damaging (WD), and non-damaging (ND) MVs were analyzed over their entire lifetime and subsequently during the pretornadic, predamaging (wind damage), and prewarning phase (classified altogether as the prephase) of each MV. The prephase MVs were classified based on the first damage report or lack thereof associated with them. ND MVs were ones that usually had a tornado warning placed on them (all but one case) but did not produce any damage and persisted for five or more radar scans; this was done to target the strongest MVs that forecasters thought could be tornadic.
The QLCS MVs were identified using objective criteria, which included the existence of a circulation with a maximum differential velocity (dV; i.e., the difference between the maximum outbound and minimum inbound velocities at a constant range) of at least 20 kt over a distance ≤ 7 km. The following radar-based characteristics were catalogued for each QLCS MV at the lowest elevation angle of the nearest WSR-88D: latitude and longitude locations of the MV, the genesis to decay time of the MV, the maximum dV across the MV, the maximum rotational velocity (Vrot; i.e., dV divided by two), diameter of the MV, the range from the radar of the MV center, and the height above radar level of the MV center.
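The objective criteria above can be sketched in Python. This is a minimal illustration, not the authors' workflow (the analysis was done manually in GR2Analyst and Solo3). The function names are hypothetical, and the sketch assumes the common radar sign convention in which inbound velocities are negative, so the "minimum inbound" velocity is the most negative value.

```python
# Thresholds taken from the dataset description.
MIN_DV_KT = 20.0          # minimum differential velocity, in knots
MAX_DISTANCE_KM = 7.0     # maximum distance between the velocity extrema, in km


def differential_velocity(max_outbound_kt, min_inbound_kt):
    """dV: maximum outbound minus minimum inbound velocity at constant range.

    Assumes inbound velocities are negative (an assumption of this sketch).
    """
    return max_outbound_kt - min_inbound_kt


def rotational_velocity(dv_kt):
    """Vrot is defined in the description as dV divided by two."""
    return dv_kt / 2.0


def meets_mv_criteria(dv_kt, distance_km):
    """True if the circulation has dV >= 20 kt over a distance <= 7 km."""
    return dv_kt >= MIN_DV_KT and distance_km <= MAX_DISTANCE_KM


# Hypothetical scan: 25 kt outbound, 15 kt inbound, extrema 5 km apart.
dv = differential_velocity(25.0, -15.0)
print(dv, rotational_velocity(dv), meets_mv_criteria(dv, 5.0))
```

The velocity values in the usage lines are invented for illustration and do not come from the workbooks.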
In the Excel workbook titled “nexrad_analyzed_mvs_perils_illinois_data_bank”, there are a total of 36 sheets. 31 of the 36 sheets are for each MV that was examined. The 31 MV sheets that were used to calculate MV statistics are labeled following the convention 'mv#_iop#_qlcs'. ‘mv#’ is the unique number that was assigned to each MV for clear identification, 'iop#' is the IOP in which the MV occurred, 'qlcs' denotes that the MV was produced by a QLCS, and the 2023 IOPs are denoted by ‘_2023’ after ‘qlcs’ in the sheet name. In these sheets, there are notes on what was visually seen in the radar data, damage associated with each MV (using the National Centers for Environmental Information (NCEI) database), and the characteristics of the MV at each time step of its lifetime. The yellow rows in each of the sheets indicate the last row of data included in the prephase statistics. The orange boxes in the notes column indicate any reports that were in NCEI but not in GR2Analyst. There are also sheets that examine pretornadic and predamaging diameter trends; box and whisker plot statistics of the overall characteristics of the different types of MVs; and the overall characteristics of each MV, with one Excel sheet (‘combined_qlcs_mvs’) examining the characteristics of each MV over its entire lifetime and one Excel sheet (‘combined_qlcs_mvs_before_report’) examining the characteristics of each MV before it first produced damage or had a tornado warning placed on it.
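The sheet naming convention described above can be parsed with a short regular expression. This is a sketch under two assumptions: each '#' stands for one or more digits, and sheets without the '_2023' suffix belong to the 2022 campaign. The function name and the example sheet names are hypothetical.

```python
import re

# 'mv#_iop#_qlcs' with an optional '_2023' suffix for the 2023 IOPs.
SHEET_PATTERN = re.compile(r"^mv(\d+)_iop(\d+)_qlcs(_2023)?$")


def parse_mv_sheet_name(name):
    """Return MV number, IOP number, and year, or None for non-MV sheets."""
    match = SHEET_PATTERN.match(name)
    if match is None:
        return None
    return {
        "mv": int(match.group(1)),
        "iop": int(match.group(2)),
        # Assumption: sheets lacking the suffix are from spring 2022.
        "year": 2023 if match.group(3) else 2022,
    }


print(parse_mv_sheet_name("mv3_iop2_qlcs_2023"))
print(parse_mv_sheet_name("combined_qlcs_mvs"))
```

Summary sheets such as 'combined_qlcs_mvs' do not match the pattern and return None, which makes it easy to separate the per-MV sheets from the summary sheets.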
In the Excel workbook titled “cow_analyzed_mvs_perils_illinois_data_bank”, there are a total of 33 sheets. 31 of the 33 sheets are for each MV that was examined, with a similar naming convention to those analyzed using WSR-88D data. The data documented in each sheet is also similar to that in the WSR-88D sheets. Due to the very tedious and time-consuming nature of analyzing radar data manually, we mainly focused on cataloging only the times where the MVs were detectable in the COW data during the prephase. In the WSR-88D data, we examined the MVs over their entire lifetimes and during their prephases. Not all the MVs analyzed in the WSR-88D data ended up being detectable in the COW data, and we focused on comparing the prephase MVs in the COW data and WSR-88D data. Therefore, there are sheets that are missing values and note that the MV was not in the COW’s domain, not detectable during the prephase, only focused on cataloging the prephase, etc. There are also sheets that examine characteristics of each MV during the prephase (‘combined_qlcs_mvs_before_report’) and box and whisker plot statistics of the prephase characteristics of the MVs (‘box_whisker_stats’).
keywords:
quasi-linear convective system; QLCS; tornado; radar; mesovortex; PERiLS; low-level rotation; tornadic; nontornadic; wind-damaging; Propagation, Evolution, and Rotation in Linear Storms; tornado warning; C-band On Wheels
published:
2018-07-29
Molloy, Erin K.; Warnow, Tandy
(2018)
This repository includes scripts, datasets, and supplementary materials for the study, "NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.
***When downloading datasets, please note the following errors.***
In README.txt, lines 37 and 38 should read:
+ fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre
+ fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre
Note that the file names (fasttree-exon.tre and fasttree-intron.tre) are swapped.
In tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the "symmetric difference error rate" as the "Robinson-Foulds error rate". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.
In njmerge-supplementary-materials.pdf, the alpha parameter shown in Supplementary Table S2 is actually the divisor D, which is used to compute alpha for each gene as follows.
1. For each gene, a random value X between 0 and 1 is drawn from a uniform distribution.
2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).
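Note that because the median of the uniform distribution (between 0 and 1) is 0.5, the median alpha value is -log(0.5) / 4.2 = 0.17 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns, where log is the natural logarithm. The mean alpha value is 1 / D (0.24 for exons, 1.0 for UCEs, and 2.5 for introns), because -log(X) has mean 1 when X is uniform between 0 and 1.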
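The two steps above can be sketched in Python. This is an illustration, not the authors' script. The function names are hypothetical, and log is assumed to be the natural logarithm, which is consistent with the value 0.69 quoted for UCEs at X = 0.5.

```python
import math
import random

# Divisor D per data type, as stated in Table S2 of the supplementary materials.
DIVISOR = {"exon": 4.2, "uce": 1.0, "intron": 0.4}


def alpha_from_uniform(x, divisor):
    """Step 2: alpha = -log(X) / D, with log taken as the natural logarithm."""
    return -math.log(x) / divisor


def draw_alpha(data_type, rng):
    """Steps 1 and 2: draw X uniformly, then convert it to alpha."""
    # 1 - rng.random() lies in (0, 1], which avoids taking log(0).
    x = 1.0 - rng.random()
    return alpha_from_uniform(x, DIVISOR[data_type])


# Evaluate the formula at X = 0.5 for each data type.
for data_type, divisor in DIVISOR.items():
    print(data_type, round(alpha_from_uniform(0.5, divisor), 3))

# One random draw for an exon (seeded only so the sketch is repeatable).
print(draw_alpha("exon", random.Random(0)))
```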
keywords:
phylogenomics; species trees; incomplete lineage sorting; divide-and-conquer
published:
2025-11-19
Banerjee, Shivali; Beraja, Galit; Eilts, Kristen; Singh, Vijay
(2025)
Bioenergy crops have been known for their ability to produce biofuels and bioproducts. In this study, the product portfolio of recently developed transgenic sugarcane (oilcane) bagasse has been redefined for recovering natural pigments (anthocyanins), sugars, and vegetative lipids. The total anthocyanin content in oilcane bagasse has been estimated as 92.9 ± 18.9 µg/g of dried bagasse with cyanidin-3-glucoside (13.5 ± 18.9 µg per g of dried bagasse) as the most prominent anthocyanin present. More than 85 % (w/w) of the total anthocyanins were recovered from oilcane bagasse at a pretreatment temperature of 150 °C for 15 min. These conditions for the hydrothermal pretreatment also led to a 2-fold increase in the glucose yield upon the enzymatic saccharification of the pretreated bagasse. Further, a 1.5-fold enrichment of the vegetative lipids was demonstrated in the pretreated residue. Re-defining green biorefineries with multiple high-value products in a zero-waste approach is the need of the hour for attaining sustainability.
keywords:
Conversion;Biomass Analytics;Bioproducts;Biorefinery;Oilcane
published:
2025-09-30
Kurambhatti, Chinmay V.; Kumar, Deepak; Singh, Vijay
(2025)
The coproduction of high-value anthocyanin extract in the cellulosic ethanol process would diversify the co-product market, increase revenue, and potentially improve the economics of the process. The high anthocyanin concentration in the cob and structural carbohydrates in residual stover make purple corn stover an attractive source for anthocyanin and ethanol coproduction. This study aimed to develop simulation models for processes integrating ethanol production and anthocyanin extraction using purple corn stover, to evaluate their techno-economic feasibility, and to compare their performance with the conventional ethanol production process using corn stover. The annual ethanol production for plants processing 2000 MT dry feedstock / day was 148.6 million L/year for the integrated processes compared with 222.6 million L/year for the conventional process. Anthocyanin production in the modified processes using dilute acid-based and water-based anthocyanin extraction processes was 1779 and 1099 MT/year, respectively. Capital investments for the integrated processes ($448.1 to $443.8 million) were higher than the conventional process ($371.9 million). Due to high revenue from anthocyanin extract, the ethanol production cost for the integrated process using acid-based anthocyanin extraction ($0.36/L) was 34.5% lower than conventional ethanol production ($0.55/L). The ethanol production cost for the integrated process using water-based anthocyanin extraction ($0.68/L) was higher than conventional ethanol production due to low ethanol and anthocyanin yields. The minimum ethanol selling price for the integrated process using acid-based anthocyanin extraction ($0.65/L) was also lower than the conventional process ($0.72/L), indicating an improvement in economic performance.
keywords:
Conversion;Economics;Feedstock Bioprocessing;Modeling
published:
2022-02-11
Lu, Yiyang; Bohn-Wippert, Kathrin; Pazerunas, Patrick J.; Moy, Jennifer M.; Singh, Harpal; Dar, Roy D.
(2022)
Upon treatment removal, spontaneous and random reactivation of latently infected T cells remains a major barrier toward curing HIV. Due to its stochastic nature, fluctuations in gene expression (or “noise”) can bias HIV reactivation from latency, and conventional drug screens for mean gene expression neglect compounds that modulate noise. Here we present a time-lapse fluorescence microscopy image set obtained from a Jurkat T-cell line, infected with a minimal HIV gene circuit, treated with 1,806 small molecule compounds, and imaged for 48 hours. In addition, the single-cell time-dependent reporter dynamics (single-cell gene expression intensity and noise trajectories) extracted from the image dataset are included. Based on this dataset, a total of 5 latency-promoting agents of HIV were found through further experimentation in Lu et al., PNAS 2021 (doi: 10.1073/pnas.2012191118).
For a detailed description of the dataset, please refer to the readme file.
keywords:
HIV; latency; drug screen; fluorescence microscopy; time-lapse; microscopy; single-cell data; noise; gene expression fluctuation
published:
2025-11-12
Purmessur, Cheeranjeev; Chow, Kaicheung; van Heck, Bernard; Kou, Angela
(2025)
This dataset contains all the raw and processed data used to generate the figures presented in the main text and the supplementary information of the paper "Operation of a high frequency, phase slip qubit." It also includes code for data analysis and code for generating the figures.
Note: V2 includes time domain analysis that also accounts for the thermal dephasing from the f state (see readme in Time domain Device A).
keywords:
phase slip qubit; superconducting qubit; quantum information; disordered superconductors
published:
2025-11-12
Baysal, Can; Kausch, Albert P.; Cody, Jon P.; Altpeter, Fredy; Voytas, Daniel
(2025)
The requirement of in vitro tissue culture for the delivery of gene editing reagents limits the application of gene editing to commercially relevant varieties of many crop species. To overcome this bottleneck, plant RNA viruses have been deployed as versatile tools for in planta delivery of recombinant RNA. Viral delivery of single-guide RNAs (sgRNAs) to transgenic plants that stably express CRISPR-associated (Cas) endonuclease has been successfully used for targeted mutagenesis in several dicotyledonous and a few monocotyledonous plants. Progress with this approach in monocotyledonous plants is limited so far by the availability of effective viral vectors. We engineered a set of foxtail mosaic virus (FoMV) and barley stripe mosaic virus (BSMV) vectors to deliver the fluorescent protein AmCyan to track viral infection and movement in Sorghum bicolor. We further used these viruses to deliver and express sgRNAs to Cas9 and Green Fluorescent Protein (GFP) expressing transgenic sorghum lines, targeting Phytoene desaturase (PDS), Magnesium-chelatase subunit I (MgCh), 4-hydroxy-3-methylbut-2-enyl diphosphate reductase, orthologs of maize Lemon white1 (Lw1) or GFP. The recombinant BSMV neither infected sorghum nor delivered or expressed AmCyan and sgRNAs. In contrast, the recombinant FoMV systemically spread throughout sorghum plants and induced somatic mutations with frequencies reaching up to 60%. This mutagenesis led to visible phenotypic changes, demonstrating the potential of FoMV for in planta gene editing and functional genomics studies in sorghum.
keywords:
Feedstock Production;Genome Engineering;Genomics
published:
2022-04-15
Kim, Hyunbin; Makhnenko, Roman
(2022)
This dataset is provided to support the statements in Kim, H., and R.Y. Makhnenko. 2022. "Evaluation of CO2 sealing potential of heterogeneous Eau Claire shale". Journal of the Geological Society.
In geologic carbon dioxide (CO2) storage in deep saline aquifers, buoyant CO2 tends to float upwards in the reservoirs overlaid by low permeable formations called caprocks. Caprocks should serve as barriers to potential CO2 leakage that can happen through a diffusion loss and permeation through faults, fractures, or pore spaces. The leakage through intact caprock would mainly depend on its permeability and CO2 breakthrough pressure, and is affected by the heterogeneities in the material. Here, we study the sealing potential of a caprock from the Illinois Basin, the Eau Claire shale, with sandy and shaly fractions distinguished via electron microscopy and grain/pore size and surface area characterization. The direct measurements of permeability of sandy shale provide values of ~10⁻¹⁵ m², while clayey specimens are three orders of magnitude less permeable. The CO2 breakthrough pressure under in-situ stress conditions is 0.1 MPa for the sandy shale and 0.4 MPa for the clayey counterpart – these values are higher than those predicted by the porosimetry methods performed on the unconfined specimens. Sandy Eau Claire shale would allow penetration of large CO2 volumes at low overpressures, while the clayey formation can serve as a caprock in the absence of faults and fractures in it.
keywords:
Geologic carbon storage; Caprock; Shale; CO2 breakthrough pressure; Porosimetry.
published:
2021-04-29
Jackson, Nicole; Konar, Megan; Debaere, Peter; Sheffield, Justin
(2021)
Global assessments of climate extremes typically do not account for the unique characteristics of individual crops. A consistent definition of the exposure of specific crops to extreme weather would enable agriculturally-relevant hazard quantification. We introduce the Agriculturally-Relevant Exposure to Shocks (ARES) model, a novel database of both the temperature and moisture extremes facing individual crops by explicitly accounting for crop characteristics. Specifically, we estimate crop-specific temperature and moisture shocks during the growing season for a 0.25-degree spatial grid and daily time scale from 1961-2014 globally for 17 crops.
The resulting database presented here provides annual crop- and event-specific exposure rates. Both gridded and country-level exposure rates are provided for each of the 17 crops. Our results provide new insights into the changes in the magnitude as well as spatial and temporal distribution of extreme events that impact crops over the past half-century. For additional information, please see the related paper by Jackson et al. (2021) in Environmental Research Letters.
keywords:
Crop-specific; weather extremes; temperature; moisture; global; gridded; time series
published:
2025-12-18
Boob, Aashutosh; Zhang, Changyi; Pan, Yuwei; Zaidi, Airah; Whitaker, Rachel; Zhao, Huimin
(2025)
Sulfolobus islandicus, an emerging archaeal model organism, offers unique advantages for metabolic engineering and synthetic biology applications owing to its ability to thrive in extreme environments. Although several genetic tools have been established for this organism, the lack of well-characterized chromosomal integration sites has limited its potential as a cellular factory. Here, we systematically identified and characterized 13 artificial CRISPR RNAs targeting eight integration sites in S. islandicus using the CRISPR-COPIES pipeline and a multi-omics-informed computational workflow. We leveraged the endogenous CRISPR-Cas system to integrate the reporter gene lacS and validated heterologous expression through a β-galactosidase assay, revealing significant positional effects. As a proof of concept, we utilized these sites to genetically manipulate lipid ether composition by overexpressing glycerol dibiphytanyl glycerol tetraether (GDGT) ring synthase B (GrsB). This study expands the genetic toolbox for S. islandicus and advances its potential as a robust platform for archaeal synthetic biology and industrial biotechnology.
keywords:
AI/ML; gene editing; genome engineering; metabolic engineering
published:
2024-04-11
Margenot, Andrew; Zhou, Shengnan; Xu, Suwei; Condron, Leo; Metson, Geneviève; Haygarth, Philip; Wade, Jordon; Agyeman, Price Chapman
(2024)
A defining feature of the Anthropocene is the distortion of the biosphere phosphorus (P) cycle. A relatively sudden acceleration of input fluxes without a concomitant increase in output fluxes has led to net accumulation of P in the terrestrial-aquatic continuum. Over the past century, P has been mined from geological deposits to produce crop fertilizers. When P inputs are not fully removed with harvest of crop biomass, the remaining P accumulates in soils. This residual P is a uniquely anthropogenic pool of P, and its management is critical for agronomic and environmental sustainability. This dataset includes data for us to quantify residual P from different long-term managed systems.
The following is the description of the dataset. There are 7 sheets in total.
1. P_balance: From Morrow Plots maize-maize rotation (1888-2021), L: Low estimation; M: medium estimation; H: high estimation;
2. M3P: From Morrow Plots selected plots (selected years), M3P_sur: Mehlich III P concentration in surface 17cm soils; M3P_sub: Mehlich III P concentration in 17-34cm subsoils; P_balance: the difference between P inputs and P outputs; TP_sur: total P stocks in surface 17cm soils; TP_sub: total P stocks in 17-34cm subsoils;
3. Morrow_Plot_P_pool_all: Group: a - labile P; b - Fe/Al-P; c - Ca-P; d - total organic P; e - non-extractable P; Fertilized: P stocks in the fertilized plot; Unfertilized: P stocks in the unfertilized plot; F-U: difference between P stocks in the fertilized and unfertilized plots; dif%: percent difference in total P;
4. Rothamsted_P_pool_all: Treatment: Unfertilized: no fertilization; FYM: farmyard manure; PK: synthetic P and K fertilizer; Group: a - labile P; b - Fe/Al-P; c - Ca-P; d - total organic P; e - non-extractable P; P_change: difference in P stocks over time; dif%: percent difference in total P;
5. L'Acadie_P_pool_all: Treatment: MP_LowP: moldboard plow with low rate of P fertilizer; MP_HighP: moldboard plow with high rate of P fertilizer; NT_LowP: no till with low rate of P fertilizer; NT_HighP: no till with high rate of P fertilizer; Group: a - labile P; b - Fe/Al-P; c - Ca-P; d - total organic P; e - non-extractable P; P_change: difference in P stocks over time; dif%: percent difference in total P;
6. Rothamsted_P_pool_duration: Treatment: Unfertilized: no fertilization; FYM: farmyard manure; PK: synthetic P and K fertilizer; Duration: from a year to another year; Group: a - labile P; b - Fe/Al-P; c - Ca-P; d - total organic P; e - non-extractable P; P_change: difference in P stocks over time; dif%: percent difference in total P;
7. L'Acadie_P_pool_duration: Treatment: MP_LowP: moldboard plow with low rate of P fertilizer; MP_HighP: moldboard plow with high rate of P fertilizer; NT_LowP: no till with low rate of P fertilizer; NT_HighP: no till with high rate of P fertilizer; Duration: from a year to another year; Group: a - labile P; b - Fe/Al-P; c - Ca-P; d - total organic P; e - non-extractable P; P_change: difference in P stocks over time; dif%: percent difference in total P;
keywords:
phosphate rock; biosphere; balances; soil test P; long-term experiment
published:
2019-12-10
Yang, Pan; Zhao, Qiankun; Cai, Ximing
(2019)
The dataset consists of two types of data: the estimate of land productivity (the maximum productivity, MP) and the estimate of land that has low productivity for any major crops planted in the Contiguous United States (CONUS) and thus may be available for growing bioenergy crops (the marginal land, ML). All data items are in GeoTiff format, in the World Geodetic System (WGS) 84 coordinate system, and with a resolution of 0.0020810045 degree (~250 m).
The MP values are calculated from yields of major crops in the CONUS estimated by a machine learning model; both the expected value (MP_mean.tif) and the associated uncertainty (MP_IDP.tif) are provided. The ML availability data have two versions: a deterministic version and a version with uncertainty. The deterministic MLs are determined as the land pixels with expected MP values falling in the range defined in the following criteria, and the MLs with uncertainty are determined as the probability that the MP value of a land pixel falls in the range defined in the following criteria:
Criteria    Description
S1          Current crop and pasture land with MP <= P50
S2          Current crop and pasture land with MP <= P25
S3          S1 + current grass and shrub land with P25 < MP < P50
S4          S2 + current grass and shrub land with P10 < MP < P25
Economic    Current crop and pasture land with potential profitability < 0
Here P10, P25, and P50 are the 10th, 25th, and 50th percentiles of crop MP values.
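The deterministic S1 to S4 criteria can be expressed as a small per-pixel test. The Python sketch below is illustrative only. The land-cover labels and the function name are hypothetical and are not part of the dataset. The Economic criterion is omitted because it depends on profitability values that are not described here.

```python
# Hypothetical land-cover labels; the dataset itself does not define these names.
CROP_PASTURE = "crop_pasture"
GRASS_SHRUB = "grass_shrub"


def is_marginal(criterion, land_cover, mp, p10, p25, p50):
    """Apply one of the deterministic criteria S1-S4 to a single pixel.

    mp is the expected MP value of the pixel; p10, p25, and p50 are the
    10th, 25th, and 50th percentiles of crop MP values.
    """
    if criterion == "S1":
        return land_cover == CROP_PASTURE and mp <= p50
    if criterion == "S2":
        return land_cover == CROP_PASTURE and mp <= p25
    if criterion == "S3":
        # S1 plus grass and shrub land with P25 < MP < P50.
        return is_marginal("S1", land_cover, mp, p10, p25, p50) or (
            land_cover == GRASS_SHRUB and p25 < mp < p50
        )
    if criterion == "S4":
        # S2 plus grass and shrub land with P10 < MP < P25.
        return is_marginal("S2", land_cover, mp, p10, p25, p50) or (
            land_cover == GRASS_SHRUB and p10 < mp < p25
        )
    raise ValueError(f"unsupported criterion: {criterion}")


# Made-up percentile values, for illustration only.
print(is_marginal("S3", GRASS_SHRUB, mp=3.0, p10=1.0, p25=2.0, p50=4.0))
```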
keywords:
Land productivity;marginal land;land use
published:
2025-12-02
Cheng, Ming-Hsun; Maitra, Shraddha; Carr Clennon, Aidan N.; Appell, Michael; Dien, Bruce; Singh, Vijay
(2025)
The recalcitrance of lignocellulosic biomass necessitates an efficient pretreatment protocol for operating a successful cellulosic biorefinery. It is critical to improve cellulose accessibility for hydrolysis and fermentation by altering the plant cell wall’s physical structure and chemical composition. Sequential hydrothermal-mechanical refining pretreatment (HMR) allows efficient recovery of cellulosic sugars without utilizing any hazardous chemicals. HMR has been successfully applied to Liberty switchgrass, a bioenergy cultivar released by the USDA, and now it is being applied to oilcane, a recently developed transgenic sugarcane variety engineered to accumulate lipids in its vegetative tissues. Sugar yields of oilcane bagasse (OCB) and switchgrass (SG) treated with HMR are 96.4% and 75.4%, respectively. This study sought to correlate cellulosic sugar yields with structural changes within the cell wall caused by HMR on two distinct bioenergy crops. Simon’s staining technique for the specific surface area analysis showed that HMR increased the specific surface area of pretreated biomass residues by 80-112%. In addition, ATR-FTIR was performed to determine the effects of HMR on physical structures based on the total crystallinity index (TCI) and hydrogen bonding intensity (HBI). Irrespective of biomass type, HMR decreased the initial crystalline cellulose contents of untreated biomass residues by 3.5% and reduced TCI and HBI by 7-13%. The study found that sugar yields were negatively correlated to reducing values of hydrogen bonding intensity, crystalline cellulose content, and total crystallinity index.
keywords:
Conversion;Biomass Analytics;Economics;Hydrolysate
published:
2021-10-24
Tillman, Francis E.; Bakken, George S.; O'Keefe, Joy M.
(2021)
This dataset contains daily and hourly temperature measurements in twenty different bat box designs deployed in central Indiana, USA from May to September 2018. Daily and hourly environmental data (temperature, solar radiation, wind speed and direction) are also included for days and hours sampled. Bat box temperature data were reclassified to cool (≤ 30°C), permissive (30.1–39.9°C), and stressful (≥ 40°C) categories according to known temperature tolerances of temperate-zone bats.
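The three temperature categories can be reproduced with a simple threshold function. This Python sketch is an illustration, not the authors' code. The function name is hypothetical, and the sketch assumes that any reading strictly between 30°C and 40°C counts as permissive, which matches the stated 30.1–39.9°C range for readings recorded to 0.1°C.

```python
def classify_box_temperature(temp_c):
    """Classify a bat box temperature (in degrees Celsius) into a category."""
    if temp_c <= 30.0:
        return "cool"
    if temp_c >= 40.0:
        return "stressful"
    # Assumption: everything strictly between 30 and 40 is permissive.
    return "permissive"


# Made-up readings, for illustration only.
for reading in (24.5, 30.0, 30.1, 39.9, 40.0):
    print(reading, classify_box_temperature(reading))
```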
keywords:
bat box; design; environmental variables; microclimate; temperature
published:
2019-03-19
Fernandez, Roberto; Parker, Gary; Stark, Colin P.
(2019)
This dataset includes images and extracted centerlines from experiments looking at the formation and evolution of meltwater meandering channels on ice. The laboratory data includes centimeter- and millimeter-scale rivulets. The dataset also includes an image and corresponding centerlines from the Peterman Ice Island.
All centerlines were manually digitized in Matlab but no distributable code was developed for the process. Once digitized, centerlines were smoothed and standardized following methods and routines developed by other authors (Zolezzi and Guneralp, 2016; Guneralp and Rhoads, 2008). Details about the preparation of the centerlines and processing with these methods are included in the dissertation by Fernández (2018) linked to this dataset.
"Millimeter scale and Peterman Ice Island centerlines.pdf": This file includes the images of two mm-scale experiments and the Peterman Ice Island image. Seventeen centerlines were digitized from the former and seven were digitized from the latter. Those centerlines are shown above the images themselves.
"Centimeter scale rivulet images.pdf": This file includes images corresponding to all cm-scale centerlines used for the analysis presented in the dissertation by Fernandez (2018). Each image has a short caption indicating the run ID and the time at which it was captured. The images were used to extract centerlines to look at the planform evolution of cm-scale meltwater meandering rivulets on ice. Images include 26 centerlines from four different runs.
"Meltwater meandering channel centerlines.xlsx": This spreadsheet contains the centerline data for all fifty centerlines. The workbook includes 51 sheets. The first 50 are related to each one of the channels. The mm scale and Peterman Ice Island ones are identified using the same IDs shown in "Millimeter scale and Peterman Ice Island centerlines.pdf". The cm-scale centerlines are identified by run ID and a number indicating the time in minutes (with t = 0 min being the time at which water started flowing over the ice block). The naming convention is also associated to the images in "Centimeter scale rivulet images.pdf". The last sheet in the workbook includes a summary of the channel widths measured from every image for each centerline. The 50 sheets with the centerline information have four columns each. The titles of the columns are X, Y, S, and C. X,Y are dimensionless coordinates of the centerline. S is dimensionless streamwise coordinate (location along the centerline). C is dimensionless curvature value. All these values were non-dimensionalized with the channel width. See Fernandez (2018), Zolezzi and Guneralp (2016), and Guneralp and Rhoads (2008) for more details regarding the process of smoothing, standardizing and non-dimensionalization of the centerline coordinates.
keywords:
Meltwater, Meandering, Ice, Supraglacial, Experiments
published:
2025-11-25
The diel activity of study animals while feeding at their kills in the Santa Cruz Mountains of California
keywords:
Santa Cruz
published:
2026-01-08
Dibaeinia, Payam; Sinha, Saurabh
(2026)
CoNSEPT is a tool to predict gene expression in various cis and trans contexts. Inputs to CoNSEPT are enhancer sequence, transcription factor (TF) levels in one or many trans conditions, TF motifs (PWMs), and any prior knowledge of TF-TF interactions.
keywords:
software; gene expression
published:
2025-08-20
Arshad, Muhammad Umer; Archer, David; Wasonga, Daniel; Namoi, Nictor; Boe, Arvid; Rob, Mitchell; Heaton, Emily; Khanna, Madhu; Lee, DoKyoung
(2025)
The compiled datasets include detailed costs for switchgrass production, categorized into establishment, maintenance, and harvesting expenses, along with revenue calculations. Costs were gathered from multiple sources and adjusted for inflation, focusing on farm-gate profitability, excluding fixed costs and transportation. All financial data is provided per hectare. The dataset was used to evaluate the economic performance of forage- and bioenergy-type switchgrass cultivars and their response to nitrogen fertilization across diverse marginal environments in the U.S. Midwest. Data Envelopment Analysis (DEA) and cost-benefit analysis were employed to assess the efficiency and profitability of 23 different cultivar and fertilization rate combinations over five years.
published:
2019-03-22
Jones, Todd M.; Benson, Thomas J.; Ward, Michael P.
(2019)
This data publication provides example video clips related to research on the association between the flight ability of juvenile songbirds at fledging and juvenile morphological traits (wing emergence, wing length, body condition, mass, and tarsus length). File names reflect the species dropped in each video. These videos are supplemental material for scientific publications by the authors and reflect an example subset of all videos collected from 2017-2018 as part of a larger study on the post-fledging ecology of grassland and shrubland birds in east-central Illinois, USA. No birds were harmed/injured in the production of these videos and procedures were approved by the Illinois Institutional Animal Care and Use Committee (IACUC), protocol no. 18221. Individuals depicted in the videos have given consent for the videos to be shared (talent/model release form; https://publicaffairs.illinois.edu/resources/release/).
keywords:
songbirds; flight ability; wing development; wing length; wing emergence; nestling development; post-fledging
published:
2025-09-26
Arora, Amit; Singh, Vijay
(2025)
In this study, different process schemes were designed and evaluated for biodiesel production from engineered cane lipids with uncertain fatty acid compositions. Four different process schemes were compared under (i) thermal glycerolysis and (ii) enzymatic glycerolysis approaches. The comparison was based on the biodiesel yield and economic indicators such as the net present value (NPV) and the minimum selling price (MSP) of biodiesel. A scheme with polar lipid separation under thermal glycerolysis resulted in the maximum NPV ($96.5 million) and minimum MSP ($1107/ton biodiesel). Through local sensitivity analysis, it was concluded that the cane lipid percentage is the most significant factor influencing process economics. A conjoint analysis of the lipid procurement price and cane lipid percent suggested that 15% cane lipids with a low lipid procurement price ($0.536/kg) results in a positive NPV. When the cane lipid price is higher (>$0.80/kg), a 20% lipid content should be considered to achieve a positive NPV. At 20% cane lipids, the worst-case and best-case scenarios were evaluated by analyzing the interplay of the three most important parameters. The best-case scenario revealed that the minimum NPV under any process scheme could yield more than $100 million (or MSP: $0.80/L), and the worst-case analysis showed that losses incurred by the plant could be as high as $80 million (MSP: $1.36/L). A Monte Carlo simulation indicated that there is a 70% chance of the plant being profitable (NPV > 0).
keywords:
Conversion; Economics; Feedstock Bioprocessing; Modeling
published:
2024-03-01
Chen, Chu-Chun; Dominguez, Francina
(2024)
This dataset contains model output from the Community Earth System Model, Version 1 (CESM1; Hurrell et al., 2013) and variables from the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5; Hersbach et al., 2020). These data were used for analysis in “The location of large-scale soil moisture anomalies affects moisture transport and precipitation over southeastern South America”, published in Geophysical Research Letters.
Acknowledgments:
This work was supported by NSF Award AGS-1852709. We acknowledge high-performance computing support from Cheyenne (doi:10.5065/D6RX99HX) provided by NCAR's Computational and Information Systems Laboratory, sponsored by the NSF. We thank Dr. Haiyan Teng for providing guidance on setting up the CESM experiments and offering valuable advice.
References:
Hersbach H, Bell B, Berrisford P, et al. The ERA5 global reanalysis. Q J R Meteorol Soc. 2020; 146: 1999–2049. https://doi.org/10.1002/qj.3803
Hurrell, J. W., and Coauthors, 2013: The Community Earth System Model: A Framework for Collaborative Research. Bull. Amer. Meteor. Soc., 94, 1339–1360, https://doi.org/10.1175/BAMS-D-12-00121.1
keywords:
atmospheric sciences; climate modeling; land-atmosphere interactions; soil moisture; regional atmospheric circulation; southeastern South America
published:
2025-10-21
Trieu, Anthony; Belaffif, Mohammad B.; Hirannaiah, Pradeepa; Manjunatha, Shilpa; Wood, Rebekah; Bathula, Yokshitha; Billingsley, Rebecca L.; Arpan, Anjali; Sacks, Erik; Clemente, Tom; Moose, Stephen; Reichert, Nancy A.; Swaminathan, Kankshita
(2025)
Miscanthus, a C4 member of the family Poaceae, is a promising perennial crop for bioenergy, renewable bioproducts, and carbon sequestration. Species of interest include the nothospecies Miscanthus x giganteus and its parental species M. sacchariflorus and M. sinensis. To date, the use of biotechnology-based procedures to genetically improve miscanthus has only included plant transformation procedures for introduction of exogenous genes into the host genome at random, non-targeted sites.
keywords:
Feedstock Production; Biomass Analytics; Genomics
published:
2022-06-20
Jiang, Ming; Dubnicek, Ryan; Worthey, Glen; Underwood, Ted; Downie, J. Stephen
(2022)
This is a sentence-level parallel corpus in support of research on OCR quality. The source data come from: (1) Project Gutenberg for human-proofread "clean" sentences; and (2) HathiTrust Digital Library for the paired sentences with OCR errors. In total, this corpus contains 167,079 sentence pairs from 189 sampled books in four domains (i.e., agriculture, fiction, social science, world war history) published from 1793 to 1984. There are 36,337 sentences that have two OCR views paired with each clean version. In addition to sentence texts, this corpus also provides the location (i.e., sentence and chapter index) of each sentence within the Gutenberg volume it belongs to.
keywords:
sentence-level parallel corpus; optical character recognition; OCR errors; Project Gutenberg; HathiTrust Digital Library; digital libraries; digital humanities