Illinois Data Bank Dataset Search Results
Results
published:
2026-03-02
Lee, Jae Won; Bhagwat, Sarang; Kuanyshev, Nurzhan; Cho, Young; Sun, Liang; Lee, Ye-Gi; Cortes-Pena, Yoel; Li, Yalin; Rao, Christopher; Guest, Jeremy; Jin, Yong-Su
(2026)
Rising concerns for sustainability and global climate change have driven the development of sustainable production pathways for biofuels and chemicals from lignocellulosic biomass via integrated biological and chemical processes. We constructed an engineered Saccharomyces cerevisiae capable of producing 2,3-butanediol (2,3-BDO) from glucose without accumulating ethanol and glycerol, which hinder downstream processing of 2,3-BDO, through extensive metabolic reprogramming. Specifically, we introduced heterologous 2,3-BDO biosynthetic enzymes and deleted the major isozymes of ethanol and glycerol biosynthetic enzymes. In addition, we introduced an NAD+ regenerating Pyruvate-Malate (PM) cycle and enhanced the NAD+ regenerating capability of the PM cycle to resolve the redox imbalance from the deletion of ethanol and glycerol production pathways. The resulting engineered yeast produced 109.9 g/L of 2,3-BDO with a productivity of 1.0 g/L/h and a yield of 0.36 g/g glucose in a fed-batch fermentation. We also conducted techno-economic analysis (TEA) and life cycle assessment (LCA) of the production of methyl ethyl ketone (MEK) through catalytic dehydration of 2,3-BDO. A TEA based on the experimental results indicated that the minimum product selling price (MPSP) was estimated to be $1.90/kg. Regarding cradle-to-grave LCA, 100-year global warming potential (GWP100) and fossil energy consumption (FEC) were found to be 0.37 kg CO2 eq/kg and 3.1 MJ/kg, respectively. These results demonstrated the feasibility of cost-competitive and sustainable bio-based MEK production via yeast fermentation. In addition, we explored the possibility of using the fermentation broth containing 2,3-BDO as a biostimulant inducing drought tolerance in plants. As a result, the yeast 2,3-BDO fermentation broth can induce drought tolerance in Arabidopsis thaliana without a complicated purification process.
keywords:
Economics; Metabolomics
published:
2026-03-02
Session, Adam; Rokhsar, Daniel
(2026)
Hybridization brings together chromosome sets from two or more distinct progenitor species. Genome duplication associated with hybridization, or allopolyploidy, allows these chromosome sets to persist as distinct subgenomes during subsequent meioses. Here, we present a general method for identifying the subgenomes of a polyploid based on shared ancestry as revealed by the genomic distribution of repetitive elements that were active in the progenitors. This subgenome-enriched transposable element signal is intrinsic to the polyploid, allowing broader applicability than other approaches that depend on the availability of sequenced diploid relatives. We develop the statistical basis of the method, demonstrate its applicability in the well-studied cases of tobacco, cotton, and Brassica napus, and apply it to several cases: allotetraploid cyprinids, allohexaploid false flax, and allooctoploid strawberry. These analyses provide insight into the origins of these polyploids, revise the subgenome identities of strawberry, and provide perspective on subgenome dominance in higher polyploids.
keywords:
Genomics
published:
2026-03-02
Wickland, Daniel; Borges dos Santos, Lucas; Hudson, Karen; Hudson, Matthew
(2026)
Height is a critical component of plant architecture, significantly affecting crop yield. The genetic basis of this trait in soybean remains unclear. In this study, we report the characterization of the Compact mutant of soybean, which has short internodes. The candidate gene was mapped to chromosome 17, and the interval containing the causative mutation was further delineated using biparental mapping. Whole-genome sequencing of the mutant revealed an 8.7 kb deletion in the promoter of the Glyma.17g145200 gene, which encodes a member of the class III gibberellin (GA) 2-oxidases. The mutation has a dominant effect, likely via increased expression of the GA 2-oxidase transcript observed in green tissue, as a result of the deletion in the promoter of Glyma.17g145200. We further demonstrate that levels of GA precursors are altered in the Compact mutant, supporting a role in GA metabolism, and that the mutant phenotype can be rescued with exogenous GA3. We also determined that overexpression of Glyma.17g145200 in Arabidopsis results in dwarfed plants. Thus, gain of promoter activity in the Compact mutant leads to a short internode phenotype in soybean through altered metabolism of gibberellin precursors. These results provide an example of how structural variation can control an important crop trait and a role for Glyma.17g145200 in soybean architecture, with potential implications for increasing crop yield.
keywords:
Biomass Analytics; Genomics
published:
2026-03-02
Yang, Jihoon; Sooksa-nguan, Thanwalee (JiJY); Kannan, Baskaran; Cano-Alfanar, Sofia; Liu, Hui; Kent, Angela; Shanklin, John; Altpeter, Fredy; Howe, Adina
(2026)
This project aims to study the microbial structure and potential functions of bacterial and fungal microbiomes in leaves, stems, roots, rhizospheres, and bulk soils of energy crops (oilcane) grown in greenhouses.
keywords:
Biomass Analytics; Metabolomics
published:
2026-03-02
Mula-Michel, Himaya; White, Paul; Hale, Anna
(2026)
Saccharum yield decline results from long-term monoculture practices. Changes in cropping management can improve soil health and productivity. Below-ground bacterial community diversity and composition across soybean (Glycine max (L.) Merr) cover crop, Saccharum monoculture (30+ year) and fallowed soil were determined. Near full-length (~1,400 base pairs) 16S rRNA gene sequences extracted from the rhizospheres of sugarcane and soybean and from fallowed soil were compared. Higher soil bacterial diversity was observed in the soybean cover crop than in sugarcane monoculture across all measured indices (observed operational taxonomic units, Chao1, Shannon, reciprocal Simpson and Jackknife). Acidobacteria, Proteobacteria, Bacteroidetes and Planctomycetes were the most abundant bacterial phyla across the treatments. Indicator species analysis identified nine indicator phyla. Planctomycetes, Armatimonadetes and candidate phylum FBP were associated with soybean; Proteobacteria and Firmicutes were linked with sugarcane; and Gemmatimonadetes, Nitrospirae, Rokubacteria and unclassified bacteria were associated with fallowed soil. Non-metric multidimensional scaling analysis showed distinct groupings of bacterial operational taxonomic units (97% identity) according to management system (soybean, sugarcane or fallow), indicating compositional differences among treatments. This is confirmed by the results of the multi-response permutation procedures (A = 0.541, p = 0.00045716). No correlation between soil parameters and bacterial community structure was observed according to Mantel test (r = 211865, p = 0.14). Use of soybean cover-crop fostered bacterial diversity and altered community structure. This indicates cover crops could have a restorative effect and potentially promote sustainability in long-term Saccharum production systems.
keywords:
Field Data; Genomics
published:
2026-02-27
Zhang, Zhihai; Anwar, Sultana; Yafuso, Erin; Zuniga Soto, Evelyn; Luo, Guangbin; Moose, Stephen; Swaminathan, Kankshita; Altpeter, Fredy; Hudson, Matthew
(2026)
A new GAL4-based feed-forward loop circuit enhances β-glucuronidase (GUS) reporter gene expression in leaves and stems of stably transformed sugarcane plants.
keywords:
Bioproducts; Metabolic Engineering; Plant Transformation; Sugarcane
published:
2026-03-01
Sundararajan, Sumashini; Chamoli, Gauranshi; Dalling, James; Krishnadas, Meghna
(2026)
This dataset contains seed germination data from two inoculation experiments involving two fig species, Ficus beddomei and Ficus callosa, found in the tropical forests of the Western Ghats, India, and fungal taxa that were isolated from them. The file "first_inoculation_expt_Nov_2025" contains germination data for screening of select fungal taxa for their effects on the two fig species. The file "serial_inoculation_expt_Nov_2025" contains germination data from a serial inoculation experiment involving successive inoculation of seeds with an endophytic followed by a pathogenic fungal taxon.
keywords:
Ficus; seeds; fungi; germination; endophyte; pathogen
published:
2026-03-01
Edmonds, Devin A.; Fanomezantsoa, Rebecca E.; Rabibisoa, Nirhy H. C.; Roberts, Sam H.
(2026)
This dataset contains ecological and demographic data for William’s bright‑eyed frog (Boophis williamsi), a critically endangered amphibian restricted to the Ankaratra Massif in Madagascar’s central highlands. Field surveys were conducted from September 2018 to March 2019 and in July 2021 across ten 100‑m stream transects to estimate abundance and identify habitat associations for both tadpoles and adult frogs. Data include repeated counts of individuals and associated habitat variables (e.g., canopy cover, substrate type, stream depth, discharge, and temperature). Abundance was estimated using N‑mixture models implemented in R (version 4.3.1) with the ubms package, with separate models for tadpoles and frogs to account for differences in detection probability. The dataset consists of multiple CSV files capturing microhabitat, environmental variables, and raw survey count data (y_frogs.csv and y_tadpoles.csv) and an R script (boophis_abundance.R) used for model fitting. The dataset was compiled for an article accepted in the Herpetological Journal by the British Herpetological Society and is intended to support long‑term monitoring and conservation planning for B. williamsi and other threatened amphibians in Madagascar.
keywords:
amphibian conservation; biodiversity conservation; detection probability; endangered species; N-mixture model
published:
2024-07-29
Caetano Machado Lopes, Lorran; Chacko, George
(2024)
This dataset consists of a citation graph. It was constructed by downloading and parsing the Works section of the Open Alex catalog of the global research system. Open Alex (see citation below) contains detailed information about scholarly research, including articles, authors, journals, institutions, and their relationships. The data were downloaded on 2024-07-15.
The dataset comprises two compressed (.xz) files.
1) filename: openalexID_integer_id_hasDOI.parquet.xz. The tabular data within contains three columns: openalex_id, integer_id, and hasDOI. Each row represents a record with the following data types:
• openalex_id: A unique identifier from the Open Alex catalog.
• integer_id: An integer representing the new identifier (assigned by the authors)
• hasDOI: An integer (0 or 1) indicating whether the record has a DOI (0 for no, 1 for yes).
2) filename: citation_table.tsv.xz
This edgelist of citations has two columns (no header) of integer values that represent citing and cited integer_id, respectively.
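As an illustration of the file layout described above, the following is a minimal Python sketch for streaming the edgelist without decompressing it to disk first. It assumes only what the description states (an xz-compressed, tab-separated file with two headerless integer columns, citing then cited); the function names `iter_citations` and `in_degree` are hypothetical and are not part of the dataset or its code repository.

```python
import csv
import lzma
from collections import Counter


def iter_citations(path):
    """Yield (citing, cited) integer_id pairs from an xz-compressed,
    headerless, tab-separated edgelist."""
    with lzma.open(path, mode="rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) != 2:
                # Skip blank or malformed lines rather than failing mid-stream.
                continue
            yield int(row[0]), int(row[1])


def in_degree(edges):
    """Count how many times each integer_id appears as the cited document."""
    counts = Counter()
    for _citing, cited in edges:
        counts[cited] += 1
    return counts


# Example usage (path is the file named in the description):
# citation_counts = in_degree(iter_citations("citation_table.tsv.xz"))
```

Because the full edgelist holds roughly two billion rows, the generator reads one row at a time; holding the counter for every cited node in memory may still be costly, so a subset or an out-of-core tool may be preferable in practice.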
Summary Features
• Total Nodes (Documents): 256,997,006
• Total Edges (citations): 2,148,871,058
• Documents with DOIs: 163,495,446
• Edges between documents with DOIs: 1,936,722,541 [corrected to 2,148,788,148 edges Nov 13, 2025]
• Count of unique nodes in edgelist: 111,453,719 [updated Nov 13, 2025]
Note: Nov 13, 2025. An improved curation process will be applied to a future version of this dataset
Note: Nov 13, 2025.
The code used to generate these files can be found here: https://github.com/illinois-or-research-analytics/lorran_openalex/
keywords:
citation networks; Open Alex
published:
2021-05-17
Wuebbles, D; Angel, J; Petersen, K; Lemke, A.M.
(2021)
Please cite as: Wuebbles, D., J. Angel, K. Petersen, and A.M. Lemke, (Eds.), 2021: An Assessment of the Impacts of Climate Change in Illinois. The Nature Conservancy, Illinois, USA. https://doi.org/10.13012/B2IDB-1260194_V1
Climate change is a major environmental challenge that is likely to affect many aspects of life in Illinois, ranging from human and environmental health to the economy. Illinois is already experiencing impacts from the changing climate and, as climate change progresses and temperatures continue to rise, these impacts are expected to increase over time. This assessment takes an in-depth look at how the climate is changing now in Illinois, and how it is projected to change in the future, to provide greater clarity on how climate change could affect urban and rural communities in the state. Beyond providing an overview of anticipated climate changes, the report explores predicted effects on hydrology, agriculture, human health, and native ecosystems.
keywords:
Climate change; Illinois; Public health; Agriculture; Environment; Water; Hydrology; Ecosystems
published:
2025-08-17
These codes implement the master equation microkinetic modeling (ME-MKM) calculations of Adams et al. (J. Phys. Chem. C 2025, 129, 15, 7285–7294), as well as the automatic derivatives for activation energies and reaction orders in their follow-up work (in review).
keywords:
Microkinetic model; master equation; periodic tiling; catalysis; adsorption
published:
2025-06-03
White, Andrew; Lambert, John
(2025)
GIS data and geoprocessing tools associated with the White and Lambert (2025) modeling paper, which assesses the potential impact of development on the archaeological resources of Illinois.
keywords:
development; archaeology; climate change; GIS
published:
2026-02-25
Bayer, Hugo; Binette, Annalise; Sweck, Samantha; Juliano, Vitor; Plas, Samantha; Ferst, Lara; Hassell Jr, James; Maren, Stephen
(2026)
Raw data from the article "Locus Coeruleus-Amygdala Circuit Disrupts Prefrontal Control to Impair Fear Extinction", which has been accepted for publication in PNAS.
keywords:
Basolateral Amygdala; Fear conditioning; Infralimbic cortex; Learning and Memory; Norepinephrine
published:
2026-02-10
Ejiogu, Emmanuel; Peters, Baron
(2026)
This dataset contains the Jupyter notebook and Microsoft Excel data used to reproduce the results from the eponymous paper.
1. "pourahmady data.xlsx" contains NMR data for triad and dyad sequences in a PVC/Polyethylene copolymer.
V is a vinyl chloride segment (-CH2CHCl-) and E is an ethylene segment (-CH2CH2-)
VE is the dyad -CH2CHCl-CH2CH2-
VC_frac_1 = fraction of vinyl chloride segments obtained from 13C-NMR
VC_frac_2 = fraction of vinyl chloride segments obtained from elemental analysis
2. "Triad_Kinetics.ipynb" contains code that fits the data from "pourahmady data.xlsx"
published:
2024-12-11
MMAudio pretrained models. These models can be used in the open-sourced codebase https://github.com/hkchengrex/MMAudio
<b>Note:</b> mmaudio_large_44k_v2.pth and Readme.txt are added to this V2. Other 4 files stay the same.
published:
2022-03-25
Kudeki, Erhan; Reyes, Pablo
(2022)
Ground-based radar data sets collected during the 2013 NASA EVEX Campaign, conducted on Roi-Namur island of the Kwajalein Atoll in the Republic of the Marshall Islands, are deposited in this databank. Radar data were collected with IRIS VHF and ALTAIR VHF/UHF systems.
published:
2026-02-20
Emran, Shah-Al; Petersen, Bryan M; Roney, Heather Elizabeth; Masters, Michael David; Varela, Sebastian; Hedrick, Travis; Leakey, Andrew D.B.; VanLoocke, Andy; Heaton, Emily A.
(2026)
This dataset contains biomass yield measurements and associated vegetation index data collected from commercial Miscanthus × giganteus fields in eastern Iowa during the 2022–2023 growing seasons.
The data support the analyses presented in the article:
“Yield From Iowa's First Commercial Miscanthus Fields: Implications of Spatial Variability for Productivity and Sustainability Beyond Research Plots.”
We collected 105 ground-truth biomass samples from four mature commercial fields (>4 years old) covering 92.81 ha.
Samples were taken from 3 m² quadrats that were hand-harvested in alignment with commercial harvest timing. Stem biomass (excluding leaves) was weighed, moisture-corrected, and converted to dry-matter yield expressed in Mg DM ha⁻¹.
Sampling locations were selected to capture spatial variability visible in aerial imagery and were recorded using RTK GPS.
Each biomass observation was paired with vegetation indices derived from high-resolution PlanetScope satellite imagery (3 m resolution).
Images were acquired throughout the growing season, and indices were calculated to evaluate their ability to predict end-of-season biomass yield.
Statistical and machine learning approaches were used to identify key predictors, and a linear regression model based on end-of-July Green Normalized Difference Vegetation Index (GNDVI) was developed and evaluated.
This repository includes the data used in that modeling workflow. Management practices, economic data, full imagery time series, and additional methodological details are described in the associated publication and are not included here.
The dataset consists of three comma-separated value (CSV) files:
1. Combine_Groundtruth_Yield_VI_22_23.csv
This file contains ground-truth biomass yield measurements and associated key vegetation index values collected during the 2022 and 2023 growing seasons.
Rows: 105 observations
Columns:
Year — Year of observation (2022 or 2023)
Field — Field location identifier
Sample_number — Unique sample identifier
GNDVI_End_Jul — Green Normalized Difference Vegetation Index calculated at end of July
GNDVI_End_Aug — Green Normalized Difference Vegetation Index calculated at end of August
NDRE_End_Aug — Normalized Difference Red Edge index calculated at end of August
Biomass_Stem_Yield_MgDM/ha — Measured stem biomass yield (megagrams dry matter per hectare)
2. trainData_GNDVI.csv
This file contains the subset of observations used to train the predictive relationship between July GNDVI and biomass yield.
Rows: 76 observations
Columns:
Unnamed: 0 — Row index retained from the original data processing workflow
GNDVI_End_Jul — GNDVI at end of July
Stem_Yield_MgDM/ha — Observed stem biomass yield (Mg DM ha⁻¹)
3. testData_GNDVI.csv
This file contains the test dataset used to evaluate model performance.
Rows: 29 observations
Columns:
Unnamed: 0 — Row index retained from the original data processing workflow
GNDVI_End_Jul — GNDVI at end of July
Predicted_Yield_MgDM/ha — Model-predicted stem biomass yield (Mg DM ha⁻¹)
Observed_Yield_MgDM/ha — Measured stem biomass yield (Mg DM ha⁻¹)
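To make the workflow described above concrete, here is a minimal Python sketch, not the authors' code. It uses the standard definition of GNDVI, (NIR − Green) / (NIR + Green), and an ordinary least-squares line as a stand-in for the July GNDVI model; the exact fitting procedure used in the publication is an assumption here. The column names come from the file descriptions above, while the helper names (`gndvi`, `fit_line`, `rmse`, `load_columns`) are hypothetical.

```python
import csv
import math


def gndvi(nir, green):
    """Green Normalized Difference Vegetation Index from NIR and green reflectance."""
    return (nir - green) / (nir + green)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def load_columns(path, x_col, y_col):
    """Read two numeric columns from a CSV file with a header row."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    return [float(r[x_col]) for r in rows], [float(r[y_col]) for r in rows]


# Example usage with the files described above:
# x_train, y_train = load_columns("trainData_GNDVI.csv", "GNDVI_End_Jul", "Stem_Yield_MgDM/ha")
# slope, intercept = fit_line(x_train, y_train)
# x_test, y_test = load_columns("testData_GNDVI.csv", "GNDVI_End_Jul", "Observed_Yield_MgDM/ha")
# error = rmse(y_test, [slope * x + intercept for x in x_test])
```

The refit can then be compared against the Predicted_Yield_MgDM/ha column in testData_GNDVI.csv; any difference would indicate that the published model was fit differently from this sketch.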
keywords:
Potential yield, yield gap, in-field management, yield prediction, remote sensing, spatial variability, profitability, Miscanthus × giganteus, M×g
published:
2026-02-19
Gurumoorthi, Akshay; Peters, Baron
(2026)
The dataset contains a Jupyter notebook intended for anyone who wants to apply the Empirical Bayes method described in the paper titled 'Data for Improving individual committor estimates and data efficiency in reaction coordinate tests with the Empirical Bayes method' to committor data with a simple and lucid Python script.
published:
2026-02-18
Ward, Michael; Slayton, Sarah
(2026)
The datasets are associated with the paper "The Windy City rookery: Movement and activity patterns of Black-crowned Night Herons (Nycticorax nycticorax) in a human-dominated landscape", which will soon be published in the journal "Ecology and Evolution". The data cover the movements, behaviors, and morphology of black-crowned night herons.
keywords:
black-crowned night heron; urban ecology; avian movement
published:
2026-02-11
Sponzilli, Ryan; Looney, Leslie
(2026)
Data for the publication Protostellar Outflows Shed Light on the Dominant Close Companion Star Formation Pathways (Sponzilli et al). Contains the FITS files, data files, and Python scripts. The entire analysis is containerized with Docker. The `Dockerfile` in the root folder can be used to build the image.
<b>Note:</b> __MACOSX folder or files starting with dot can be safely ignored or removed.
keywords:
Protobinaries; ALMA; FITS; 12CO imaging of outflows in Perseus and Orion
published:
2026-02-17
Peyton, Buddy; Bajjalieh, Joseph; Martin, Michael; Gerald, Andrea
(2026)
Coups d'État are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019, Chin, Carter and Wright 2021). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup d’État Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e., realized, unrealized, or conspiracy), the type of actor(s) who initiated the coup (i.e., military, rebels, etc.), as well as the fate of the deposed leader.
Version 2.2.2 corrects an error in version 2.2.1 in which the “conspiracy” designation was mistakenly assigned to coup_id: 40411262025. Version 2.2.2 resolves this issue by removing the incorrect designation.
Version 2.2.1 adds 67 additional coup events. 47 of these came from examining the Colpus dataset (Chin, Carter, and Wright 2021), and 20 of these events were added to the data set in the normal annual review of potential new coup events. This version also updates the coding to events in Mali in 2012, Serbia in 2000 and Chad in 1979.
Version 2.2.0 adds 94 additional coup events. 66 of these came from examining Powell and Thyne’s “discarded” events and 28 of these events were added to the data set in the normal annual review of potential new coup events. This version also updates the coding to events in Brazil in 1945 and the Congo in 1968.
Version 2.1.3 adds 19 additional coup events to the data set, corrects the date of a coup in Tunisia, and reclassifies an attempted coup in Brazil in December 2022 as a conspiracy.
Version 2.1.2 added 6 additional coup events that occurred in 2022 and updated the coding of an attempted coup event in Kazakhstan in January 2022.
Version 2.1.1 corrected a mistake in version 2.1.0, where the designation of “dissident coup” had been dropped in error for coup_id: 00201062021. Version 2.1.1 fixed this omission by marking the case as both a dissident coup and an auto-coup.
Version 2.1.0 added 36 cases to the data set and removed two cases from the v2.0.0 data set. This update also added actor coding for 46 coup events and added executive outcomes to 18 events from version 2.0.0. A few other changes were made to correct inconsistencies in the coup ID variable and the date of the event.
Version 2.0.0 improved several aspects of the previous version (v1.0.0) and incorporated additional source material to include:
• Reconciling missing event data
• Removing events with irreconcilable event dates
• Removing events with insufficient sourcing (each event needs at least two sources)
• Removing events that were inaccurately coded as coup events
• Removing variables that fell below the threshold of inter-coder reliability required by the project
• Removing the spreadsheet ‘CoupInventory.xls’ because of inadequate attribution and citations in the event summaries
• Extending the period covered from 1945-2005 to 1945-2019
• Adding events from Powell and Thyne’s Coup Data (Powell and Thyne, 2011)
Version 1.0.0 was released in 2013. This version consolidated coup data taken from the following sources:
• The Center for Systemic Peace (Marshall and Marshall, 2007)
• The World Handbook of Political and Social Indicators (Taylor and Jodice, 1983)
• Coup d’État: A Practical Handbook (Luttwak, 1979)
• The Cline Center’s Social, Political and Economic Event Database (SPEED) Project (Nardulli, Althaus and Hayes, 2015)
• Government Change in Authoritarian Regimes – 2010 Update (Svolik and Akcinaroglu, 2006)
<b>Items in this Dataset</b>
1. <i>Cline Center Coup d'État Codebook v.2.2.2 Codebook.pdf</i> - This 18-page document describes the Cline Center Coup d’État Project dataset. The first section of this codebook provides a summary of the different versions of the data. The second section provides a succinct definition of a coup d’état used by the Coup d'État Project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The third section describes the methodology used to produce the data. <i>Revised February 2026</i>
2. <i>Coup Data 2.2.2.csv</i> - This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup d’État Project. It contains 29 variables and 1,161 observations. <i>Revised February 2026</i>
3. <i>Source Document v2.2.2.pdf</i> - This 365-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify that particular event. <i>Revised February 2026</i>
4. <i>README.md</i> - This file contains useful information for the user about the dataset. It is a text file written in Markdown language. <i>Revised February 2026</i>
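As a quick sanity check after download, the following minimal Python sketch counts the rows and columns of the data file and lists any repeated coup_id values. Only the file name, the coup_id variable, and the stated dimensions come from the description above; the UTF-8 encoding and the helper name `summarize_csv` are assumptions.

```python
import csv
from collections import Counter


def summarize_csv(path, id_column="coup_id", encoding="utf-8"):
    """Return (row count, column count, list of repeated ID values) for a CSV file."""
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        ids = [row[id_column] for row in reader]
    repeated = [value for value, n in Counter(ids).items() if n > 1]
    return len(ids), len(columns), repeated


# Example usage:
# n_rows, n_cols, repeated = summarize_csv("Coup Data 2.2.2.csv")
# The description above reports 1,161 observations and 29 variables, so
# differing counts would suggest a different version or a parsing problem.
```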
<b> Citation Guidelines</b>
1. To cite the codebook (or any other documentation associated with the Cline Center Coup d’État Project Dataset) please use the following citation:
Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Scott Althaus. 2026. “Cline Center Coup d’État Project Dataset Codebook”. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.2.2. February 17. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V10
2. To cite data from the Cline Center Coup d’État Project Dataset please use the following citation (filling in the correct date of access):
Peyton, Buddy, Joseph Bajjalieh, Michael Martin, and Andrea Gerald. 2026. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.2.2. February 17. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V10
published:
2026-02-17
Nie, Ke; Bradford, J. Nofear; Mandal, Supriya; Bista, Aayam; Pfaff, Wolfgang; Kou, Angela
(2026)
This dataset contains all the raw and processed data used to generate the figures presented in the main text and the appendix of the paper "Fluxonium as a control qubit for bosonic quantum information". It also includes code for data analysis and figure generation.
keywords:
superconducting qubit; fluxonium; bosonic control; quantum information
published:
2026-02-13
Frederick, Samuel; Mohebalhojeh, Matin; Curtis, Jeffrey; West, Matthew; Riemer, Nicole
(2026)
This dataset contains data files necessary to replicate figures from "Idealized Particle-Resolved Large-Eddy Simulations to Evaluate the Impact of Emissions Spatial Heterogeneity on CCN Activity" submitted to Atmospheric Chemistry and Physics.
Within the compressed folder data.zip are two subdirectories, "processed_data" and "spatial-het". The "processed_data" directory contains netCDF files which contain a subset of simulation output used in figure generation. The "spatial-het" subdirectory contains a .csv file with spatial heterogeneity values computed via an exact algorithm of the spatial heterogeneity metric described by Mohebalhojeh et al. 2025. The subdirectory "sh-patterns" contains .csv files for each emissions scenario. Each entry corresponds to a single grid cell over a domain of dimension 100x100 (lateral resolution of the computational domain employed in this paper).
Within scripts.zip are Python notebooks for generating figures. Additional Python modules are included which contain helper functions for the notebooks. Furthermore, a Fortran version of the spatial heterogeneity metric is included alongside shell scripts for creating a Python environment in which the code can be compiled and converted into a Python module. Note that the create_env.sh and compile_nsh.sh scripts must be run prior to executing cells in notebooks to make use of the spatial heterogeneity subroutines.
<b>*Note*:</b> New in this V3: During review, a bug regarding vertical diffusion of particles was discovered in WRF-PartMC which necessitated re-running simulations. We present new simulations with diffusion fixed. Furthermore, we have run additional simulations in response to reviewer comments--simulations with emissions turned off at t = 4 h to investigate reversible partitioning and simulations with the RH raised near saturation throughout the domain to model the effects of co-condensation. The README PDF has been updated to reflect changes to the dataset collection. Also, we have added a shell script in scripts_v3.zip which was used to process simulation output and create the data subsets contained in data_v3.zip. Lastly, notebooks were re-run with updated datasets to create manuscript figures and additional plotting routines were added for new figures pertaining to the requested simulations.
keywords:
Atmospheric chemistry; aerosols; Particle-resolved modeling; spatial heterogeneity
published:
2026-02-11
Kim, Hyunhwa; Purba, Denissa Sari Darmawi; Kontou, Eleftheria
(2026)
The dataset and code enable replication of the case study in Section 6 titled "California wildfire energy supply logistics" of the Transportation Research Part E: Logistics and Transportation Review published paper "Bidirectional Energy Supply Logistics Using Uncrewed Electric Aerial and Ground Vehicles: A Two-Echelon Location-Routing Problem with Resource-Constrained Demand Allocation and Time Windows."
keywords:
electric vehicle; energy supply logistics; location-routing problem; bidirectional energy; uncrewed aerial vehicle
published:
2026-02-11
Hanley, David; Lee, Jongwon; Choi, Su Yeon; Bretl, Timothy
(2026)
If you use this dataset, please cite both the dataset and the associated data paper (bibtex is below).
@ARTICLE{11386847,
author={Hanley, David and Lee, Jongwon and Choi, Su Yeon and Bretl, Timothy},
journal={IEEE Transactions on Instrumentation and Measurement},
title={The MagPIE2 Dataset for Mapping, Localization, and Simultaneous Localization and Mapping Using Magnetic Fields},
year={2026},
volume={},
number={},
pages={1-1},
keywords={Magnetometers;Magnetic field measurement;Magnetic fields;Pedestrians;Location awareness;Buildings;Simultaneous localization and mapping;Measurement errors;Hardware;Calibration;Localization;mapping;SLAM;dataset;benchmark;magnetometer;magnetic field},
doi={10.1109/TIM.2026.3662919}}
We present a dataset for the evaluation of magnetic field-based robotic and pedestrian localization, mapping, and SLAM methods. This dataset contains magnetometer and inertial measurement unit data collected inside three buildings by both a pedestrian and a ground robot. Data were collected at different heights simultaneously, both with and without changes in the placement of objects that may affect magnetometer measurements. In total, approximately 689 square meters of floor space was covered by this dataset.
This dataset is archivally stored. We provide a GitHub site which is meant to serve as a forum to post issues with the dataset, share code using the dataset, and to resolve problems: <a href="https://github.com/hanley6/MagPIE2Forum">https://github.com/hanley6/MagPIE2Forum</a>
Note that while the dataset is meant to be permanently stored, this forum is not meant to guarantee perennial support and its existence will be dependent on the policies of GitHub.
<b>How is the dataset organized?</b> The data is divided into the following parts at a high level and more detailed information can be found in the Readme:
1. The walking portion of the dataset: CSL_WLK.zip, DCL_WLK.zip, Talbot_WLK.zip, and WLK_Misc.zip.
2. The robot portion of the dataset: Robot_Dataset.zip.
3. Motor interference tests: Motor_Interference_Test.zip.
4. Ground truth evaluation: Ground_Truth_Evaluation.zip.
5. Quick start results: Quick_Start_Results.zip.
<b>How is data recorded and stored?</b> Data is generally collected in the form of ROS bag files. Each ROS bag has Intel Realsense camera images, magnetometer readings, IMU readings, timestamps, and more as applicable for each file in the dataset. Each bag file has an associated metadata file written as a YAML file. This contains general information about each bag file including the start and stop time, who collected the bag file (during the pedestrian portion of the dataset), and the approximate location where data was collected. In several cases, additional comma-separated (csv) files were included in the dataset, either as a convenient supplement to ROS bag files (e.g., csv files of magnetometer calibration data) or because they serve as human-readable quick start results.
<b>How does one set up and run files on the dataset?</b> The files are stored in ROS bags and are, therefore, meant to be run using the Robot Operating System. Information regarding how to use the Robot Operating System as well as installation instructions are available at: <a href="https://ros.org/">https://ros.org/</a>
keywords:
Localization; mapping; SLAM; dataset; benchmark; magnetometer; magnetic field