Illinois Data Bank Dataset Search Results
Results
published:
2021-08-04
Sabrina, Sadia; Lewis, Quinn; Rhoads, Bruce
(2021)
This dataset contains data derived from large-scale particle image velocimetry (LSPIV) measurements obtained at the confluence of the Saline Branch and an unnamed tributary in Illinois. The data were collected using two cameras positioned at the confluence, one mounted on a cable and the other on a tripod. A description of the content of the files can be found in Description of Files.rtf.
keywords:
confluence; hydrodynamics; LSPIV; flow structure; stagnation
published:
2022-07-10
Winogradoff, David; Chou, Han-Yi; Maffeo, Christopher; Aksimentiev, Aleksei
(2022)
keywords:
Nuclear pore complex; system files; trajectory files
published:
2025-03-13
ALMA Band 4 and 7 observations of the dust continuum in the Class 0 protostellar system L1448 IRS3B. We include the selfcal script, imaging scripts, fits files, and the python scripts for the figures in the paper.
keywords:
ALMA; Band 4; Band 6; polarization; L1448 IRS3B
published:
2025-03-19
Bieri, Carolina A.; Dominguez, Francina; Miguez-Macho, Gonzalo; Fan, Ying
(2025)
This repository includes HRLDAS Noah-MP model output generated as part of Bieri et al. (2025) - Implementing deep soil and dynamic root uptake in Noah-MP (v4.5): Impact on Amazon dry-season transpiration.
These data are distributed in two different formats: Raw model output files and subsetted files that include data for a specific variable. All files are .nc format (NetCDF) and aggregated into .tar files to facilitate download. Given the size of these datasets, Globus transfer is the best way to download them.
Raw model output for four model experiments is available: FD (control), GW, SOIL, and ROOT. See the associated publication for information on the different experiments. These data span an approximately 20-year period from 01 Jun 2000 to 31 Dec 2019. The data have a spatial resolution of 4 km and a temporal frequency of 3 hours. These data are for a domain in the southern Amazon basin (see Figure 1 in the associated publication). Data for each experiment are available as a .tar file that includes 3-hourly NetCDF files. All default Noah-MP output variables are included in each file. As a result, the .tar files are quite large and may take many hours or even days to transfer depending on your network speed and local configuration. These files are named 'noahmp_output_2000_2019_EXP.tar', where EXP is the name of the experiment (FD, GW, SOIL, or ROOT).
Subsetted model output at a daily temporal resolution for all four model experiments is also available. These .tar files include the following variables: water table depth (ZWT), latent heat flux (LH), sensible heat flux (HFX), soil moisture (SOIL_M), canopy evaporation (ECAN), ground evaporation (EDIR), transpiration (ETRAN), rainfall rate at the surface (QRAIN), and two variables that are specific to the ROOT experiment: ROOTACTIVITY (root activity function) and GWRD (active root water uptake depth). There is one file for each variable within the tarred files. These files are named 'noahmp_output_subset_2000_2019_EXP.tar', where EXP is the name of the experiment (FD, GW, SOIL, or ROOT).
Finally, there is a sample dataset with raw 3-hourly output from the ROOT experiment for one day. The purpose of this sample dataset is to allow users to confirm if these data meet their needs before initiating a full transfer via Globus. This file is named 'noahmp_output_sample_ROOT.tar'.
The README.txt file provides information on the Noah-MP output variables in these datasets, among other specifications.
Information on HRLDAS Noah-MP and names/definitions of model output variables that are useful in working with these data are available here: http://dx.doi.org/10.5065/ew8g-yr95. Note that some output variables may be listed in this document under a different variable name, so searching for the long name (e.g. 'baseflow' instead of 'QRF') is recommended.
Information on additional output variables that were added to the model as part of this study is available here: https://github.com/bieri2/bieri-et-al-2025-EGU-GMD/tree/DynaRoot.
Model code, configuration files, and forcing data used to carry out the model simulations are linked in the related resources section.
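The 3-hourly raw output described above can be aggregated to the daily resolution of the subsetted files. A minimal sketch, assuming a NumPy array already read from one of the NetCDF files; the `etran` array and its dimensions here are hypothetical:

```python
import numpy as np

def daily_mean(data_3hourly, steps_per_day=8):
    """Average 3-hourly model output (8 steps/day) into daily means.

    data_3hourly: array of shape (time, ny, nx); the time axis must be a
    whole number of days.
    """
    t, ny, nx = data_3hourly.shape
    if t % steps_per_day:
        raise ValueError("time axis is not a whole number of days")
    return data_3hourly.reshape(
        t // steps_per_day, steps_per_day, ny, nx
    ).mean(axis=1)

# Hypothetical example: 2 days of 3-hourly ETRAN values on a 2x2 grid
etran = np.arange(2 * 8 * 2 * 2, dtype=float).reshape(16, 2, 2)
daily = daily_mean(etran)
print(daily.shape)  # (2, 2, 2)
```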
keywords:
Land surface model; NetCDF
published:
2022-08-29
Winogradoff, David; Chou, Han-Yi; Maffeo, Christopher; Aksimentiev, Aleksei
(2022)
Example scripts and configuration files needed to perform select simulations described in the manuscript "Percolation transition prescribes protein size-specific barrier to passive transport through the nuclear pore complex."
keywords:
Nuclear Pore Complex; simulation setup
published:
2021-02-01
These datasets provide the basis of our analysis in the paper "The Potential Impact of a Clean Energy Society on Air Quality". All datasets here are output from the CAM4-chem model. All simulations were run to steady state, and only the outputs used in the analysis are archived here.
keywords:
clean energy; ozone; particulates
published:
2021-10-13
Lyu, Fangzheng; Xu, Zewei; Ma, Xinlin; Wang, Shaohua; Li, Zhiyu; Wang, Shaowen
(2021)
Drainage network analysis is fundamental to understanding the characteristics of surface hydrology. Based on elevation data, drainage network analysis is often used to extract key hydrological features like drainage networks and streamlines. Limited by raster-based data models, conventional drainage network algorithms typically allow water to flow in 4 or 8 directions (surrounding grids) from a raster grid. To resolve this limitation, this paper describes a new vector-based method for drainage network analysis that allows water to flow in any direction around each location. The method is enabled by rapid advances in Light Detection and Ranging (LiDAR) remote sensing and high-performance computing. The drainage network analysis is conducted using a high-density point cloud instead of Digital Elevation Models (DEMs) at coarse resolutions. Our computational experiments show that the vector-based method can better capture water flows without limiting the number of directions due to imprecise DEMs. Our case study applies the method to Rowan County watershed, North Carolina in the US. After comparing the drainage networks and streamlines detected with corresponding reference data from US Geological Survey generated from the Geonet software, we find that the new method performs well in capturing the characteristics of water flows on landscape surfaces in order to form an accurate drainage network.
This dataset contains all the code, notebooks, and datasets used in the study conducted for the research publication titled "A Vector-Based Method for Drainage Network Analysis Based on LiDAR Data".
## What's Inside
A quick explanation of the components
* `A Vector Approach to Drainage Network Analysis Based on LiDAR Data.ipynb` is a notebook for finding the drainage network based on LiDAR data
* `Picture1.png` is a picture representing the pseudocode of our new algorithm
* `HPC` folder contains code for running the algorithm with sbatch on HPC
** `execute.sh` is a bash script that uses sbatch to conduct large-scale analysis with the algorithm
** `run.sh` is a bash script that calls `execute.sh` for large-scale calculation with the algorithm
** `run.py` contains the code implementing the algorithm
* `Rowan Creek Data` includes data that are used in the study
** `3_1.las` and `3_2.las` are the LiDAR data files that are used in the analysis presented in the paper. Users may use these files to reproduce our results, or replace them with their own LiDAR files to run this method over different areas
** `reference` folder includes reference data from USGS
*** `reference_3_1.tif` and `reference_3_2.tif` are reference data for the drainage system analysis retrieved from USGS.
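The core vector-based idea from the abstract, letting water flow toward the steepest-descent neighbor in any direction rather than in 4 or 8 raster directions, can be sketched as follows. This is an illustrative toy, not the `run.py` implementation; the synthetic point cloud and neighbor count are made up:

```python
import numpy as np

def steepest_descent_paths(points, k=8, max_steps=100):
    """Trace flow on a LiDAR-like point cloud.

    points: (n, 3) array of (x, y, z). From each point, flow moves to the
    lowest of its k nearest neighbours (in any direction), stopping at a
    local minimum. Returns the terminal point index for each start point.
    Brute-force neighbor search; a real implementation would use a
    spatial index.
    """
    xy, z = points[:, :2], points[:, 2]
    n = len(points)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # exclude self from neighbors
    nbrs = np.argsort(d, axis=1)[:, :k]
    terminal = np.empty(n, dtype=int)
    for start in range(n):
        cur = start
        for _ in range(max_steps):
            cand = nbrs[cur]
            nxt = cand[np.argmin(z[cand])]
            if z[nxt] >= z[cur]:           # local minimum: stop
                break
            cur = nxt
        terminal[start] = cur
    return terminal

# Toy surface: a tilted plane (z = x + y), so flow runs downslope
rng = np.random.default_rng(0)
pts = rng.random((50, 2))
cloud = np.column_stack([pts, pts[:, 0] + pts[:, 1]])
ends = steepest_descent_paths(cloud, k=6)
```

Every terminal point is at least as low as its start, since elevation strictly decreases along each traced path.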
keywords:
CyberGIS; Drainage System Analysis; LiDAR
published:
2022-06-22
Kang, Jeon-Young; Farkhad, Bita Fayaz; Chan, Man-pui Sally; Michels, Alexander; Albarracin, Dolores; Wang, Shaowen
(2022)
This dataset helps to investigate the Spatial Accessibility to HIV Testing, Treatment, and Prevention Services in Illinois and Chicago, USA.
The main components are population data, healthcare data, GTFS feeds, and road network data:
1) `GTFS` contains <a href="https://gtfs.org/">General Transit Feed Specification</a> (GTFS) data provided by the Chicago Transit Authority (CTA) through <a href="https://developers.google.com/transit/gtfs">Google's GTFS feeds</a>. Documentation defining the format and structure of the files that comprise a GTFS dataset is available at <a href="https://developers.google.com/transit/gtfs/reference?csw=1">https://developers.google.com/transit/gtfs/reference?csw=1</a>.
2) `HealthCare` contains shapefiles describing HIV healthcare providers in Chicago and Illinois respectively. The services come from <a href="https://locator.hiv.gov/">Locator.HIV.gov</a>.
3) `PopData` contains population data for Chicago and Illinois respectively. Data come from the American Community Survey and <a href="https://map.aidsvu.org/map">AIDSVu</a>. AIDSVu provides data on PLWH in Chicago at the census tract level for the year 2017 and in the State of Illinois at the county level for the year 2016. The American Community Survey (ACS) provided the number of people aged 15 to 64 at the census tract level for 2017 and at the county level for 2016. The ACS provides annually updated information on demographic and socioeconomic characteristics of people and housing in the U.S.
4) `RoadNetwork` contains the road networks for Chicago and Illinois respectively from <a href="https://www.openstreetmap.org/copyright">OpenStreetMap</a> using the Python <a href="https://osmnx.readthedocs.io/en/stable/">osmnx</a> package.
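A minimal sketch of reading GTFS stop locations, the kind of record the `GTFS` component contains. The sample rows below are made up; real feeds are zipped sets of such text files:

```python
import csv
import io

# A made-up excerpt in the stops.txt format defined by the GTFS spec
sample = """stop_id,stop_name,stop_lat,stop_lon
30001,Clark & Lake,41.8857,-87.6307
30002,State & Lake,41.8858,-87.6278
"""

# Map each stop_id to its (lat, lon) coordinate pair
stops = {
    row["stop_id"]: (float(row["stop_lat"]), float(row["stop_lon"]))
    for row in csv.DictReader(io.StringIO(sample))
}
print(stops["30001"])  # (41.8857, -87.6307)
```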
<b>The abstract for our paper is:</b>
Accomplishing the goals outlined in “Ending the HIV (Human Immunodeficiency Virus) Epidemic: A Plan for America Initiative” will require properly estimating and increasing access to HIV testing, treatment, and prevention services. In this research, a computational spatial method for estimating access was applied to measure distance to services from all points of a city or state while considering the size of the population in need for services as well as both driving and public transportation. Specifically, this study employed the enhanced two-step floating catchment area (E2SFCA) method to measure spatial accessibility to HIV testing, treatment (i.e., Ryan White HIV/AIDS program), and prevention (i.e., Pre-Exposure Prophylaxis [PrEP]) services. The method considered the spatial location of MSM (Men Who have Sex with Men), PLWH (People Living with HIV), and the general adult population 15-64 depending on what HIV services the U.S. Centers for Disease Control (CDC) recommends for each group. The study delineated service- and population-specific accessibility maps, demonstrating the method’s utility by analyzing data corresponding to the city of Chicago and the state of Illinois. Findings indicated health disparities in the south and the northwest of Chicago and particular areas in Illinois, as well as unique health disparities for public transportation compared to driving. The methodology details and computer code are shared for use in research and public policy.
keywords:
HIV; spatial accessibility; spatial analysis; public transportation; GIS
published:
2024-04-15
Lyu, Zhiheng; Lehan, Yao; Zhisheng, Wang; Chang, Qian; Zuochen, Wang; Jiahui, Li; Yufeng, Wang; Qian, Chen
(2024)
The dataset contains trajectories of Pt nanoparticles in 1.98 mM NaBH4 and NaCl, tracked under liquid-phase TEM. The coordinates (x, y) of nanoparticles are provided, together with the conversion factor that translates pixel size to actual distance. In the file, ∆t denotes the time interval and NaN indicates the absence of a value when the nanoparticle has not emerged or been tracked. The labeling of nanoparticles in the paper is also noted in the second row of the file.
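One common use of such trajectories is computing a mean squared displacement while skipping untracked (NaN) frames. A minimal sketch with a hypothetical trajectory; the file-reading and pixel-to-distance conversion steps are omitted:

```python
import numpy as np

def msd(xy, max_lag=None):
    """Mean squared displacement of one trajectory, ignoring NaN frames.

    xy: (t, 2) array of coordinates; NaN rows mark frames where the
    particle had not yet emerged or was not tracked. Returns msd[lag]
    averaged over all valid frame pairs at that lag.
    """
    t = len(xy)
    max_lag = max_lag or t - 1
    out = np.full(max_lag + 1, np.nan)
    out[0] = 0.0
    for lag in range(1, max_lag + 1):
        disp = xy[lag:] - xy[:-lag]
        sq = (disp ** 2).sum(axis=1)       # NaN where either frame is missing
        if np.any(~np.isnan(sq)):
            out[lag] = np.nanmean(sq)
    return out

# Hypothetical trajectory with one untracked frame
traj = np.array([[0.0, 0.0], [1.0, 0.0], [np.nan, np.nan], [3.0, 0.0]])
m = msd(traj)
print(m)  # [0. 1. 4. 9.]
```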
keywords:
nanomotor; liquid-phase TEM
published:
2022-03-23
Wang, Junren; Karakoc, Deniz Berfin; Konar, Megan
(2022)
This dataset is an estimation of county-to-county commodity deliveries through the cold chain in 2017.
For each county pair, the weight [kg] and value [$] of the cold-chain flow between origin and destination for SCTG 5 and SCTG 7 commodities are estimated by our model.
- SCTG 5 - Meat, poultry, fish, seafood, and their preparations
- SCTG 7 - Other prepared foodstuffs, fats, and oils
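Records of this shape, one row per county pair and commodity class, can be aggregated in a few lines. The FIPS codes and flow values below are made up for illustration:

```python
# Toy origin-destination records in the spirit of the dataset's county-pair
# rows (FIPS codes, weights, and values are invented)
flows = [
    # (origin_fips, dest_fips, sctg, weight_kg, value_usd)
    ("17019", "17113", "SCTG 5", 1200.0, 5400.0),
    ("17019", "17113", "SCTG 7", 800.0, 2100.0),
    ("17019", "17031", "SCTG 5", 500.0, 2300.0),
]

# Total cold-chain weight shipped out of each origin county
outbound = {}
for origin, dest, sctg, kg, usd in flows:
    outbound[origin] = outbound.get(origin, 0.0) + kg

print(outbound)  # {'17019': 2500.0}
```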
keywords:
food flows; cold chain; county-scale; United States; carbon footprint
published:
2021-02-28
Ghosh, Sudipta; Riemer, Nicole; Giuliani, Graziano; Giorgi , Filippo; Ganguly, Dilip; Dey, Sagnik
(2021)
This dataset contains the RegCM4 simulations used in the article " Implementation of dynamic ageing of carbonaceous aerosols in regional climate model RegCM". This dataset was used to investigate the impact of a new aging parameterisation scheme implemented in a regional climate model RegCM4. The dataset contains two sets of simulations: Expt_fix and Expt_dyn. It consists of the seasonal mean and daily mean values of the variables that were used to create the visualizations of this study. The Expt_fix and Expt_dyn dataset contain 34 and 38 NetCDF files, respectively. The CERES_vs_2expts_new.mat file is the comparison between CERES shortwave downward flux at the surface and same model outputs from two experiments for clear sky and all sky conditions.
--------------------------------------------------
The following information about the dataset was generated on 2021-01-08 by SUDIPTA GHOSH
<b>GENERAL INFORMATION</b>
<i>1. Date of data collection (single date, range, approximate date):</i> 2019-01-01 to 2019-12-31
<i>2. Geographic location of data collection:</i> Urbana-Champaign, Illinois, USA
<i>3. Information about funding sources that supported the collection of the data:</i> This work is supported by the MoEFCC under the NCAP-COALESCE project [Grant No. 14/10/2014-CC]. The first author acknowledges DST-INSPIRE fellowship [IF150055] and Fulbright-Kalam Climate Doctoral fellowship. N. R. acknowledges funding from NSF AGS-1254428 and DOE grant DE-SC0019192. Department of Science and Technology, Funds for Improvement of Science and Technology infrastructure in universities and higher educational institutions (DST-FIST) grant (SR/FST/ESII-016/2014) are acknowledged for the computing support.
<b>DATA & FILE OVERVIEW</b>
<i>1. File List:</i> The Expt_fix and Expt_dyn datasets contain the analysed seasonal means and daily means of the variables that were used to create the visualizations of this study, in 34 and 38 NetCDF files, respectively.
<i>2. Relationship between files, if important:</i> NA
<i>3. Additional related data collected that was not included in the current data package:</i> No
<b>METHODOLOGICAL INFORMATION</b>
<i>1. Description of methods used for collection/generation of data: </i>
The model RegCM4 code is freely available online from <a href="http://gforge.ictp.it/gf/project/regcm/">http://gforge.ictp.it/gf/project/regcm/</a>.
The anthropogenic aerosol emissions considered for the simulations are taken from the IIASA inventory. The data can be accessed from the <a href="http://clima-dods.ictp.it/regcm4/">http://clima-dods.ictp.it/regcm4/</a> website.
TRMM observed precipitation data can be accessed from the <a href="https://giovanni.gsfc.nasa.gov/giovanni/">https://giovanni.gsfc.nasa.gov/giovanni/</a> website.
CRU temperature data are available at <a href="https://crudata.uea.ac.uk/cru/data/hrg/">https://crudata.uea.ac.uk/cru/data/hrg/</a>.
CERES satellite surface shortwave downward fluxes are available at <a href="https://ceres.larc.nasa.gov/data/">https://ceres.larc.nasa.gov/data/</a>.
Input files for the RegCM4 model are archived on the <a href="http://clima-dods.ictp.it/regcm4/">http://clima-dods.ictp.it/regcm4/</a> website.
This dataset contains the RegCM4 simulations used in the article " Implementation of dynamic ageing of carbonaceous aerosols in regional climate model RegCM ". Two sets of simulations: Expt_fix and Expt_dyn consists of the output data . This dataset only contains the analysed seasonal mean and daily mean of the variables that have been used to create the visualizations of this study. Each of Expt_fix and Expt_dyn contains 34 and 38 NetCDF files respectively. This dataset was used to investigate the impact of a new aging parameterisation scheme implemented in a regional climate model RegCM4.
<i>2. Methods for processing the data:</i> Seasonal Mean and daily average values were extracted from 6-hourly model output.
<i>3. Instrument- or software-specific information needed to interpret the data:</i> CDO-1.7.1, Grads-2.0.a9, Matlab2016b
<i>4. Standards and calibration information, if appropriate:</i> NA
<i>5. Environmental/experimental conditions:</i> NA
<i>6. Describe any quality-assurance procedures performed on the data:</i> NA
<i>7. People involved with sample collection, processing, analysis and/or submission:</i> Sudipta Ghosh, Nicole Riemer, Graziano Giuliani, Filippo Giorgi, Dilip Ganguly, Sagnik Dey
<b>DATA-SPECIFIC INFORMATION FOR: Expt_fix_data.tar.gz</b>
<i>1. Number of variables:</i> 29
<i>2. Number of cases/rows:</i> NA
<i>3. Variable List:</i> Mass concentration (kg m-3) of BC, BC_HB, BC_HL, OC, OC_HB, OC_HL; Columnar burden (mg m-2) of BC, BC_HL, BC_HB, OC; Dry deposition flux (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to washout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to rainout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; AOD (unitless); precipitation (kg m-2 s-1); temperature (K); v-wind (m s-1); u-wind (m s-1); surface shortwave downward flux (W m-2); shortwave radiative forcing at the surface and top of atmosphere (W m-2)
<b>DATA-SPECIFIC INFORMATION FOR: Expt_dyn_data.tar.gz</b>
<i>1. Number of variables:</i> 30
<i>2. Number of cases/rows:</i> NA
<i>3. Variable List:</i> Mass concentration (kg m-3) of BC, BC_HB, BC_HL, OC, OC_HB, OC_HL; Columnar burden (mg m-2) of BC, BC_HL, BC_HB, OC; Dry deposition flux (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to washout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to rainout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; AOD (unitless); precipitation (kg m-2 s-1); temperature (K); v-wind (m s-1); u-wind (m s-1); surface shortwave downward flux (W m-2); shortwave radiative forcing at the surface and top of atmosphere (W m-2); ageingscale (s-1)
<b>DATA-SPECIFIC INFORMATION FOR: CERES_vs_2expts_new.mat</b>
<i>1. Number of variables:</i> 12
<i>2. Number of cases/rows:</i> NA
<i>3. Variable List:</i> Surface shortwave downward flux for clear sky (W m-2) for CERES, Expt_fix, Expt_dyn (for winter JF and monsoon JJAS seasons); surface shortwave downward flux for all-sky conditions (W m-2) for CERES, Expt_fix, Expt_dyn (for winter JF and monsoon JJAS seasons).
<b>NOTE:</b> The following information applies for all three (3) files:
<i> Missing data codes:</i> NA
<i>Specialized formats or other abbreviations used:</i> NA
keywords:
Carbonaceous aerosols; ageing parameterisation scheme; regional climate model; NetCDF
published:
2025-04-17
Mollenhauer, Michael; Pfaff, Wolfgang
(2025)
This dataset includes analysis code used to analyze the data involved with swapping photons between superconducting qubits in separate modules through a superconducting coaxial cable bus. The dataset includes Python code to model and plot the data, CAD designs of the modules that hold the superconducting qubits, and high-frequency simulation software files to model the electric fields of the superconducting circuits.
keywords:
superconducting qubits; quantum information; modular architecture
published:
2025-05-27
Rani, Sonia; Cao, Xi; Baptista, Alejandro E.; Hoffmann, Axel; Pfaff, Wolfgang
(2025)
This dataset contains all raw and processed data used to generate the figures in the main text and supplementary material of the paper "High dynamic-range quantum sensing of magnons and their dynamics using a superconducting qubit." The data can be used to reproduce the plots and validate the analysis. Accompanying Jupyter notebooks provide step-by-step analysis pipelines for figure generation. The dataset also includes drawings for the mechanical samples used to perform the experiment. In addition, the dataset provides ANSYS HFSS electromagnetic simulation files used to design and analyze the resonator structures and estimate field distributions.
keywords:
superconducting qubit; magnon sensing; hybrid quantum systems; spin-photon coupling; magnon decay; cavity QED
published:
2020-01-27
Morphologic data of dunes in the World's big rivers. Morphologic descriptors for large dunes include: dune height, dune mean leeside angle, dune maximum leeside angle, dune wavelength, dune flow depth (at the crest), and the fractional height of the maximum slope on the leeside for each dune. Morphologic descriptors for small dunes include: dune height, dune mean leeside angle, dune maximum leeside angle, dune wavelength, and dune flow depth (at the crest).
keywords:
dune; bedform; rivers; morphology;
published:
2025-07-31
Gibson, Jared; Jiang, Zhanzhi; Kou, Angela
(2025)
This repository includes data files and the analysis and plotting code for reproducing the figures in the paper "A scanning resonator for probing quantum coherent devices" (arXiv:2506.22620).
published:
2025-09-06
4D-STEM datasets for solution-treated (CrCoNi)93Al4Ti2Nb MEA in the [111], [112], and [114] zones. Data used for the Ultramicroscopy article "Differentiating electron diffuse scattering via 4D-STEM spatial fluctuation and correlation analysis in complex FCC alloys". Experiment details can be found in the paper. Data-specific details are listed in the Readme file.
keywords:
4D-STEM; MEA; Electron Diffuse-Scattering; FluCor
published:
2020-08-01
Horna Munoz, Daniel; Constantinescu, George; Rhoads, Bruce ; Lewis, Quinn; Sukhodolov, Alexander
(2020)
This data set shows how density effects have an important influence on mixing at a small river confluence. The data consist of results of simulations using a detached eddy simulation model.
keywords:
confluence; flow dynamics; density effects
published:
2019-12-17
Zhang, Yujie; Araiza Bravo, Rodrigo; Chitambar, Eric; Lorenz, Virginia
(2019)
This dataset provides the raw data, code and related figures for the paper, "Channel Activation of CHSH Nonlocality"
keywords:
Super-activation; Non-locality breaking channel
published:
2020-08-01
Rhoads, Bruce ; Lewis, Quinn; Sukhodolov, Alexander; Constantinescu, George
(2020)
This data set includes information used to determine patterns of mixing at three small confluences in East Central Illinois based on differences in the temperature or turbidity of the two confluent flows.
keywords:
mixing; confluences; flow structure
published:
2023-01-05
This is the data used in the paper "Forecasting West Nile Virus with Graph Neural Networks: Harnessing Spatial Dependence in Irregularly Sampled Geospatial Data". A preprint may be found at https://doi.org/10.48550/arXiv.2212.11367
Code from the Github repository https://github.com/adtonks/mosquito_GNN can be used with the data here to reproduce the paper's results. v1.0.0 of the code is also archived at https://doi.org/10.5281/zenodo.7897830
keywords:
west nile virus; machine learning; gnn; mosquito; trap; graph neural network; illinois; geospatial
published:
2024-05-30
Lyu, Fangzheng; Zhou, Lixuanwu; Park, Jinwoo; Baig, Furqan; Wang, Shaowen
(2024)
This dataset contains all the datasets used in the study conducted for the research publication titled "Mapping dynamic human sentiments of heat exposure with location-based social media data". This paper develops a cyberGIS framework to analyze and visualize human sentiments of heat exposure dynamically based on near real-time location-based social media (LBSM) data. Large volumes of low-cost LBSM data, together with a content analysis algorithm based on natural language processing, are used to generate heat exposure maps from human sentiments on social media.
## What’s inside - A quick explanation of the components of the zip file
* US folder includes the shapefile corresponding to the United States, with county as the spatial unit
* Census_tract folder includes the shapefile corresponding to Cook County, with census tract as the spatial unit
* data/data.txt includes instructions to retrieve the sample data from either Keeling or figshare
* geo/data20000.txt is the heat dictionary created in this paper; please refer to the corresponding publication for the data creation process
Jupyter notebook and code attached to this publication can be found at: https://github.com/cybergis/real_time_heat_exposure_with_LBSMD
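The dictionary-based content analysis described above can be sketched as simple term matching. The heat terms and posts below are illustrative stand-ins, not the paper's actual data20000.txt dictionary:

```python
import re

# A tiny stand-in for the paper's heat dictionary (terms are illustrative)
heat_terms = {"heat", "hot", "scorching", "sweltering", "humid"}

def heat_mentions(post):
    """Count heat-dictionary terms appearing in one social-media post."""
    tokens = re.findall(r"[a-z]+", post.lower())
    return sum(tok in heat_terms for tok in tokens)

posts = [
    "It is SO hot and humid in Chicago today",
    "Nice breeze by the lake this evening",
]
counts = [heat_mentions(p) for p in posts]
print(counts)  # [2, 0]
```

Counts like these, joined to each post's location, are the kind of signal that can be aggregated by spatial unit into a dynamic heat-exposure map.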
keywords:
CyberGIS; Heat Exposure; Location-based Social Media Data; Urban Heat
published:
2019-12-12
Kamuda, Mark; Huff, Kathryn
(2019)
This dataset contains gamma-ray spectra templates for a source interdiction and uranium enrichment measurement task. This dataset also contains Keras machine learning models trained using datasets created using these templates.
keywords:
gamma-ray spectroscopy; neural networks; machine learning; isotope identification; uranium enrichment; sodium iodide; NaI(Tl)
published:
2021-09-06
Airglow images and Meteor radar data used in the paper "Mesospheric gravity wave activity estimated via airglow imagery, multistatic meteor radar, and SABER data taken during the SIMONe–2018 campaign".
keywords:
airglow; meteor radar; gravity waves; momentum flux;
published:
2025-07-11
Zhixin, Zhang; Jinho, Lim; Haoyang, Ni; Jian-Min, Zuo; Axel, Hoffmann
(2025)
This dataset includes experimental data supporting the findings in the manuscript "Magnetostriction and Temperature Dependent Gilbert Damping in Boron Doped Fe80Ga20 Thin Films". It contains raw data for X-ray diffraction, high-resolution transmission electron microscopy, magnetic hysteresis loop measurements, magnetostriction measurements, and temperature-dependent magnetic damping measurements.
keywords:
magnetostriction; magnetic damping; magnetoelasticity; magnon-phonon coupling
published:
2025-03-12
Jeong, Gangwon; Villa, Umberto; Park, Seonyeong; Anastasio, Mark A.
(2025)
References
- Jeong, Gangwon, Umberto Villa, and Mark A. Anastasio. "Revisiting the joint estimation of initial pressure and speed-of-sound distributions in photoacoustic computed tomography with consideration of canonical object constraints." Photoacoustics (2025): 100700.
- Park, Seonyeong, et al. "Stochastic three-dimensional numerical phantoms to enable computational studies in quantitative optoacoustic computed tomography of breast cancer." Journal of biomedical optics 28.6 (2023): 066002-066002.
Overview
- This dataset includes 80 two-dimensional slices extracted from 3D numerical breast phantoms (NBPs) for photoacoustic computed tomography (PACT) studies. The anatomical structures of these NBPs were obtained using tools from the Virtual Imaging Clinical Trial for Regulatory Evaluation (VICTRE) project. The methods used to modify and extend the VICTRE NBPs for use in PACT studies are described in the publication cited above.
- The NBPs in this dataset represent the following four ACR BI-RADS breast composition categories:
> Type A - The breast is almost entirely fatty
> Type B - There are scattered areas of fibroglandular density in the breast
> Type C - The breast is heterogeneously dense
> Type D - The breast is extremely dense
- Each 2D slice is taken from a different 3D NBP, ensuring that no more than one slice comes from any single phantom.
File Name Format
- Each data file is stored as a .mat file. The filenames follow the format {type}{subject_id}.mat, where {type} indicates the breast type (A, B, C, or D) and {subject_id} is a unique identifier assigned to each sample. For example, in the filename D510022534.mat, "D" represents the breast type and "510022534" is the sample ID.
File Contents
- Each file contains the following variables:
> "type": Breast type
> "p0": Initial pressure distribution [Pa]
> "sos": Speed-of-sound map [mm/μs]
> "att": Acoustic attenuation (power-law prefactor) map [dB/ MHzʸ mm]
> "y": power-law exponent
> "pressure_lossless": Simulated noiseless pressure data obtained by numerically solving the first-order acoustic wave equation using the k-space pseudospectral method, under the assumption of a lossless medium (corresponding to Studies I, II, and III).
> "pressure_lossy": Simulated noiseless pressure data obtained by numerically solving the first-order acoustic wave equation using the k-space pseudospectral method, incorporating a power-law acoustic absorption model to account for medium losses (corresponding to Study IV).
* The pressure data were simulated using a ring-array transducer that consists of 512 receiving elements uniformly distributed along a ring with a radius of 72 mm.
* Note: These pressure data are noiseless simulations. In Studies II–IV of the referenced paper, additive Gaussian i.i.d. noise was added to the measurement data. Users may add similar noise to the provided data as needed for their own studies.
- In Study I, all spatial maps (e.g., sos) have dimensions of 512 × 512 pixels, with a pixel size of 0.32 mm × 0.32 mm.
- In Study II and Study III, all spatial maps (sos) have dimensions of 1024 × 1024 pixels, with a pixel size of 0.16 mm × 0.16 mm.
- In Study IV, both the sos and att maps have dimensions of 1024 × 1024 pixels, with a pixel size of 0.16 mm × 0.16 mm.
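Since the provided pressure data are noiseless, users adding Gaussian i.i.d. noise as suggested above might do so along these lines. The SNR-based scaling is one common convention, not necessarily the paper's exact noise model, and the array dimensions are hypothetical:

```python
import numpy as np

def add_gaussian_noise(pressure, snr_db, rng=None):
    """Add i.i.d. Gaussian noise to pressure data at a chosen SNR (dB).

    The noise standard deviation is set from the mean signal power; this
    is one common choice, not necessarily the referenced studies' exact
    noise level.
    """
    rng = rng or np.random.default_rng()
    signal_power = np.mean(pressure ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return pressure + rng.normal(0.0, np.sqrt(noise_power), pressure.shape)

# Hypothetical pressure traces: 512 ring elements x 100 time samples
rng = np.random.default_rng(0)
p = rng.standard_normal((512, 100))
noisy = add_gaussian_noise(p, snr_db=20.0, rng=rng)
```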
keywords:
Medical imaging; Photoacoustic computed tomography; Numerical phantom; Joint reconstruction