published: 2022-08-29
Example scripts and configuration files needed to perform select simulations described in the manuscript "Percolation transition prescribes protein size-specific barrier to passive transport through the nuclear pore complex."
keywords: Nuclear Pore Complex; simulation setup
published: 2022-08-06
This dataset consists of all the files and codes that are part of the manuscript (main text and supplement) titled "Spin-selective tunneling from nanowires of the candidate topological Kondo insulator SmB6". For detailed information on the individual files refer to the specific readme files.
keywords: Topology; Kondo Insulator; Spin; Scanning tunneling microscopy; antiferromagnetism
published: 2022-06-22
This dataset helps to investigate the Spatial Accessibility to HIV Testing, Treatment, and Prevention Services in Illinois and Chicago, USA. The main components are population data, healthcare data, GTFS feeds, and road network data:
1) `GTFS` contains GTFS (<a href="https://gtfs.org/">General Transit Feed Specification</a>) data provided by the Chicago Transit Authority (CTA) from <a href="https://developers.google.com/transit/gtfs">Google's GTFS feeds</a>. The documentation defines the format and structure of the files that comprise a GTFS dataset: <a href="https://developers.google.com/transit/gtfs/reference?csw=1">https://developers.google.com/transit/gtfs/reference?csw=1</a>.
2) `HealthCare` contains shapefiles describing HIV healthcare providers in Chicago and Illinois respectively. The services come from <a href="https://locator.hiv.gov/">Locator.HIV.gov</a>.
3) `PopData` contains population data for Chicago and Illinois respectively. Data come from the American Community Survey (ACS) and <a href="https://map.aidsvu.org/map">AIDSVu</a>. AIDSVu (https://map.aidsvu.org/map) provides data on people living with HIV (PLWH) in Chicago at the census tract level for the year 2017 and in the State of Illinois at the county level for the year 2016. The ACS provided the number of people aged 15 to 64 at the census tract level for the year 2017 and at the county level for the year 2016. The ACS provides annually updated information on demographic and socioeconomic characteristics of people and housing in the U.S.
4) `RoadNetwork` contains the road networks for Chicago and Illinois respectively, obtained from <a href="https://www.openstreetmap.org/copyright">OpenStreetMap</a> using the Python <a href="https://osmnx.readthedocs.io/en/stable/">osmnx</a> package.
<b>The abstract for our paper is:</b> Accomplishing the goals outlined in “Ending the HIV (Human Immunodeficiency Virus) Epidemic: A Plan for America Initiative” will require properly estimating and increasing access to HIV testing, treatment, and prevention services. In this research, a computational spatial method for estimating access was applied to measure distance to services from all points of a city or state while considering the size of the population in need for services as well as both driving and public transportation. Specifically, this study employed the enhanced two-step floating catchment area (E2SFCA) method to measure spatial accessibility to HIV testing, treatment (i.e., Ryan White HIV/AIDS program), and prevention (i.e., Pre-Exposure Prophylaxis [PrEP]) services. The method considered the spatial location of MSM (Men Who have Sex with Men), PLWH (People Living with HIV), and the general adult population 15-64 depending on what HIV services the U.S. Centers for Disease Control (CDC) recommends for each group. The study delineated service- and population-specific accessibility maps, demonstrating the method’s utility by analyzing data corresponding to the city of Chicago and the state of Illinois. Findings indicated health disparities in the south and the northwest of Chicago and particular areas in Illinois, as well as unique health disparities for public transportation compared to driving. The methodology details and computer code are shared for use in research and public policy.
keywords: HIV;spatial accessibility;spatial analysis;public transportation;GIS
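The abstract above names the enhanced two-step floating catchment area (E2SFCA) method. The following is a minimal sketch of the general two-step idea, not the authors' code: the function name, the dictionary inputs, and the travel-time zones with their weights are all hypothetical choices made for illustration.

```python
def e2sfca(supply, demand, travel_time, zone_weights):
    """Sketch of E2SFCA accessibility scores.

    supply:       {provider_id: capacity}            (hypothetical input layout)
    demand:       {location_id: population in need}
    travel_time:  {(location_id, provider_id): minutes}
    zone_weights: [(max_minutes, weight), ...] sorted by max_minutes;
                  these zones and weights are illustrative assumptions.
    """
    def weight(minutes):
        # Discrete distance decay: the first zone containing the time wins.
        for max_minutes, zone_weight in zone_weights:
            if minutes <= max_minutes:
                return zone_weight
        return 0.0  # outside the largest catchment

    # Step 1: supply-to-demand ratio for each provider, with demand
    # weighted by the travel-time zone it falls in.
    ratio = {}
    for provider, capacity in supply.items():
        weighted_population = sum(
            population * weight(travel_time[(location, provider)])
            for location, population in demand.items()
            if (location, provider) in travel_time
        )
        ratio[provider] = capacity / weighted_population if weighted_population else 0.0

    # Step 2: each location sums the weighted ratios of reachable providers.
    return {
        location: sum(
            ratio[provider] * weight(travel_time[(location, provider)])
            for provider in supply
            if (location, provider) in travel_time
        )
        for location in demand
    }


scores = e2sfca(
    supply={"clinic": 10},
    demand={"tract_a": 100, "tract_b": 100},
    travel_time={("tract_a", "clinic"): 5, ("tract_b", "clinic"): 25},
    zone_weights=[(10, 1.0), (20, 0.5), (30, 0.25)],
)
print(scores)
```

In this toy case the nearer tract receives a higher score than the farther one, which is the behavior the distance-decay weights are meant to produce. Running the same function once with driving times and once with transit times would give mode-specific scores.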
published: 2022-03-20
Data for "Generic character of charge and spin density waves in superconducting cuprates".
- Neutron scattering data for SDW
- RSXS scans of CDW of LESCO x=0.10, 0.125, 0.15, 0.17, 0.20 at various temperatures
- Temperature dependence of CDW peak intensity, correlation length, Qcdw (Lorentzian fit, S(q,T) fit, Landau-Ginzburg fit)
- XAS data of LESCO x=0.10, 0.125, 0.15, 0.17, 0.20
published: 2022-04-15
This dataset is provided to support the statements in Kim, H., and R.Y. Makhnenko. 2022. "Evaluation of CO2 sealing potential of heterogeneous Eau Claire shale". Journal of the Geological Society. In geologic carbon dioxide (CO2) storage in deep saline aquifers, buoyant CO2 tends to float upwards in reservoirs overlain by low-permeability formations called caprocks. Caprocks should serve as barriers to potential CO2 leakage, which can happen through diffusion loss and permeation through faults, fractures, or pore spaces. Leakage through intact caprock depends mainly on its permeability and CO2 breakthrough pressure, and is affected by heterogeneities in the material. Here, we study the sealing potential of a caprock from the Illinois Basin, the Eau Claire shale, with sandy and shaly fractions distinguished via electron microscopy and grain/pore size and surface area characterization. Direct measurements of the permeability of the sandy shale provide values of ~10^-15 m^2, while clayey specimens are three orders of magnitude less permeable. The CO2 breakthrough pressure under in-situ stress conditions is 0.1 MPa for the sandy shale and 0.4 MPa for the clayey counterpart; these values are higher than those predicted by the porosimetry methods performed on the unconfined specimens. Sandy Eau Claire shale would allow penetration of large CO2 volumes at low overpressures, while the clayey formation can serve as a caprock in the absence of faults and fractures in it.
keywords: Geologic carbon storage; Caprock; Shale; CO2 breakthrough pressure; Porosimetry.
published: 2022-05-26
The data files are for the paper entitled: Long-lifetime spin excitations near domain walls in 1T-TaS2 to be published in PNAS. The data was obtained on a 300 mK custom designed Unisoku scanning tunneling microscope using the Nanonis module. All the data files have been named based on the Figure numbers that they represent.
keywords: Mott Insulator; Spins; Charge Density Wave; Domain walls; Long lifetime
published: 2022-04-19
This data repository includes the features and the trained backbone parameters used in the ICLR 2022 Paper "On the Importance of Firth Bias Reduction in Few-Shot Classification". The code accompanying this data is open-source and available at https://github.com/ehsansaleh/firth_bias_reduction
The code and the data have three modules:
1. The "code_firth" module (10 files) relates to the basic ResNet backbones and logistic classifiers (e.g., Figures 2 and 3 in the main paper).
2. The "code_s2m2rf" module (2 files) relates to the S2M2R feature backbones and cosine classifiers (e.g., Figure 4 in the main paper).
3. The "code_dcf" module (3 files) relates to the few-shot Distribution Calibration (DC) method (e.g., Table 1 in the main paper).
The relevant files for each module have the module name as a prefix in their name.
1. For instance, the "code_dcf_features.tar" file should be placed in the "features" directory of the "code_dcf" module.
2. As another example, "code_firth_features_cifarfs_novel.tar" should be placed in the "features" directory of the "code_firth" module, and it includes the features extracted from the novel split of mini-ImageNet dataset.
Each tarball should be extracted in its relevant directory, and the md5 checksums of the extracted files are also provided in the open-source code repository for verification. Please note that the actual datasets of images are not included here (since we do not own those datasets). However, helper scripts for automatically downloading the original datasets are also provided in every module and sub-directory of the GitHub code repository.
keywords: Computer Vision; Few-Shot Classification; Few-Shot Learning; Firth Bias Reduction
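The extract-then-verify step described above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the function names and the layout of the `expected_md5` mapping are assumptions for illustration, and the real checksum values would have to be taken from the code repository.

```python
import hashlib
import tarfile
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def extract_and_verify(tar_path, target_dir, expected_md5):
    """Extract a tarball into target_dir and check md5 checksums.

    expected_md5 is a hypothetical mapping {relative_path: hex_digest}.
    Returns the relative paths that are missing or do not match.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        archive.extractall(target)  # only extract archives you trust
    return [
        relative_path
        for relative_path, digest in expected_md5.items()
        if not (target / relative_path).is_file()
        or md5_of_file(target / relative_path) != digest
    ]
```

An empty list from `extract_and_verify` means every listed file was found with the expected digest.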
published: 2022-04-11
This data set contains all the map data used for "Quantifying transportation energy vulnerability and its spatial patterns in the United States". The multiple dimensions (i.e., exposure, sensitivity, adaptive capacity) of transportation energy vulnerability (TEV) at the census tract level in the United States, the changes in TEV with electric vehicles adoption, and the detailed data for Chicago, Los Angeles, and New York are in the dataset.
keywords: Transport energy; Vulnerability; Fuel costs; Electric vehicles
published: 2022-03-25
Ground-based radar data sets collected during the 2013 NASA EVEX Campaign, conducted on Roi-Namur island of the Kwajalein Atoll in the Republic of the Marshall Islands, are deposited in this databank. Radar data were collected with IRIS VHF and ALTAIR VHF/UHF systems.
published: 2022-03-23
This dataset is an estimate of county-to-county commodity delivery through the cold chain in 2017. For each county pair, the weight [kg] and value [$] of the cold chain flow between origin and destination for SCTG 5 and SCTG 7 commodities are estimated by our model.
- SCTG 5 - Meat, poultry, fish, seafood, and their preparations
- SCTG 7 - Other prepared foodstuffs, fats, and oils
keywords: food flows; cold chain; county-scale; United States; carbon footprint
published: 2022-02-14
This dataset contains simulation results from the numerical model PartMC-MOSAIC used in the article "Quantifying the effects of mixing state on aerosol optical properties". This article has been submitted to the journal Atmospheric Chemistry and Physics. There are 100 scenario directories in this dataset, numbered 00-99. Each scenario contains 25 NetCDF files of hourly output from PartMC-MOSAIC simulations containing the simulated gas and particle information. The data was produced using version 2.5.0 of PartMC-MOSAIC. Instructions to compile and run PartMC-MOSAIC are available at https://github.com/compdyn/partmc. The chemistry code MOSAIC is available by request from Rahul.Zaveri@pnl.gov. For more details on reproducing the cases, please contact nriemer@illinois.edu and yuyao3@illinois.edu.
keywords: Aerosol mixing state; Aerosol optical properties; Mie calculation; Black Carbon
published: 2022-02-07
This dataset provides estimates of agricultural and food commodity flows [kg] between all county pairs within the United States for the years 2007, 2012, and 2017. The database provides 206.3 million data points, since pairwise information is provided between 3134 counties, for 7 commodity categories, and 3 time periods. The commodity categories correspond to the Standard Classification of Transported Goods (SCTG) and are:
- SCTG 1: live animals and fish
- SCTG 2: cereal grains
- SCTG 3: agricultural products (except for animal feed, cereal grains, and forage products)
- SCTG 4: animal feed, eggs, honey, and other products of animal origin
- SCTG 5: meat, poultry, fish, seafood, and their preparations
- SCTG 6: milled grain products and preparations, and bakery products
- SCTG 7: other prepared foodstuffs, fats and oils
For additional information, please see the related paper by Karakoc et al. (2022) in Environmental Research Letters.
keywords: food flows; high-resolution; county-scale; time-series; United States
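The 206.3 million figure quoted above can be reproduced with a short calculation. The sketch assumes that "pairwise" means ordered origin-destination pairs including flows within a single county, which is the reading that matches the stated total.

```python
counties = 3134
commodity_categories = 7   # SCTG 1 through SCTG 7
time_periods = 3           # 2007, 2012, 2017

# Assumption: ordered (origin, destination) pairs, self-pairs included.
county_pairs = counties * counties
data_points = county_pairs * commodity_categories * time_periods

print(f"{data_points:,}")                     # 206,261,076
print(f"{data_points / 1e6:.1f} million")     # 206.3 million
```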
published: 2022-01-31
This dataset contains results from WRF simulations over northern South America. The Orinoco Low-Level Jet (OLLJ) and the Cross-Equatorial Moisture Transport are important circulation structures of the climate of tropical South America. We explore the sensitivity of the OLLJ and cross-equatorial transport to the representation of surface fluxes and turbulence by using two different Land Surface Model (LSM) schemes (Noah and CLM) and three Planetary Boundary Layer (PBL) schemes (YSU, QNSE and MYNN).
keywords: WRF; Orinoco LLJ; precipitation
published: 2021-11-23
This dataset contains simulation results from PartMC-MOSAIC-CAPRAM used in the article "Evaluating the impacts of cloud processing on resuspended aerosol particles after cloud evaporation using a particle-resolved model". In this V2, there are eight folders: one for the urban plume simulation that provides the initial particle population for cloud processing, four for the four cloud cycles simulated, and two for the coagulation cases. Within the urban plume simulation, there are 25 NetCDF files of hourly output from PartMC-MOSAIC simulations containing the gas and particle information. Within the four cloud cycle folders, there are 25 subdirectories that contain the cloud processing results for the aerosol population from the urban plume environment. For each subdirectory, there are 31 NetCDF files, output every minute from PartMC-MOSAIC-CAPRAM simulations, containing aerosol and gas information after aqueous chemistry. Another two folders are for the cases considering Brownian coagulation and sedimentation coalescence. Each contains 93 NetCDF files, produced by repeating the 30-minute simulations three times to account for the randomness of coagulation. The low polluted case folder includes the simulated cloud processing results for 25 urban plume cases with lower aerosol number concentration. This dataset was used to investigate the effects of cloud processing on aerosol mixing state and CCN properties.
keywords: cloud process; coagulation; aqueous chemistry; aerosol mixing state; CCN
published: 2021-11-04
This dataset contains all the data for the results section of the study presented in the paper entitled "Chemistry Across Multiple Phases (CAMP) version 1.0: An integrated multi-phase chemistry model" submitted to Geoscientific Model Development (GMD). In this paper, two sets of simulations were run to test CAMP, and their results are included here. This consists of (1) box model inputs and outputs presented in Section 4.2 for modal, binned and particle-resolved simulations to compare the application of identical chemical mechanisms to different aerosol representations and (2) the 3D Eulerian output presented in Section 4.3.
keywords: Atmospheric chemistry; Aerosols and particles; Numerical Modeling
published: 2021-10-13
Drainage network analysis is fundamental to understanding the characteristics of surface hydrology. Based on elevation data, drainage network analysis is often used to extract key hydrological features like drainage networks and streamlines. Limited by raster-based data models, conventional drainage network algorithms typically allow water to flow in 4 or 8 directions (surrounding grids) from a raster grid. To resolve this limitation, this paper describes a new vector-based method for drainage network analysis that allows water to flow in any direction around each location. The method is enabled by rapid advances in Light Detection and Ranging (LiDAR) remote sensing and high-performance computing. The drainage network analysis is conducted using a high-density point cloud instead of Digital Elevation Models (DEMs) at coarse resolutions. Our computational experiments show that the vector-based method can better capture water flows without limiting the number of directions due to imprecise DEMs. Our case study applies the method to Rowan County watershed, North Carolina in the US. After comparing the drainage networks and streamlines detected with corresponding reference data from US Geological Survey generated from the Geonet software, we find that the new method performs well in capturing the characteristics of water flows on landscape surfaces in order to form an accurate drainage network. This dataset contains all the code, notebooks, datasets used in the study conducted for the research publication titled " A Vector-Based Method for Drainage Network Analysis Based on LiDAR Data ". 
## What's Inside
A quick explanation of the components:
* `A Vector Approach to Drainage Network Analysis Based on LiDAR Data.ipynb` is a notebook for finding the drainage network based on LiDAR data
* `Picture1.png` is a picture representing the pseudocode of our new algorithm
* `HPC` folder contains code for running the algorithm with sbatch in HPC
** `execute.sh` is a bash script that uses sbatch to conduct large-scale analysis for the algorithm
** `run.sh` is a bash script that calls `execute.sh` for large-scale calculation for the algorithm
** `run.py` includes the code implementing the algorithm
* `Rowan Creek Data` includes data that are used in the study
** `3_1.las` and `3_2.las` are the LiDAR data files used in the analysis presented in the paper. Users may use these data files to reproduce our results and may replace them with their own LiDAR files to run this method over different areas
** `reference` folder includes reference data from USGS
*** `reference_3_1.tif` and `reference_3_2.tif` are reference data for the drainage system analysis retrieved from USGS.
keywords: CyberGIS; Drainage System Analysis; LiDAR
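For contrast with the vector-based method, the conventional raster restriction mentioned in the description (water leaving a cell toward one of 8 neighboring cells) can be sketched as below. This illustrates the common steepest-descent "D8" rule and is not code from the dataset; the function name and the toy elevation grid are made up for the example.

```python
import math

# The 8 neighbor offsets (row, column) around a raster cell.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                    (0, -1),           (0, 1),
                    (1, -1),  (1, 0),  (1, 1)]


def d8_flow_direction(elevation, row, col, cell_size=1.0):
    """Return the (d_row, d_col) offset of the steepest downhill neighbor.

    Returns None when no neighbor is lower (a pit or flat cell).
    """
    best_offset, best_slope = None, 0.0
    for d_row, d_col in NEIGHBOR_OFFSETS:
        r, c = row + d_row, col + d_col
        if 0 <= r < len(elevation) and 0 <= c < len(elevation[0]):
            # Diagonal neighbors are farther away than edge neighbors.
            distance = cell_size * math.hypot(d_row, d_col)
            slope = (elevation[row][col] - elevation[r][c]) / distance
            if slope > best_slope:
                best_offset, best_slope = (d_row, d_col), slope
    return best_offset


grid = [[9, 8, 7],
        [8, 5, 4],
        [7, 4, 1]]
print(d8_flow_direction(grid, 1, 1))  # (1, 1): flow goes to the lower-right cell
print(d8_flow_direction(grid, 2, 2))  # None: the corner cell is a pit
```

Because the result is always one of eight fixed offsets, flow paths on a raster can only turn in 45-degree steps, which is the limitation the description says the point-cloud approach removes.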
published: 2021-10-04
This dataset contains all the necessary information to recreate the study presented in the paper entitled "Learning coagulation processes with combinatorially-invariant neural networks". This consists of (1) the aggregated output files used for machine learning, (2) the machine learning codes used to learn the presented models, (3) the PartMC model source code that was used to generate the simulation data and (4) the Python scripts used to construct the scenario library for training and testing simulations. This data was used to investigate a method (combinatorially-invariant neural network) for learning the aerosol process of coagulation. This data may be useful for application of other methods.
keywords: Machine learning; Atmospheric chemistry; Particle-resolved modeling; Coagulation; Atmospheric Science
published: 2021-08-15
This data set contains mass spectrometry data used for the publication "mspack: efficient lossless and lossy mass spectrometry data compression".
keywords: mass-spectrometry data; compression; proteomics
published: 2021-08-04
This dataset contains data derived from large-scale particle velocimetry measurements obtained at the confluence of the Saline Branch and an unnamed tributary in Illinois. The data were collected using two cameras positioned about the confluence, one mounted on a cable and the other mounted on a tripod. A description of the content of the files can be found in Description of Files.rtf.
keywords: confluence; hydrodynamics; LSPIV; flow structure; stagnation
published: 2021-06-17
Model output dataset (6-hourly) from the Weather Research and Forecasting (WRF) model simulations over South America with the added capability of water vapor tracers to track the moisture that originates over the Amazon and the La Plata river basins. The simulations were performed for the period 2003-2013 at 20-km horizontal resolution, fully coupled with the Noah-MP land surface model. A limited number of original output variables, sufficient for reproducing the analyses in papers that cite this dataset, is included here. The attached wrfout_southamerica_readme.txt contains detailed information about the file format and variables. For the complete model dataset, contact francina@illinois.edu.
keywords: WRF; Amazon; La Plata; South America; Numerical tracers
published: 2021-05-14
This is the complete dataset for the "Anomalous density fluctuations in a strange metal" Proceedings of the National Academy of Sciences publication (https://doi.org/10.1073/pnas.1721495115). This is an integration of the Zenodo dataset which includes raw M-EELS data.
<b>METHODOLOGICAL INFORMATION</b>
1. Description of methods used for collection/generation of data: Data have been collected with an M-EELS instrument and according to the data acquisition protocol described in the original PNAS publication and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026)
2. Methods for processing the data: Raw data were collected with a channeltron-based M-EELS apparatus described in the reference PNAS publication and analyzed according to the procedure outlined both in the PNAS paper and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026). The raw M-EELS spectra at each momentum have been subject to minor data processing involving: (a) averaging of different acquisitions at the same conditions, (b) energy binning, (c) division by an effective Coulomb matrix element (which yields a structure factor S(q,\omega)), (d) antisymmetrization (which yields the imaginary chi). All these procedures are described in the PNAS paper.
3. Instrument- or software-specific information needed to interpret the data: These data are simple .txt or .dat files which can be read with any standard data analysis software, notably Python notebooks, MATLAB, Origin, IgorPro, and others. We do not include scripts in order to provide maximum flexibility.
4. Relationship between files, if important: Raw data, structure factors, and imaginary chi are divided into separate folders.
<b>DATA-SPECIFIC INFORMATION</b>
There are 8 folders within the Data_public_deposition_v1.zip. Each folder contains the data needed to create the corresponding figure in the publication.
<b>1. Fig1:</b> This folder contains 21 DAT files needed to plot the theory data in panels C and D, following this naming convention: [chiA]or[chiB]or[Pi]_q_number.dat, where chiA is the imaginary RPA charge susceptibility with a Coulomb interaction of electronically weakly coupled layers; chiB is the imaginary RPA charge susceptibility with the usual 4\pi e^2/q^2 Coulomb interaction; Pi is the imaginary Lindhard polarizability; q is momentum in reciprocal lattice units; number is the numerical momentum value in reciprocal lattice units.
<b>2. Fig2:</b> Files needed to plot Fig. 2 of the PNAS paper. Contains 3 folders as listed below. The files in this folder are named following this convention: Bi2212_295K_(1,-1)_50eV_161107_q_number_2.16_avg.dat, where 295K is the sample temperature; (1,-1) is the momentum direction in reciprocal lattice units; 50eV is the incident electron beam energy; 161107 is the start date of the experiment in yymmdd format; q is the momentum; number is the momentum in reciprocal lattice units; 2.16 is the energy range covered by the data in eV; avg identifies averaged data. ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra.
<b>3. Fig3:</b> Files needed to plot Fig. 3 of the PNAS paper. The OP/OD prefix identifies optimally doped or overdoped sample data, respectively. ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra.
<b>4. Fig4:</b> Files needed to plot Fig. 4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted according to the fit procedure described in the manuscript and at all momenta. ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra.
<b>5. FigS1:</b> Files needed to plot Fig. S1 of the PNAS paper. There are 5 files in this folder. DAT files are M-EELS data following the prior naming convention, while the two .txt files are digitized data from N. Nücker, U. Eckern, J. Fink, and P. Müller, Long-Wavelength Collective Excitations of Charge Carriers in High-Tc Superconductors, Phys. Rev. B 44, 7155(R) (1991), and K. H. G. Schulte, The interplay of Spectroscopy and Correlated Materials, Ph.D. thesis, University of Groningen (2002).
<b>6. FigS2:</b> Files needed to plot Fig. S2 of the PNAS paper. ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra.
<b>7. FigS3:</b> Files needed to plot Fig. S3 of the PNAS paper. There are 2 files in this folder: 20K_phi_0_q_0.dat is an M-EELS raw intensity at zero momentum transfer on Bi2212 at 20 K; 295K_phi_0_q_0.dat is an M-EELS raw intensity at zero momentum transfer on Bi2212 at 295 K.
<b>8. FigS4:</b> Files needed to plot Fig. S4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted according to the fit procedure described in the manuscript and at all momenta. ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra.
keywords: Momentum resolved electron energy loss spectroscopy (M-EELS); cuprates; plasmons; strange metal
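The antisymmetrization step (d) in the processing list can be illustrated with a small sketch. It assumes the structure factor is sampled on an energy grid that is symmetric about zero and returns the difference S(q, ω) − S(q, −ω); the prefactor and sign that turn this difference into the imaginary susceptibility depend on convention and are deliberately left out. The function name and the sample numbers are hypothetical, and this is not the authors' processing script (the dataset states that none is included).

```python
def antisymmetrize(energies, structure_factor):
    """Return S(w) - S(-w) for each energy w on a symmetric grid.

    energies must be sorted ascending and symmetric about zero, so that
    the mirror of index i is index (n - 1 - i). The result is proportional
    to the imaginary susceptibility up to a convention-dependent factor.
    """
    n = len(energies)
    if n != len(structure_factor):
        raise ValueError("energies and structure_factor must have equal length")
    for i in range(n):
        if abs(energies[i] + energies[n - 1 - i]) > 1e-9:
            raise ValueError("energy grid is not symmetric about zero")
    return [structure_factor[i] - structure_factor[n - 1 - i] for i in range(n)]


# Toy values, not taken from the dataset.
print(antisymmetrize([-1.0, 0.0, 1.0], [1.0, 2.0, 4.0]))  # [-3.0, 0.0, 3.0]
```

The output is odd in energy by construction, so any component of the measured spectrum that is even in energy cancels out.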
published: 2021-05-10
This dataset contains the emulated global multi-model urban daily temperature projections under RCP 8.5 scenario. The dataset is derived from the study "Large model structural uncertainty in global projections of urban heat waves" (XXXX). Details about this dataset and the local urban climate emulator are described in the article. This dataset documents the global urban daily temperatures of 17 CMIP5 Earth system models for 2006-2015 and 2061-2070. This dataset may be useful for multiple communities regarding urban climate change, heat waves, impacts, vulnerability, risks, and adaptation applications.
keywords: Urban heat waves; CMIP; urban warming; heat stress; urban climate change
published: 2021-04-29
Global assessments of climate extremes typically do not account for the unique characteristics of individual crops. A consistent definition of the exposure of specific crops to extreme weather would enable agriculturally-relevant hazard quantification. We introduce the Agriculturally-Relevant Exposure to Shocks (ARES) model, a novel database of both the temperature and moisture extremes facing individual crops by explicitly accounting for crop characteristics. Specifically, we estimate crop-specific temperature and moisture shocks during the growing season for a 0.25-degree spatial grid and daily time scale from 1961-2014 globally for 17 crops. The resulting database presented here provides annual crop- and event-specific exposure rates. Both gridded and country-level exposure rates are provided for each of the 17 crops. Our results provide new insights into the changes in the magnitude as well as spatial and temporal distribution of extreme events that impact crops over the past half-century. For additional information, please see the related paper by Jackson et al. (2021) in Environmental Research Letters.
keywords: Crop-specific; weather extremes; temperature; moisture; global; gridded; time series