Dataset Search

Displaying 1 - 25 of 91 in total

Illinois Data Bank Dataset Search Results

Results

published: 2026-05-07

Curated PI3K (Phosphoinositide 3-kinase) Network

Park, Minhyuk; Chacko, George (2026)

This network is a curated version of a network created by harvesting citing and cited articles around Whitman et al. (1988) Nature, 332(6165):644–646. For further details refer to https://databank.illinois.edu/datasets/IDB-4897629. Curation was performed by removing nodes (articles identified by Dimensions publication ids) whose year or DOI record was missing from the Dimensions database and retaining the largest connected component of the resulting network. Integer ids were generated by the authors to replace the Dimensions ids. Access to the raw data requires a license from Digital Science. The original pi3k network contains 17,970,340 nodes, of which only 17,508,111 (97.42%) have both year and DOI information. In this curated version, 127,255,020 edges were reduced to 125,118,817 edges (98.32%). The edges are represented with two columns in the file, where the "source_iid" column represents the citing node and the "target_iid" column represents the cited node. Restricting the original pi3k network to only those nodes with both year and DOI information results in a graph that has 21,469 connected components, where the largest connected component has 17,486,619 nodes (97.31%). Thus, this network represents 97.31% of the nodes and 98.32% of the edges in the original network. The authors thank Digital Science for supporting this project through access to the Dimensions database.

keywords: pi3k citation network;
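
The curation steps described for this network (drop nodes lacking year or DOI, keep the largest connected component, reassign integer ids) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function name `curate`, the `has_year_and_doi` lookup, and the choice to treat edges as undirected when finding components are all assumptions.

```python
from collections import defaultdict


def curate(edges, has_year_and_doi):
    """Return (curated_edges, id_map) for a directed edge list.

    edges: iterable of (citing, cited) node ids.
    has_year_and_doi: dict mapping node id -> bool (hypothetical metadata lookup).
    """
    # Step 1: keep only edges whose endpoints both have year and DOI records.
    kept = [(s, t) for s, t in edges
            if has_year_and_doi.get(s) and has_year_and_doi.get(t)]

    # Step 2: union-find over the undirected view of the kept edges.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, t in kept:
        parent[find(s)] = find(t)

    # Step 3: find the largest connected component.
    sizes = defaultdict(int)
    for node in parent:
        sizes[find(node)] += 1
    if not sizes:
        return [], {}
    biggest = max(sizes, key=sizes.get)

    # Step 4: replace original ids with zero-based integer ids.
    nodes = sorted(n for n in parent if find(n) == biggest)
    id_map = {n: i for i, n in enumerate(nodes)}
    curated = [(id_map[s], id_map[t]) for s, t in kept if find(s) == biggest]
    return curated, id_map
```

At the scale reported above (over 100 million edges), an in-memory Python sketch like this would likely be too slow; it is meant only to make the order of the steps concrete.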

published: 2026-05-06

Data for "Modeling citations and cartels"

Park, Minhyuk; Yi, Haotian; Chen, Ian; Warnow, Tandy; Chacko, George (2026)

The dataset contains sample data from those generated for the manuscript "Modeling citations and cartels" by Park et al. (2026), who describe the use of the SASCA-ReSA agent-based model to simulate the growth of citation networks and mimic citation cartels. The manuscript is presently under review. SASCA-ReSA is the latest stage in a series of progressively complex models of citation dynamics (Chacko et al. 2026, Applied Network Science; Park et al. 2025, Proceedings of the XIV International Conference on Complex Networks and their Applications; Park et al. 2025, MetaRoR). The model is implemented for high performance computing environments and all the results were generated on the Illinois Campus Cluster. The standard simulation reported in this manuscript results in roughly 1.2M nodes. The input to a simulation is a seed network, a configuration file, and real-world distributions for the number of references made per article and the count of authors per article. The output of a simulation is a larger citation network that includes the input network. Details of the model are described in the manuscript and instructions on how to use the software are available on the SASCA-ReSA GitHub site. We have included annotated nodelists from three different simulations (a-c) and two further files (d, e):
a) bsl1 (bsl1.csv.tar.xz): has 1,193,102 rows, output of a standard simulation.
b) p5_1 (p5_1.csv.tar.xz): has 1,193,102 rows, output of a standard simulation with 5 agents "planted" in year 1 of the simulation.
c) ps5_1 (ps5_1.csv.tar.xz): has 1,193,102 rows, output of a standard simulation with one agent planted in each of the first five years of a simulation.
d) sample_config.ini: contains configuration parameters for a simulation.
e) louvain.parquet.gz: has 160,714,032 rows and two columns, node_id and cluster_id, with a header row; the data represent a Louvain clustering of the ABM161 network (https://doi.org/10.13012/B2IDB-9265079_V1). It was generated using the louvain module through kuzu and compressed using the to_parquet method of pandas with gzip internal compression. The largest cluster (cluster id 5) has 81,675,241 nodes. This network was generated under the SASCA-ReS model.

keywords: citation dynamics; agent-based models

published: 2026-04-29

Anatomy-Matched Multi-Variant Stochastic Optoacoustic Numerical Breast Phantoms and Simulated OAT Measurement Data

Park, Seonyeong; Jeong, Gangwon; Villa, Umberto; Anastasio, Mark (2026)

This dataset is a subset of a companion dataset to the manuscript: Seonyeong Park, Gangwon Jeong, Umberto Villa, Mark A. Anastasio, "A Virtual Imaging Framework for Three-Dimensional Quantitative Optoacoustic Tomography Using Stochastic Numerical Breast Phantoms," arXiv preprint arXiv:2510.00189 (2025), https://doi.org/10.48550/arXiv.2510.00189. This subset was specifically used in the following publication: Refik Mert Cam, Seonyeong Park, Umberto Villa, Mark A. Anastasio, "Application of a Virtual Imaging Framework for Investigating a Deep Learning-based Reconstruction Method for 3D Quantitative Photoacoustic Computed Tomography," Photoacoustics 100792 (2025), https://doi.org/10.1016/j.pacs.2025.100792. The dataset contains 64 sets of three-dimensional (3D) numerical breast phantoms (NBPs) for use in virtual imaging studies of optoacoustic tomography (OAT), along with the corresponding simulated multi-wavelength optical fluence distributions, induced initial pressure distributions, and OAT measurement data. Each set corresponds to a distinct breast anatomy and includes four anatomy-matched variants: (i) a healthy breast with Fitzpatrick skin tone 1, and (ii-iv) lesion-inserted breasts with Fitzpatrick skin tones 1, 3, and 5. More detailed information is provided in the accompanying README.txt file.

keywords: Virtual imaging; In silico imaging; Numerical breast phantoms; Optoacoustic tomography; Photoacoustic computed tomography; Breast imaging

published: 2016-05-19

New York City Taxi Trip Data (2010-2013)

Donovan, Brian; Work, Dan (2016)

This dataset contains records of four years of taxi operations in New York City and includes 697,622,444 trips. Each trip records the pickup and drop-off dates, times, and coordinates, as well as the metered distance reported by the taximeter. The trip data also includes fields such as the taxi medallion number, fare amount, and tip amount. The dataset was obtained through a Freedom of Information Law request from the New York City Taxi and Limousine Commission. The files in this dataset are optimized for use with the ‘decompress.py’ script included in this dataset. This file has additional documentation and contact information that may be of help if you run into trouble accessing the content of the zip files.

keywords: taxi;transportation;New York City;GPS

published: 2021-03-06

RaDICaL: A Synchronized FMCW Radar, Depth, IMU and RGB Camera Data Dataset with Low-Level FMCW Radar Signals (ROS bag format)

Lim, Teck Yian; Markowitz, Spencer Abraham; Do, Minh (2021)

This dataset consists of raw ADC readings from a 3-transmitter, 4-receiver 77 GHz FMCW radar, together with synchronized RGB camera and depth (active stereo) measurements. The data is grouped into 4 distinct radar configurations:
- "indoor" configuration with range <14m
- "30m" with range <38m
- "50m" with range <63m
- "high_res" with doppler resolution of 0.043m/s
Related code: https://github.com/moodoki/radical_sdk
Hardware Project Page: https://publish.illinois.edu/radicaldata

keywords: radar; FMCW; sensor-fusion; autonomous driving; dataset; RGB-D; object detection; odometry

published: 2026-03-20

Data for Joint inversions of coded and uncoded long pulse F-region ISR returns measured at Arecibo

Wu, Yulun; Kudeki, Erhan (2026)

Arecibo ISR CLP/ULP/LULP ion-line spectra obtained from USRP receiver with 500 kHz bandwidth and 120-1400 km altitude range, experiment dates September 23-26, 2016. Used for Joint inversions of coded and uncoded long pulse F-region ISR returns measured at Arecibo.

keywords: Remote sensing; Incoherent scatter radar; Arecibo Observatory

published: 2024-02-16

Parsed Open Citations and PubMed Data

Mohasel Arjomandi, Hossein; Korobskiy, Dmitriy; Chacko, George (2024)

This dataset contains five files. (i) open_citations_jan2024_pub_ids.csv.gz, open_citations_jan2024_iid_el.csv.gz, open_citations_jan2024_el.csv.gz, and open_citation_jan2024_pubs.csv.gz represent a conversion of Open Citations to an edge list using integer ids assigned by us. The integer ids can be mapped to omids, pmids, and dois using the open_citation_jan2024_pubs.csv and open_citations_jan2024_pub_ids.csv files. The network consists of 121,052,490 nodes and 1,962,840,983 edges. Code for generating these data can be found at https://github.com/chackoge/ERNIE_Plus/tree/master/OpenCitations. (ii) The fifth file, baseline2024.csv.gz, provides information about the metadata of PubMed papers. A 2024 version of PubMed was downloaded using Entrez and parsed into a table restricted to records that contain a pmid and a doi and have a title and an abstract. A value of 1 in a column indicates that the information exists in the metadata and a zero indicates otherwise. Code for generating this data: https://github.com/illinois-or-research-analytics/pubmed_etl. If you use these data or code in your work, please cite https://doi.org/10.13012/B2IDB-5216575_V1.

keywords: PubMed

published: 2023-03-16

Data For Well-Connected Communities In Real Networks.

Park, Minhyuk; Tabatabaee, Yasamin; Warnow, Tandy; Chacko, George (2023)

Curated networks and clustering output from the manuscript: Well-Connected Communities in Real-World Networks https://arxiv.org/abs/2303.02813

keywords: Community detection; clustering; open citations; scientometrics; bibliometrics

published: 2024-06-04

Data for Well-Connectedness and Community Detection

Park, Minhyuk; Tabatabaee, Yasamin; Warnow, Tandy; Chacko, George (2024)

This dataset contains files and relevant metadata for real-world and synthetic LFR networks used in the manuscript "Well-Connectedness and Community Detection" (2024) by Park et al., presently under review at PLOS Complex Systems. The manuscript is an extended version of Park, M. et al. (2024). Identifying Well-Connected Communities in Real-World and Synthetic Networks. In Complex Networks & Their Applications XII. COMPLEX NETWORKS 2023. Studies in Computational Intelligence, vol 1142. Springer, Cham. https://doi.org/10.1007/978-3-031-53499-7_1. The "Overview of Real-World Networks" image provides high-level information about the seven real-world networks. TSVs of the seven real-world networks are provided as [network-name]_cleaned to indicate that duplicated edges and self-loops were removed, where column 1 is source and column 2 is target. LFR datasets are contained within the zipped file.
LFR datasets for the Connectivity Modifier (CM) paper
File organization: each directory `[network-name]_[resolution-value]_lfr` includes the following files:
* `network.dat`: LFR network edge-list
* `community.dat`: LFR ground-truth communities
* `time_seed.dat`: time seed used in the LFR software
* `statistics.dat`: statistics generated by the LFR software
* `cmd.stat`: command used to run the LFR software as well as time and memory usage information
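
The "_cleaned" step mentioned above (removing duplicated edges and self-loops from a two-column TSV) might look something like the sketch below. It is not the authors' pipeline. It assumes that a duplicate means an exact repeat of the same (source, target) pair, so an edge and its reverse are both kept, and the names `read_tsv` and `clean_edges` are hypothetical.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter="\t"))


def clean_edges(rows):
    """Drop self-loops and repeated (source, target) pairs, keeping first occurrences."""
    seen = set()
    cleaned = []
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        source, target = row[0], row[1]
        if source == target:
            continue  # self-loop
        if (source, target) in seen:
            continue  # duplicate edge
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned
```

If the networks were instead treated as undirected during cleaning, the pair would need to be normalized (for example, sorted) before the duplicate check.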

published: 2026-02-09

PI-3Kinase Citation Network

Park, Minhyuk; Chacko, George (2026)

This dataset consists of a directed network in edge list format where nodes correspond to articles in the scientific literature and edges represent citations. The network was constructed by seed set expansion (two rounds of citing and cited papers) of the article (seed node) reporting the discovery of PI 3-Kinase activity: "Malcolm Whitman, C Peter Downes, Marilyn Keeler, Tracy Keller, and Lewis Cantley. (1988) Type I phosphatidylinositol kinase makes a novel inositol phospholipid, phosphatidylinositol-3-phosphate. Nature, 332(6165):644–646." The edge list comprises 17,970,340 nodes and 127,255,020 edges. The dataset was obtained from the Dimensions database via a two-level expansion of the seed node (article). The first expansion included four groups of nodes: the seed node; all publications cited by the seed node; all publications citing the seed node; and all publications cited by publications citing the seed node. The second expansion included all nodes that either cited or were cited by a node in the first expansion set. Node ids used were converted from the proprietary identifiers in Dimensions using a zero-based sequence of integer_ids [0: (n-1)]. Access to the original identifiers requires a license from Digital Science.
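
As a rough illustration of the two-level expansion described above, the following sketch computes the two node sets from an in-memory edge list. It is not the authors' implementation, which queried the Dimensions database. The function name `expand` and the (citing, cited) tuple layout are assumptions.

```python
def expand(edges, seed):
    """Return (first_expansion, second_expansion) node sets around a seed article.

    edges: iterable of (citing, cited) pairs.
    """
    cites = {}      # citing node -> set of nodes it cites
    cited_by = {}   # cited node -> set of nodes citing it
    for citing, cited in edges:
        cites.setdefault(citing, set()).add(cited)
        cited_by.setdefault(cited, set()).add(citing)

    # First expansion: the seed, what it cites, what cites it,
    # and everything cited by the publications that cite the seed.
    first = {seed}
    first |= cites.get(seed, set())
    first |= cited_by.get(seed, set())
    for citer in cited_by.get(seed, set()):
        first |= cites.get(citer, set())

    # Second expansion: every node that cites, or is cited by,
    # a node in the first expansion set.
    second = set(first)
    for node in first:
        second |= cites.get(node, set())
        second |= cited_by.get(node, set())
    return first, second
```

The description does not say which edges among the resulting nodes were retained, so the sketch returns node sets only.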

published: 2024-07-29

A Citation Graph from OpenAlex (Works)

Caetano Machado Lopes, Lorran; Chacko, George (2024)

This dataset consists of a citation graph. It was constructed by downloading and parsing the Works section of the Open Alex catalog of the global research system. Open Alex (see citation below) contains detailed information about scholarly research, including articles, authors, journals, institutions, and their relationships. The data were downloaded on 2024-07-15. The dataset comprises two compressed (.xz) files.
1) filename: openalexID_integer_id_hasDOI.parquet.xz. The tabular data within contains three columns: openalex_id, integer_id, and hasDOI. Each row represents a record with the following data types:
• openalex_id: A unique identifier from the Open Alex catalog.
• integer_id: An integer representing the new identifier (assigned by the authors)
• hasDOI: An integer (0 or 1) indicating whether the record has a DOI (0 for no, 1 for yes).
2) filename: citation_table.tsv.xz. This edgelist of citations has two columns (no header) of integer values that represent citing and cited integer_id, respectively.
Summary Features
• Total Nodes (Documents): 256,997,006
• Total Edges (citations): 2,148,871,058
• Documents with DOIs: 163,495,446
• Edges between documents with DOIs: 1,936,722,541 [corrected to 2,148,788,148 edges Nov 13, 2025]
• Count of unique nodes in edgelist: 111,453,719 [updated Nov 13, 2025]
Note: Nov 13, 2025. An improved curation process will be applied to a future version of this dataset.
Note: Nov 13, 2025. The code used to generate these files can be found here: https://github.com/illinois-or-research-analytics/lorran_openalex/

keywords: citation networks; Open Alex
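
For readers who want to recompute summary features such as the edge count, the unique-node count, and the number of edges between documents with DOIs, a minimal sketch is given below. It assumes the edge list is a headerless two-column TSV compressed with xz, as described above, and that `has_doi` is a dict built separately from the integer_id and hasDOI columns. The function names are hypothetical and this is not the authors' code.

```python
import csv
import lzma


def iter_edges(path):
    """Yield (citing, cited) integer pairs from an .xz-compressed, headerless TSV."""
    with lzma.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:
                yield int(row[0]), int(row[1])


def summarize(edges, has_doi):
    """Count edges, unique nodes, and edges whose two endpoints both have a DOI.

    has_doi: dict mapping integer_id -> 0 or 1 (assumed to be loaded elsewhere).
    """
    nodes = set()
    total = 0
    doi_edges = 0
    for citing, cited in edges:
        total += 1
        nodes.update((citing, cited))
        if has_doi.get(citing) == 1 and has_doi.get(cited) == 1:
            doi_edges += 1
    return {"edges": total, "unique_nodes": len(nodes), "doi_edges": doi_edges}
```

A call such as `summarize(iter_edges("citation_table.tsv.xz"), has_doi)` streams the file once. With over a hundred million unique nodes, the in-memory set is likely to be large, so a columnar or database-backed approach may be more practical at full scale.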

published: 2024-12-11

Pretrained models for MMAudio

Cheng, Ho Kei (2024)

MMAudio pretrained models. These models can be used in the open-sourced codebase https://github.com/hkchengrex/MMAudio Note: mmaudio_large_44k_v2.pth and Readme.txt are added to this V2. Other 4 files stay the same.

published: 2026-02-19

Improving individual committor estimates and data efficiency in reaction coordinate tests with the Empirical Bayes method

Gurumoorthi, Akshay; Peters, Baron (2026)

The dataset contains a Jupyter notebook intended for anyone who wants to apply the Empirical Bayes method described in the paper titled 'Data for Improving individual committor estimates and data efficiency in reaction coordinate tests with the Empirical Bayes method' to committor data with a simple and lucid Python script.

published: 2026-02-11

Data for The MagPIE2 Dataset: Magnetic Field-Based Mapping, Localization, and SLAM

Hanley, David; Lee, Jongwon; Choi, Su Yeon; Bretl, Timothy (2026)

If you use this dataset, please cite both the dataset and the associated data paper (bibtex is below).
@ARTICLE{11386847, author={Hanley, David and Lee, Jongwon and Choi, Su Yeon and Bretl, Timothy}, journal={IEEE Transactions on Instrumentation and Measurement}, title={The MagPIE2 Dataset for Mapping, Localization, and Simultaneous Localization and Mapping Using Magnetic Fields}, year={2026}, volume={}, number={}, pages={1-1}, keywords={Magnetometers;Magnetic field measurement;Magnetic fields;Pedestrians;Location awareness;Buildings;Simultaneous localization and mapping;Measurement errors;Hardware;Calibration;Localization;mapping;SLAM;dataset;benchmark;magnetometer;magnetic field}, doi={10.1109/TIM.2026.3662919}}
We present a dataset for the evaluation of magnetic field-based robotic and pedestrian localization, mapping, and SLAM methods. This dataset contains magnetometer and inertial measurement unit data collected from inside three buildings by both a pedestrian and a ground robot. Data were collected at different heights simultaneously, both with and without changes in the placement of objects that may affect magnetometer measurements. In total, approximately 689 square meters of floor space was covered by this dataset. This dataset is archivally stored. We provide a GitHub site which is meant to serve as a forum to post issues with the dataset, share code using the dataset, and to resolve problems: https://github.com/hanley6/MagPIE2Forum Note that while the dataset is meant to be permanently stored, this forum is not meant to guarantee perennial support and its existence will be dependent on the policies of GitHub.
How is the dataset organized? The data is divided into the following parts at a high level, and more detailed information can be found in the Readme:
1. The walking portion of the dataset: CSL_WLK.zip, DCL_WLK.zip, Talbot_WLK.zip, and WLK_Misc.zip.
2. The robot portion of the dataset: Robot_Dataset.zip.
3. Motor interference tests: Motor_Interference_Test.zip.
4. Ground truth evaluation: Ground_Truth_Evaluation.zip.
5. Quick start results: Quick_Start_Results.zip.
How is data recorded and stored? Data is generally collected in the form of ROS bag files. Each ROS bag has Intel Realsense camera images, magnetometer readings, IMU readings, timestamps, and more as applicable for each file in the dataset. Each bag file has an associated metadata file written as a YAML file. This contains general information about each bag file including the start and stop time, who collected the bag file (during the pedestrian portion of the dataset), and the approximate location where data was collected. In several cases, additional comma-separated (csv) files were included either as a convenient supplement to ROS bag files (e.g., csv files of magnetometer calibration data) or because they serve as human-readable quick start results.
How does one set up and run files on the dataset? The files are stored in ROS bags and are, therefore, meant to be run using the Robot Operating System. Information regarding how to use the Robot Operating System as well as installation instructions are available at: https://ros.org/

keywords: Localization; mapping; SLAM; dataset; benchmark; magnetometer; magnetic field

published: 2025-12-23

study of liquid suction cup detachment mechanism

Aly, Abdallah; A. Saif, M. Taher (2025)

The uploaded data is part of the paper titled: Self-Modifying Percolation Governs Detachment in Soft Suction Wet Adhesion, which shows the detachment mechanism of liquid suction-based adhesion.

published: 2026-01-28

Data for Field-Effect Transistors from Artificial Charged Domain Walls in Stacked van der Waals Ferroelectric α-In2Se3

Nahid, Shahriar Muhammad; Dong, Haiyue; Nolan, Gillian; Nam, Sungwoo; Mason, Nadya; Huang, Pinshane; van der Zande, Arend (2026)

Room-temperature transfer curves; Benchmarking conductance; STEM images of charged domain walls; Temperature-dependent transfer curves; Scaling of conductance, hopping length, threshold voltage, trap density, and field-effect mobility with temperature; Magnetotransport data; Optical, AFM, and PFM image of different field-effect transistors; STEM images of contacts; Output and transfer curves of FETs; Additional STEM images of charged domain walls; Temperature scaling of subthreshold swing and threshold voltage difference; Comparison of maximum field-effect mobility for different structures

published: 2026-01-20

Dataset for "CAMUS: Scalable Phylogenetic Network Estimation"

Willson, James; Warnow, Tandy (2026)

Dataset from "CAMUS: Scalable Phylogenetic Network Estimation." This dataset contains simulated phylogenetic networks, gene trees, and sequence data.
- camus-dataset.tar.xz is the main archive containing all the simulated data. More details about the files and directories it contains can be found in README.md
- scripts.zip contains various scripts used in the simulation study.

keywords: evolution; computational biology; bioinformatics; phylogenetics

published: 2025-08-16

Data from development and evaluation of SASCA-s: Scalable Agent-based Simulator for Citation Analysis with simulation

Park, Minhyuk; Lamy, João AC; Rodrigues, Esther CC; Ferreira, Felipe Mariano; Vu-Le, The-Anh; Warnow, Tandy; Chacko, George (2025)

The data within consist of compressed output files in the form of edgelists (*.edgelist.gz) and nodelists (*.aux.parquet) from large citation network simulations using an agent-based model. The code and instructions are available at: https://github.com/illinois-or-research-analytics/SASCA. In addition, we provide a distribution of citation frequencies drawn from a random sample of PubMed journal articles (pooled_50k_pubmed_unique.csv) and a table of recencies, that is, the frequency with which citations are made to the previous year, the year before that, and so on (recency_probs_percent_stahl_filled.csv). A manuscript describing the SASCA-s simulator has been submitted for review and will be referenced in a future version of this data repository if it is accepted. The prefixes sj and er refer to the real-world and Erdos-Renyi random graphs, respectively, that were used to initiate simulations. These 'seed' networks are available from the GitHub site referenced above.

keywords: benchmark networks; agent-based models; simulation; citation

published: 2025-08-07

EC-SBM Benchmark Networks

Vu-Le, The-Anh; Chacko, George; Warnow, Tandy (2025)

Dataset generated using the technique described in "EC-SBM synthetic network generator". This contains multiple synthetic networks with ground-truth community structure, which can be used to evaluate community detection methods. Note: * networks.zip contains the synthetic networks

keywords: network science; synthetic networks; community detection; tsv

published: 2017-02-28

Smartphone recorded driving sensor data: Leesburg, VA to Indianapolis, IN

Freedman, Ryan (2017)

Leesburg, VA to Indianapolis, Indiana:
Sampling Rate: 0.1 Hz
Total Travel Time: 31100007 ms or 518 minutes or 8.6 hours
Distance Traveled: 570 miles via I-70
Number of Data Points: 3112
Device used: Samsung Galaxy S4
Date Recorded: 2017-01-15
Parameters Recorded:
* ACCELEROMETER X (m/s²)
* ACCELEROMETER Y (m/s²)
* ACCELEROMETER Z (m/s²)
* GRAVITY X (m/s²)
* GRAVITY Y (m/s²)
* GRAVITY Z (m/s²)
* LINEAR ACCELERATION X (m/s²)
* LINEAR ACCELERATION Y (m/s²)
* LINEAR ACCELERATION Z (m/s²)
* GYROSCOPE X (rad/s)
* GYROSCOPE Y (rad/s)
* GYROSCOPE Z (rad/s)
* LIGHT (lux)
* MAGNETIC FIELD X (microT)
* MAGNETIC FIELD Y (microT)
* MAGNETIC FIELD Z (microT)
* ORIENTATION Z (azimuth °)
* ORIENTATION X (pitch °)
* ORIENTATION Y (roll °)
* PROXIMITY (i)
* ATMOSPHERIC PRESSURE (hPa)
* Relative Humidity (%)
* Temperature (F)
* SOUND LEVEL (dB)
* LOCATION Latitude
* LOCATION Longitude
* LOCATION Altitude (m)
* LOCATION Altitude-google (m)
* LOCATION Altitude-atmospheric pressure (m)
* LOCATION Speed (kph)
* LOCATION Accuracy (m)
* LOCATION ORIENTATION (°)
* Satellites in range
* GPS NMEA
* Time since start in ms
* Current time in YYYY-MO-DD HH-MI-SS_SSS format
Quality Notes: There are some things to note about the quality of this data set that you may want to consider while doing preprocessing. This dataset was taken continuously but had multiple stops to refuel (without the data recording ceasing). This can be removed by parsing out all data that has a speed of 0. The mount for this dataset was fairly stable (as can be seen by the consistent orientation angle throughout the dataset). It was mounted tightly between two seats in the back of the vehicle. Unfortunately, the frequency for this dataset was set fairly low at one per ten seconds.

keywords: smartphone; sensor; driving; accelerometer; gyroscope; magnetometer; gps; nmea; barometer; satellite; temperature; humidity
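
The quality note above suggests removing the refueling stops by filtering out samples with a speed of 0. A minimal sketch of that preprocessing step is shown below. It assumes the data are in a CSV file whose speed column header is exactly "LOCATION Speed (kph)", which is taken from the parameter list and may not match the actual file. The function name `drop_stationary` is hypothetical.

```python
import csv
import io

# Assumed column header; check it against the actual file before use.
SPEED_COLUMN = "LOCATION Speed (kph)"


def drop_stationary(rows, speed_column=SPEED_COLUMN):
    """Keep only rows whose recorded speed is greater than zero.

    rows: iterable of dicts, e.g. from csv.DictReader.
    Rows with a missing or non-numeric speed are also dropped (a design choice).
    """
    moving = []
    for row in rows:
        try:
            speed = float(row[speed_column])
        except (KeyError, TypeError, ValueError):
            continue
        if speed > 0:
            moving.append(row)
    return moving


# Usage with a small in-memory sample standing in for the real file.
sample = io.StringIO(
    "Time since start in ms,LOCATION Speed (kph)\n"
    "0,0\n"
    "10000,52.5\n"
    "20000,\n"
    "30000,0.0\n"
    "40000,98.1\n"
)
moving_rows = drop_stationary(csv.DictReader(sample))
```

Dropping rows with an unreadable speed is conservative. Depending on the analysis, it may be preferable to keep them and interpolate instead.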

published: 2017-05-01

Smartphone recorded driving sensor data: Indianapolis International Airport to Urbana, IL

Freedman, Ryan (2017)

Indianapolis Int'l Airport to Urbana:
Sampling Rate: 2 Hz
Total Travel Time: 5901534 ms or 98.4 minutes
Number of Data Points: 11805
Distance Traveled: 124 miles via I-74
Device used: Samsung Galaxy S6
Date Recorded: 2016-11-27
Parameters Recorded:
* ACCELEROMETER X (m/s²)
* ACCELEROMETER Y (m/s²)
* ACCELEROMETER Z (m/s²)
* GRAVITY X (m/s²)
* GRAVITY Y (m/s²)
* GRAVITY Z (m/s²)
* LINEAR ACCELERATION X (m/s²)
* LINEAR ACCELERATION Y (m/s²)
* LINEAR ACCELERATION Z (m/s²)
* GYROSCOPE X (rad/s)
* GYROSCOPE Y (rad/s)
* GYROSCOPE Z (rad/s)
* LIGHT (lux)
* MAGNETIC FIELD X (microT)
* MAGNETIC FIELD Y (microT)
* MAGNETIC FIELD Z (microT)
* ORIENTATION Z (azimuth °)
* ORIENTATION X (pitch °)
* ORIENTATION Y (roll °)
* PROXIMITY (i)
* ATMOSPHERIC PRESSURE (hPa)
* SOUND LEVEL (dB)
* LOCATION Latitude
* LOCATION Longitude
* LOCATION Altitude (m)
* LOCATION Altitude-google (m)
* LOCATION Altitude-atmospheric pressure (m)
* LOCATION Speed (kph)
* LOCATION Accuracy (m)
* LOCATION ORIENTATION (°)
* Satellites in range
* GPS NMEA
* Time since start in ms
* Current time in YYYY-MO-DD HH-MI-SS_SSS format
Quality Notes: There are some things to note about the quality of this data set that you may want to consider while doing preprocessing. This dataset was taken continuously as a single trip; no stop was made for gas along the way, making this a very long continuous dataset. It starts in the parking lot of the Indianapolis International Airport and continues directly towards a gas station on Lincoln Avenue in Urbana, IL. There are a couple of parts of the trip where the phone's orientation had to be changed because my navigation cut out. These times are easy to account for based on Orientation X/Y/Z change. I would also advise cutting out the first couple hundred points or the points leading up to highway speed. The phone was mounted in the cupholder in the front seat of the car.

keywords: smartphone; sensor; driving; accelerometer; gyroscope; magnetometer; gps; nmea; barometer; satellite

published: 2025-12-29

Data for Mitigation of ion-temperature/composition ambiguity in the inversion of F-region ion-line spectra measured at Arecibo using coded long pulses

Wu, Yulun; Kudeki, Erhan (2025)

Arecibo ISR CLP ion-line spectra obtained from RI receiver with 500 kHz bandwidth and 120-640 km altitude range, experiment dates September 23-26, 2016. Used for Mitigation of ion-temperature/composition ambiguity in the inversion of F-region ion-line spectra measured at Arecibo using coded long pulses.

keywords: Remote sensing; Incoherent scatter radar; Arecibo Observatory

published: 2024-11-13

Nanoscale Stacking Fault Engineering and Mapping in Spinel Oxides for Reversible Multivalent Ion Insertion

Tang, Zhichu; Chen, Wenxiang; Yin, Kaijun; Busch, Robert; Hou, Hanyu; Lin, Oliver; Lyu, Zhiheng; Zhang, Cheng; Yang, Hong; Zuo, Jian-Min ; Chen, Qian (2024)

These datasets are for the four-dimensional scanning transmission electron microscopy (4D-STEM) and electron energy loss spectroscopy (EELS) experiments for cathode nanoparticles at different states. The raw 4D-STEM experiment datasets were collected by TEM image & analysis software (FEI) and were saved as SER files. The raw 4D-STEM datasets of SER files can be opened and viewed in MATLAB using our analysis software package of imToolBox available at https://github.com/flysteven/imToolBox. The raw EELS datasets were collected by DigitalMicrograph software and were saved as DM4 files. The raw EELS datasets can be opened and viewed in DigitalMicrograph software or using our analysis codes available at https://github.com/chenlabUIUC/OrientedPhaseDomain. All the datasets are from the work "Nanoscale Stacking Fault Engineering and Mapping in Spinel Oxides for Reversible Multivalent Ion Insertion" (2024).
The 4D-STEM experiment data include four example datasets for cathode nanoparticles collected at pristine and discharged states. Each dataset contains a stack of diffraction patterns collected at different probe positions scanned across the cathode nanoparticle.
1. Pristine untreated nanoparticle: "Pristine U-NP.ser"
2. Pristine 200ºC heated nanoparticle: "Pristine H200-NP.ser"
3. Untreated nanoparticle after first discharge in Zn-ion batteries: "Discharged U-NP.ser"
4. 200ºC heated nanoparticle after first discharge in Zn-ion batteries: "Discharged H200-NP.ser"
The EELS experiment data include six example datasets for cathode nanoparticles collected at different states (in "EELS datasets.zip") as described below. Each EELS dataset contains the zero-loss and core-loss EELS spectra collected at different probe positions scanned across the cathode nanoparticle.
1. Pristine untreated nanoparticle: "Pristine U-NP EELS.zip"
2. Pristine 200ºC heated nanoparticle: "Prisitne H200-NP EELS.zip"
3. Untreated nanoparticle after first discharge in Zn-ion batteries: "Discharged U-NP EELS.zip"
4. Untreated nanoparticle after first charge in Zn-ion batteries: "Charged U-NP EELS.zip"
5. 200ºC heated nanoparticle after first discharge in Zn-ion batteries: "Discharged H200-NP EELS.zip"
6. 200ºC heated nanoparticle after first charge in Zn-ion batteries: "Charged H200-NP EELS.zip"
The details of the software package and codes that can be used to analyze the 4D-STEM datasets and EELS datasets are available at: https://github.com/chenlabUIUC/OrientedPhaseDomain. Once our paper is formally published, we will update the relationship of these datasets with our paper.

keywords: 4D-STEM; EELS; defects; strain; cathode; nanoparticle; energy storage

published: 2022-04-29

Biological and Simulated datasets for testing the SCAMPP framework for phylogenetic placement methods

Wedell, Eleanor; Warnow, Tandy (2022)

Thank you for using these datasets! These files contain trees and reference alignments, as well as the selected query sequences for testing phylogenetic placement methods against and within the SCAMPP framework. There are four datasets from three different sources, each containing their source alignment and "true" tree, any estimated trees that may have been generated, and any re-estimated branch lengths that were created to be used with their requisite phylogenetic placement method. Three biological datasets (16S.B.ALL, PEWO/LTP_s128_SSU, and PEWO/green85) and one simulated dataset (nt78) are included. See README.txt in each file for more information.

keywords: Phylogenetic Placement; Phylogenetics; Maximum Likelihood; pplacer; EPA-ng

published: 2022-11-11

Data for Chemical Short-Range Ordering in a CrCoNi Medium-Entropy Alloy

Hsiao, Haw-Wen; Zuo, Jian-Min (2022)

This dataset is for characterizing chemical short-range-ordering in CrCoNi medium entropy alloys. It has three sub-folders: 1. code, 2. sample WQ, 3. sample HT. The software needed to run the files is Gatan Microscopy Suite® (GMS). Please follow the instructions on this page to install the DM3 GMS: https://www.gatan.com/installation-instructions#Step1
1. Code folder contains three DM scripts to be installed in Gatan DigitalMicrograph software to analyze the scanning electron nanobeam diffraction (SEND) dataset:
Cepstrum.s: needs [EF-SEND_sampleWQ_cropped_aligned.dm3] in Sample WQ and the average image from [EF-SEND_sampleWQ_cropped_aligned.dm3]. Same for Sample HT folder.
log_BraggRemoval.s: same as above.
Patterson.s: needs refined diffuse patterns in Sample HT folder.
2. Sample WQ and 3. Sample HT folders both contain the SEND data (.ser) and the binned SEND data (.dm3) as well as our calculated strain maps as the strain measurement reference. The Sample WQ folder additionally has atomic resolution STEM images; the Sample HT folder additionally has three refined diffuse patterns as references for diffraction data processing.
* Only the .ser file is needed to perform the strain measurement using imToolBox as listed in the manuscript. The .emi file contains the metadata of the microscope, which can be opened together with the .ser file using FEI TIA software.

keywords: Medium entropy alloy; CrCoNi; chemical short-range-ordering; CSRO; TEM