Datasets (18)

published: 2018-11-20
 
A dataset of acoustic impulse responses for microphones worn on the body. Microphones were placed at 80 positions on the body of a human subject and a plastic mannequin. The impulse responses can be used to study the acoustic effects of the body and can be convolved with sound sources to simulate wearable audio devices and microphone arrays. The dataset also includes measurements with different articles of clothing covering some of the microphones and with microphones placed on different hats and accessories. The measurements were performed from 24 angles of arrival in an acoustically treated laboratory. All impulse responses are sampled at 48 kHz and truncated to 500 ms. The impulse response data is provided in WAVE audio and MATLAB data file formats. The microphone locations are provided in tab-separated-value files for each experiment and are also depicted graphically in the documentation. The file wearable_mic_dataset_full.zip contains both WAVE- and MATLAB-format impulse responses. The file wearable_mic_dataset_matlab.zip contains only MATLAB-format impulse responses. The file wearable_mic_dataset_wave.zip contains only WAVE-format impulse responses.
keywords: Acoustic impulse responses; microphone arrays; wearables; hearing aids; audio source separation
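As a rough illustration of the convolution use case described above, the sketch below convolves a source signal with a single impulse response. It assumes the response has already been loaded into a NumPy array; the synthetic response (a delayed, attenuated impulse) and the function name `simulate_wearable_mic` are stand-ins invented for this example and are not part of the dataset.

```python
import numpy as np

FS = 48_000        # sampling rate stated in the dataset description (48 kHz)
IR_LEN = FS // 2   # 500 ms at 48 kHz is 24000 samples


def simulate_wearable_mic(source, impulse_response):
    """Convolve a dry source signal with one impulse response."""
    source = np.asarray(source, dtype=np.float64)
    impulse_response = np.asarray(impulse_response, dtype=np.float64)
    # Full linear convolution: output length is len(source) + len(ir) - 1.
    return np.convolve(source, impulse_response, mode="full")


# Synthetic stand-in for a measured response (assumption, not real data):
# a single impulse delayed by 48 samples (1 ms) and attenuated by half.
ir = np.zeros(IR_LEN)
ir[48] = 0.5

# 100 ms of white noise as a hypothetical dry source.
source = np.random.default_rng(0).standard_normal(FS // 10)

mic = simulate_wearable_mic(source, ir)
print(mic.shape)
```

With a measured response in place of `ir`, the same call would approximate what the corresponding body-worn microphone records for that source direction.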
published: 2018-11-18
 
This dataset contains experimental measurements used in the paper "Ultra-sensitivity of Numerical Landscape Evolution Models to their Initial Conditions" (to be submitted). The data are taken from experimental runs in a miniature landscape model, the eXperimental Landscape Evolution (XLE) facility. In this facility, we completed five runs, each longer than 24 hours, at 5-minute temporal resolution. Every five minutes, a planform image was captured and a digital elevation model (DEM) was generated. For each run, the images and a corresponding animation of the images are documented. In addition, ASCII-formatted DEMs and color hillshade maps were generated, and the hillshade map images were also made into an animation. At the time of uploading this dataset, the paper has not been published. A link to the paper will be added at the time of publication.
keywords: landscape evolution model; digital elevation model; geomorphology
published: 2018-10-03
 
This dataset is the result of three crawls of the web performed in May 2018. The data contains raw crawl data and instrumentation captured by OpenWPM-Mobile, as well as analysis that identifies which scripts access mobile sensors and which ones perform some form of browser fingerprinting, along with a clustering of scripts based on their intended use. The dataset is described in the included README.md file; more details about the methodology can be found in our ACM CCS'18 paper: Anupam Das, Gunes Acar, Nikita Borisov, Amogh Pradeep. The Web's Sixth Sense: A Study of Scripts Accessing Smartphone Sensors. In Proceedings of the 25th ACM Conference on Computer and Communications Security (CCS), Toronto, Canada, October 15–19, 2018. (Forthcoming)
keywords: mobile sensors; web crawls; browser fingerprinting; javascript
published: 2018-06-06
 
DNDC scripts and outputs that were generated as a part of the research publication 'Evaluation of DeNitrification DeComposition Model for Estimating Ammonia Fluxes from Chemical Fertilizer Application'.
keywords: DNDC; REA; ammonia emissions; fertilizers; uncertainty analysis
published: 2018-06-05
 
A complete building coverage area dataset (i.e., the area occupied by building structures, excluding other built surfaces such as roads, parking lots, and public parks) at the level of census block groups for the contiguous United States (CONUS). The dataset was assembled based on an ensemble prediction of nonlinear hierarchical models to account for spatial heterogeneities in the distribution of built surfaces across different urban communities. Percentage of impervious land and housing density were used as predictors of building area, and cross-validation results showed that the product estimated the area covered by buildings with a mean error of 0.049%.
keywords: Building Coverage Area; Urban Geography; Regional; Sustainability; US Census Block Groups; CONUS Data
published: 2017-12-01
 
This dataset contains all the numerical results (digital elevation models) that are presented in the paper "Landscape evolution models using the stream power incision model show unrealistic behavior when m/n equals 0.5." The paper can be found at: http://www.earth-surf-dynam-discuss.net/esurf-2017-15/ The paper has been accepted, but the most up to date version may not be available at the link above. If so, please contact Jeffrey Kwang at jeffskwang@gmail.com to obtain the most up to date manuscript.
keywords: landscape evolution models; digital elevation model
published: 2017-12-20
 
The dataset contains processed model fields used to generate data, figures and tables in the Journal of Geophysical Research article "Investigating the linear dependence of direct and indirect radiative forcing on emission of carbonaceous aerosols in a global climate model." The processed data are monthly averaged cloud properties (CCN, CDNC and LWP) and forcing variables (DRF and IRF) at original CAM5 spatial resolution (1.9° by 2.5°). Raw model output fields from CAM5 simulations are available through NERSC upon request. Please find more detailed information in the ReadMe file.
keywords: carbonaceous aerosols; radiative forcing; emission; linearity
published: 2016-06-23
 
This dataset contains hourly traffic estimates (speeds) for individual links of the New York City road network for the years 2010-2013, estimated from New York City Taxis.
keywords: traffic estimates; traffic conditions; New York City
published: 2017-11-14
 
If you use this dataset, please cite the IJRR data paper (BibTeX is below).

We present a dataset collected from a canoe along the Sangamon River in Illinois. The canoe was equipped with a stereo camera, an IMU, and a GPS device, which provide visual data suitable for stereo or monocular applications, inertial measurements, and position data for ground truth. We recorded a canoe trip up and down the river for 44 minutes covering 2.7 km round trip. The dataset adds to those previously recorded in unstructured environments and is unique in that it is recorded on a river, which provides its own set of challenges and constraints that are described in this paper. The data is divided into subsets, which can be downloaded individually. Video previews are available on YouTube: https://www.youtube.com/channel/UCOU9e7xxqmL_s4QX6jsGZSw

The information below can also be found in the README files provided in the 527 dataset and each of its subsets. The purpose of this document is to assist researchers in using this dataset.

Images
======

Raw
---
The raw images are stored in the cam0 and cam1 directories in bmp format. They are bayered images that need to be debayered and undistorted before they are used. The camera parameters for these images can be found in camchain-imucam.yaml. Note that the camera intrinsics describe a 1600x1200 resolution image, so the focal length and center pixel coordinates must be scaled by 0.5 before they are used. The distortion coefficients remain the same even for the scaled images. The camera to IMU transformation matrix is also in this file. cam0/ refers to the left camera, and cam1/ refers to the right camera.

Rectified
---------
Stereo rectified, undistorted, row-aligned, debayered images are stored in the rectified/ directory in the same way as the raw images except that they are in png format. The params.yaml file contains the projection and rotation matrices necessary to use these images. Unlike the parameters for the raw images, these parameters do not need to be scaled.

params.yml
----------
The stereo rectification parameters. R0, R1, P0, P1, and Q correspond to the outputs of the OpenCV stereoRectify function except that 1s and 2s are replaced by 0s and 1s, respectively.
R0: The rectifying rotation matrix of the left camera.
R1: The rectifying rotation matrix of the right camera.
P0: The projection matrix of the left camera.
P1: The projection matrix of the right camera.
Q: Disparity to depth mapping matrix.
T_cam_imu: Transformation matrix for a point in the IMU frame to the left camera frame.

camchain-imucam.yaml
--------------------
The camera intrinsic and extrinsic parameters and the camera to IMU transformation usable with the raw images.
T_cam_imu: Transformation matrix for a point in the IMU frame to the camera frame.
distortion_coeffs: lens distortion coefficients using the radial tangential model.
intrinsics: focal length x, focal length y, principal point x, principal point y
resolution: resolution of calibration. Scale the intrinsics for use with the raw 800x600 images. The distortion coefficients do not change when the image is scaled.
T_cn_cnm1: Transformation matrix from the right camera to the left camera.

Sensors
-------
Here, each message in name.csv is described.

###rawimus###
time # GPS time in seconds
message name # rawimus
acceleration_z # m/s^2 IMU uses right-forward-up coordinates
-acceleration_y # m/s^2
acceleration_x # m/s^2
angular_rate_z # rad/s IMU uses right-forward-up coordinates
-angular_rate_y # rad/s
angular_rate_x # rad/s

###IMG###
time # GPS time in seconds
message name # IMG
left image filename
right image filename

###inspvas###
time # GPS time in seconds
message name # inspvas
latitude
longitude
altitude # ellipsoidal height WGS84 in meters
north velocity # m/s
east velocity # m/s
up velocity # m/s
roll # right hand rotation about y axis in degrees
pitch # right hand rotation about x axis in degrees
azimuth # left hand rotation about z axis in degrees clockwise from north

###inscovs###
time # GPS time in seconds
message name # inscovs
position covariance # 9 values xx,xy,xz,yx,yy,yz,zx,zy,zz m^2
attitude covariance # 9 values xx,xy,xz,yx,yy,yz,zx,zy,zz deg^2
velocity covariance # 9 values xx,xy,xz,yx,yy,yz,zx,zy,zz (m/s)^2

###bestutm###
time # GPS time in seconds
message name # bestutm
utm zone # numerical zone
utm character # alphabetical zone
northing # m
easting # m
height # m above mean sea level

Camera logs
-----------
The files name.cam0 and name.cam1 are text files that correspond to cameras 0 and 1, respectively. The columns are defined by:
unused: The first column is all 1s and can be ignored.
software frame number: This number increments at the end of every iteration of the software loop.
camera frame number: This number is generated by the camera and increments each time the shutter is triggered. The software and camera frame numbers do not have to start at the same value, but if the difference between the initial and final values is not the same, it suggests that frames may have been dropped.
camera timestamp: This is the camera's internal timestamp of the frame capture in units of 100 milliseconds.
PC timestamp: This is the PC time of arrival of the image.

name.kml
--------
The kml file is a mapping file that can be read by software such as Google Earth. It contains the recorded GPS trajectory.

name.unicsv
-----------
This is a csv file of the GPS trajectory in UTM coordinates that can be read by gpsbabel, software for manipulating GPS paths.

@article{doi:10.1177/0278364917751842,
author = {Martin Miller and Soon-Jo Chung and Seth Hutchinson},
title = {The Visual–Inertial Canoe Dataset},
journal = {The International Journal of Robotics Research},
volume = {37},
number = {1},
pages = {13-20},
year = {2018},
doi = {10.1177/0278364917751842},
URL = {https://doi.org/10.1177/0278364917751842},
eprint = {https://doi.org/10.1177/0278364917751842}
}
keywords: slam;sangamon;river;illinois;canoe;gps;imu;stereo;monocular;vision;inertial
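Two of the README instructions above lend themselves to a short sketch: scaling the calibrated intrinsics by 0.5 for the raw images, and comparing software and camera frame counters to flag possible dropped frames. The function names and the numeric values below are hypothetical and chosen only for illustration; the real intrinsics come from camchain-imucam.yaml and the real counters from the name.cam0 and name.cam1 logs.

```python
def scale_intrinsics(intrinsics, scale=0.5):
    """Scale [fx, fy, cx, cy]; distortion coefficients are left untouched."""
    fx, fy, cx, cy = intrinsics
    return [fx * scale, fy * scale, cx * scale, cy * scale]


def frames_possibly_dropped(software_frames, camera_frames):
    """Compare the first-to-last span of the two frame counters.

    The counters may start at different values, so only the spans are
    compared. A mismatch suggests that frames may have been dropped.
    """
    software_span = software_frames[-1] - software_frames[0]
    camera_span = camera_frames[-1] - camera_frames[0]
    return software_span != camera_span


# Made-up intrinsics for a 1600x1200 calibration (not the dataset's values).
calibrated = [1000.0, 1000.0, 800.0, 600.0]
print(scale_intrinsics(calibrated))  # [500.0, 500.0, 400.0, 300.0]

# Made-up counters: the camera counter skips one value in the second case.
print(frames_possibly_dropped([10, 11, 12], [100, 101, 102]))  # False
print(frames_possibly_dropped([10, 11, 12], [100, 101, 103]))  # True
```

The span comparison only signals that something may be missing; locating the gap would require comparing consecutive counter values.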
published: 2017-10-10
 
This dataset contains ground motion data for Newmark Structural Engineering Laboratory (NSEL) Report Series 048, "Modification of ground motions for use in Central North America: Southern Illinois surface ground motions for structural analysis". The data are 20 individual ground motion time history records developed at each of the 10 sites (for a total of 200 ground motions). These accompanying ground motions are developed following the detailed procedure presented in Kozak et al. [2017].
keywords: earthquake engineering; ground motion records; southern Illinois seismic hazard; dynamic structural analysis; conditional mean spectrum
published: 2017-07-29
 
This dataset contains the PartMC-MOSAIC simulations used in the article “Plume-exit modeling to determine cloud condensation nuclei activity of aerosols from residential biofuel combustion”. The data is organized as a set of folders, each folder representing a different scenario modeled. Each folder contains a series of NetCDF files, which are the output of the PartMC-MOSAIC simulation. They contain information on particle and gas properties, both of the biofuel burning plume and background. Input files for PartMC-MOSAIC are also included. This dataset was used during the open review process at Atmospheric Chemistry and Physics (ACP) and supports both the discussion paper and final article.
keywords: CCN; cloud condensation nuclei; activation; supersaturation; biofuel
published: 2017-05-01
 
Indianapolis Int'l Airport to Urbana:

Sampling Rate: 2 Hz
Total Travel Time: 5901534 ms or 98.4 minutes
Number of Data Points: 11805
Distance Traveled: 124 miles via I-74
Device used: Samsung Galaxy S6
Date Recorded: 2016-11-27

Parameters Recorded:
* ACCELEROMETER X (m/s²)
* ACCELEROMETER Y (m/s²)
* ACCELEROMETER Z (m/s²)
* GRAVITY X (m/s²)
* GRAVITY Y (m/s²)
* GRAVITY Z (m/s²)
* LINEAR ACCELERATION X (m/s²)
* LINEAR ACCELERATION Y (m/s²)
* LINEAR ACCELERATION Z (m/s²)
* GYROSCOPE X (rad/s)
* GYROSCOPE Y (rad/s)
* GYROSCOPE Z (rad/s)
* LIGHT (lux)
* MAGNETIC FIELD X (microT)
* MAGNETIC FIELD Y (microT)
* MAGNETIC FIELD Z (microT)
* ORIENTATION Z (azimuth °)
* ORIENTATION X (pitch °)
* ORIENTATION Y (roll °)
* PROXIMITY (i)
* ATMOSPHERIC PRESSURE (hPa)
* SOUND LEVEL (dB)
* LOCATION Latitude
* LOCATION Longitude
* LOCATION Altitude (m)
* LOCATION Altitude-google (m)
* LOCATION Altitude-atmospheric pressure (m)
* LOCATION Speed (kph)
* LOCATION Accuracy (m)
* LOCATION ORIENTATION (°)
* Satellites in range
* GPS NMEA
* Time since start in ms
* Current time in YYYY-MO-DD HH-MI-SS_SSS format

Quality Notes: There are some things to note about the quality of this dataset that you may want to consider during preprocessing. The data was recorded continuously as a single trip; no stop was made for gas along the way, making this a very long continuous dataset. It starts in the parking lot of the Indianapolis International Airport and continues directly to a gas station on Lincoln Avenue in Urbana, IL. There are a couple of parts of the trip where the phone's orientation had to be changed because my navigation cut out. These times are easy to account for based on the change in Orientation X/Y/Z. I would also advise cutting out the first couple hundred points, or the points leading up to highway speed. The phone was mounted in the cupholder in the front seat of the car.
keywords: smartphone; sensor; driving; accelerometer; gyroscope; magnetometer; gps; nmea; barometer; satellite
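The advice in the quality notes to cut the points leading up to highway speed could be sketched as follows. This assumes the recording has been parsed into a list of dictionaries keyed by the parameter names listed above (for example with Python's `csv.DictReader`, if the file is a CSV, which the description does not state). The function name and the 90 kph threshold are hypothetical choices, not values given by the dataset author.

```python
def trim_until_speed(rows, speed_key="LOCATION Speed (kph)", threshold_kph=90.0):
    """Drop every row before the first one at or above threshold_kph."""
    for index, row in enumerate(rows):
        if float(row[speed_key]) >= threshold_kph:
            return rows[index:]
    # The threshold was never reached, so nothing qualifies.
    return []


# Made-up rows standing in for parsed samples (2 Hz, so 500 ms apart).
rows = [
    {"Time since start in ms": "0", "LOCATION Speed (kph)": "0.0"},
    {"Time since start in ms": "500", "LOCATION Speed (kph)": "35.2"},
    {"Time since start in ms": "1000", "LOCATION Speed (kph)": "92.4"},
    {"Time since start in ms": "1500", "LOCATION Speed (kph)": "88.0"},
]

trimmed = trim_until_speed(rows)
print(len(trimmed))  # 2
```

Rows after the first qualifying sample are kept even if the speed later dips below the threshold, so ordinary slowdowns on the highway are not discarded.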
published: 2017-02-28
 
Leesburg, VA to Indianapolis, Indiana:

Sampling Rate: 0.1 Hz
Total Travel Time: 31100007 ms or 518 minutes or 8.6 hours
Distance Traveled: 570 miles via I-70
Number of Data Points: 3112
Device used: Samsung Galaxy S4
Date Recorded: 2017-01-15

Parameters Recorded:
* ACCELEROMETER X (m/s²)
* ACCELEROMETER Y (m/s²)
* ACCELEROMETER Z (m/s²)
* GRAVITY X (m/s²)
* GRAVITY Y (m/s²)
* GRAVITY Z (m/s²)
* LINEAR ACCELERATION X (m/s²)
* LINEAR ACCELERATION Y (m/s²)
* LINEAR ACCELERATION Z (m/s²)
* GYROSCOPE X (rad/s)
* GYROSCOPE Y (rad/s)
* GYROSCOPE Z (rad/s)
* LIGHT (lux)
* MAGNETIC FIELD X (microT)
* MAGNETIC FIELD Y (microT)
* MAGNETIC FIELD Z (microT)
* ORIENTATION Z (azimuth °)
* ORIENTATION X (pitch °)
* ORIENTATION Y (roll °)
* PROXIMITY (i)
* ATMOSPHERIC PRESSURE (hPa)
* Relative Humidity (%)
* Temperature (F)
* SOUND LEVEL (dB)
* LOCATION Latitude
* LOCATION Longitude
* LOCATION Altitude (m)
* LOCATION Altitude-google (m)
* LOCATION Altitude-atmospheric pressure (m)
* LOCATION Speed (kph)
* LOCATION Accuracy (m)
* LOCATION ORIENTATION (°)
* Satellites in range
* GPS NMEA
* Time since start in ms
* Current time in YYYY-MO-DD HH-MI-SS_SSS format

Quality Notes: There are some things to note about the quality of this dataset that you may want to consider during preprocessing. The data was recorded continuously but includes multiple stops to refuel (the recording did not cease during the stops). These stops can be removed by filtering out all data with a speed of 0. The mount for this dataset was fairly stable (as can be seen from the consistent orientation angle throughout the dataset). The phone was mounted tightly between two seats in the back of the vehicle. Unfortunately, the sampling frequency for this dataset was set fairly low, at one sample per ten seconds.
keywords: smartphone; sensor; driving; accelerometer; gyroscope; magnetometer; gps; nmea; barometer; satellite; temperature; humidity
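Removing the refueling stops as the quality notes suggest amounts to filtering out samples whose speed is 0. The sketch below assumes the recording has been parsed into a list of dictionaries keyed by the parameter names listed above; the function name and the sample rows are made up for illustration.

```python
def drop_stationary(rows, speed_key="LOCATION Speed (kph)"):
    """Keep only the rows whose recorded speed is non-zero."""
    return [row for row in rows if float(row[speed_key]) != 0.0]


# Made-up rows standing in for parsed samples (0.1 Hz, so 10 s apart).
rows = [
    {"Time since start in ms": "0", "LOCATION Speed (kph)": "104.5"},
    {"Time since start in ms": "10000", "LOCATION Speed (kph)": "0"},
    {"Time since start in ms": "20000", "LOCATION Speed (kph)": "0.0"},
    {"Time since start in ms": "30000", "LOCATION Speed (kph)": "61.0"},
]

moving = drop_stationary(rows)
print([row["Time since start in ms"] for row in moving])  # ['0', '30000']
```

After filtering, the timestamps are no longer evenly spaced, which matters for any analysis that assumes a constant 10-second sample interval.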
published: 2016-12-20
 
Scripts and example data for AIDData (aiddata.org) processing in support of the forthcoming Nakamura dissertation. This dataset includes two sets of scripts and example data files from an aiddata.org data dump. Fuller documentation of the functionality of these scripts is in the readme file. Additional background information and a description of usage will be in the forthcoming Nakamura dissertation (a link will be added when available). Data originally supplied by Nakamura. Python code and the readme file created by Wickes. Data included within this deposit are examples to demonstrate execution. There are two Python scripts: keyword_search.py, designed to assist in finding records matching specific keywords, and matching_tool.ipynb, designed to assist in detecting which records are and are not contained within a keyword results file and an aiddata project data file.
keywords: aiddata; natural resources
published: 2016-05-19
 
This dataset contains records of four years of taxi operations in New York City and includes 697,622,444 trips. Each trip records the pickup and drop-off dates, times, and coordinates, as well as the metered distance reported by the taximeter. The trip data also includes fields such as the taxi medallion number, fare amount, and tip amount. The dataset was obtained through a Freedom of Information Law request from the New York City Taxi and Limousine Commission. The files in this dataset are optimized for use with the ‘decompress.py’ script included in this dataset. This file has additional documentation and contact information that may be of help if you run into trouble accessing the content of the zip files.
keywords: taxi;transportation;New York City;GPS