Datasets (49)

published: 2017-06-16
 
Table S1. Pollen types identified in the BCI and PNSL pollen rain data sets. Pollen types were identified to species when possible and assigned a life form based on descriptions provided in Croat, T.B. (1978). Taxa from BCI and PNSL were assigned a 1 if present in forest census data or a 0 if absent. The relative representation of each taxon has been provided for each extended record and by dry and wet season representation respectively. CA loadings are provided for axes 1 and 2 (Fig. 1).
keywords: pollen; identifications; abundance; data; BCI; PNSL; Panama
published: 2017-06-16
 
Table S3. Mean slope response for each predictive model used in the ecoinformatic analysis. Mean responses are provided for each seasonal and annual pollen data set analyzed from BCI and PNSL and are summarized by life form. Calculated p-values are provided for each model.
keywords: pollen; response; climate; ecoinformatics; BCI; PNSL; Panama
published: 2017-06-16
 
Table S2. Raw pollen counts and climatic data for each seasonal sampling period. Climatic data reflect the average daily conditions observed over the period during which samples were collected (°C/day, mm/day, MJ/m²/day). Lycopodium counts and counts for each pollen taxon reflect the aggregated pollen sum from four sampling heights.
keywords: pollen; count; climate; data; BCI; PNSL; Panama
published: 2017-06-15
 
Datasets used in the study, "Optimal completion of incomplete gene trees in polynomial time using OCTAL," presented at WABI 2017.
keywords: phylogenomics; missing data; coalescent-based species tree estimation; gene trees
published: 2017-05-31
 
Dataset includes maternal antigen treatment and early-life antigen treatment for male zebra finches. Also includes data on beak coloration, measures of song complexity for each male, and female responses to treated males.
Male beak color and song metadata:
* MATID = Maternal identity
* MATTRT = Maternal antigen treatment prior to egg laying (KLH = keyhole limpet hemocyanin, LPS = lipopolysaccharide, PBS = phosphate buffered saline)
* YGTRT = Young antigen treatment post-hatch (KLH = keyhole limpet hemocyanin, LPS = lipopolysaccharide, PBS = phosphate buffered saline)
* NESTBANDNUM = Nestling band number
* Haptoglobin = Haptoglobin levels at day 28 (mg/ml)
* Mean TE = Mean number of total elements in that male's song
* TE (z) = z-transformed total elements
* Mean UE = Mean number of unique elements in the song
* UE (z) = z-transformed unique elements
* mean phrases = Mean number of song phrases
* Phrases (z) = z-transformed song phrases
* Mean D = Mean song duration in seconds
* D (z) = z-transformed song duration
* B2 standard = Beak brightness standardized so that lower values reflect less bright beaks
* B2 (z) = z-transformed brightness
* S1R standard = Beak saturation at high wavelengths standardized so that lower values reflect less red beaks
* S1R (z) = z-transformed S1R
* S1U standard = Beak saturation at low wavelengths standardized so that lower values reflect less red beaks
* S1U (z) = z-transformed S1U
* H4B standard = Beak hue standardized so that lower values reflect less red beaks
* H4B (z) = z-transformed H4B
Female choice metadata:
* Control Bird = PBS denotes that all control males received phosphate buffered saline
* Treatment Bird = Treatment the male received (keyhole limpet hemocyanin (KLH) or lipopolysaccharide (LPS))
* Beak Wipes Control = # of beak wipes the female performed when on the control male side
* Beak Wipes Treatment = # of beak wipes the female performed when on the "treatment male" side
* Hops Control = # of hops the female performed when on the control male side
* Hops Treatment = # of hops the female performed when on the treatment male side
* Time Spent Near Control = Amount of time (sec) the female spent on the control male side
* Time Spent Near Treatment = Amount of time (sec) the female spent on the treatment male side
keywords: early-life; stress; immune response; phenotypic correlation; sexual signal; zebra finch; birdsongs; acoustic signals; beak coloration; mate selection
published: 2016-06-23
 
This dataset contains hourly traffic estimates (speeds) for individual links of the New York City road network for the years 2010-2013, estimated from New York City Taxis.
keywords: traffic estimates; traffic conditions; New York City
published: 2017-03-07
 
This is a sample 5-minute video of an E. coli bacterium swimming in a microfluidic chamber, along with some supplementary code files to be used with the Matlab code available at https://github.com/dfraebel/CellTracking
published: 2017-03-07
 
This is a sample 5-minute video of an E. coli bacterium swimming in a microfluidic chamber, to be used with the Matlab code available at https://github.com/dfraebel/CellTracking
published: 2016-05-19
 
This dataset contains records of four years of taxi operations in New York City and includes 697,622,444 trips. Each trip records the pickup and drop-off dates, times, and coordinates, as well as the metered distance reported by the taximeter. The trip data also includes fields such as the taxi medallion number, fare amount, and tip amount. The dataset was obtained through a Freedom of Information Law request from the New York City Taxi and Limousine Commission. The files in this dataset are optimized for use with the ‘decompress.py’ script included in this dataset. The script contains additional documentation and contact information that may help if you have trouble accessing the contents of the zip files.
keywords: taxi; transportation; New York City; GPS
published: 2016-06-23
 
This dataset was extracted from a set of metadata files harvested from the DataCite metadata store (https://search.datacite.org/ui) during December 2015. Metadata records for items with a resourceType of dataset were collected. 1,647,949 total records were collected. This dataset contains three files: 1) readme.txt: A readme file. 2) version-results.csv: A CSV file containing three columns: DOI, DOI prefix, and version text contents 3) version-counts.csv: A CSV file containing counts for unique version text content values.
keywords: datacite; metadata; version values; repository data
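The relationship between version-results.csv (one row per record) and version-counts.csv (one row per distinct version text) can be illustrated with a short tally. This is only a sketch under stated assumptions: it assumes the results file has a header row and that the version text sits in the third column, and the sample rows and DOIs below are made up for illustration, not taken from the dataset.

```python
import csv
import io
from collections import Counter


def count_values(csv_text, column_index=2):
    """Tally how often each distinct value appears in one column.

    Assumes the first row is a header and skips it.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the assumed header row
    counts = Counter()
    for row in reader:
        if len(row) > column_index:  # ignore short or blank rows
            counts[row[column_index]] += 1
    return counts


# Made-up sample rows standing in for version-results.csv.
sample_csv = (
    "DOI,DOI prefix,version\n"
    "10.0000/example-1,10.0000,1.0\n"
    "10.0000/example-2,10.0000,1.0\n"
    "10.0000/example-3,10.0000,v2\n"
)
version_counts = count_values(sample_csv)
for value, count in version_counts.most_common():
    print(value, count)
```

The same tally would apply to any of the per-field result files, provided the column position is adjusted to match the actual file.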
published: 2016-05-26
 
This data set includes survey responses collected during 2015 from academic libraries with library publishing services. Each institution responded to questions related to its use of user studies or information about readers in order to shape digital publication design, formats, and interfaces. Survey data was supplemented with institutional categories to facilitate comparison across institutional types.
keywords: academic libraries; publishing; user experience; user studies
published: 2017-03-08
 
This dataset includes early embryogenesis and post-embryonic development of Soybean cyst nematode.
keywords: Soybean cyst nematode; Embryogenesis; Post-embryonic development
published: 2017-03-02
 
These data were collected between 2004 and 2010 at White River National Wildlife Refuge (WRNWR) and Saint Francis National Forest (SF). They were collected as part of two master’s projects and one PhD project at Arkansas State University (USA) studying Swainson’s Warbler habitat use, survival, and body condition.
keywords: Swainson’s Warbler; Limnothlypis swainsonii; flooding; natural disturbance; apparent survival; body condition
published: 2017-02-28
 
Leesburg, VA to Indianapolis, Indiana:
Sampling Rate: 0.1 Hz
Total Travel Time: 31100007 ms or 518 minutes or 8.6 hours
Distance Traveled: 570 miles via I-70
Number of Data Points: 3112
Device used: Samsung Galaxy S4
Date Recorded: 2017-01-15
Parameters Recorded:
* ACCELEROMETER X (m/s²)
* ACCELEROMETER Y (m/s²)
* ACCELEROMETER Z (m/s²)
* GRAVITY X (m/s²)
* GRAVITY Y (m/s²)
* GRAVITY Z (m/s²)
* LINEAR ACCELERATION X (m/s²)
* LINEAR ACCELERATION Y (m/s²)
* LINEAR ACCELERATION Z (m/s²)
* GYROSCOPE X (rad/s)
* GYROSCOPE Y (rad/s)
* GYROSCOPE Z (rad/s)
* LIGHT (lux)
* MAGNETIC FIELD X (microT)
* MAGNETIC FIELD Y (microT)
* MAGNETIC FIELD Z (microT)
* ORIENTATION Z (azimuth °)
* ORIENTATION X (pitch °)
* ORIENTATION Y (roll °)
* PROXIMITY (i)
* ATMOSPHERIC PRESSURE (hPa)
* Relative Humidity (%)
* Temperature (F)
* SOUND LEVEL (dB)
* LOCATION Latitude
* LOCATION Longitude
* LOCATION Altitude (m)
* LOCATION Altitude-google (m)
* LOCATION Altitude-atmospheric pressure (m)
* LOCATION Speed (kph)
* LOCATION Accuracy (m)
* LOCATION ORIENTATION (°)
* Satellites in range
* GPS NMEA
* Time since start in ms
* Current time in YYYY-MO-DD HH-MI-SS_SSS format
Quality Notes: There are some things to note about the quality of this data set that you may want to consider during preprocessing. The data were recorded continuously, but the trip included multiple refueling stops during which recording did not pause. These stops can be removed by filtering out all data points with a speed of 0. The mount was fairly stable (as can be seen from the consistent orientation angle throughout the dataset): the phone was mounted tightly between two seats in the back of the vehicle. Unfortunately, the sampling frequency was set fairly low, at one sample every ten seconds.
keywords: smartphone; sensor; driving; accelerometer; gyroscope; magnetometer; gps; nmea; barometer; satellite; temperature; humidity
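The quality notes for this entry suggest removing the refueling stops by discarding every sample recorded at a speed of 0. A minimal sketch of that filter follows. It assumes the recording has been exported as CSV with a header row and that the speed column is headed `LOCATION Speed (kph)`, as in the parameter list; the exact header text in the real file may differ, and the sample values are made up.

```python
import csv
import io

# Assumed header name, taken from the parameter list; check the real file.
SPEED_COLUMN = "LOCATION Speed (kph)"


def drop_stationary(rows, speed_column=SPEED_COLUMN):
    """Return only the rows whose speed parses to a non-zero number."""
    moving = []
    for row in rows:
        try:
            speed = float(row[speed_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or unparseable speed
        if speed != 0.0:
            moving.append(row)
    return moving


# Tiny made-up sample standing in for the real export.
sample = io.StringIO(
    "Time since start in ms,LOCATION Speed (kph)\n"
    "0,0\n"
    "10000,95.5\n"
    "20000,0.0\n"
    "30000,101.2\n"
)
rows = list(csv.DictReader(sample))
moving_rows = drop_stationary(rows)
print(len(rows), "rows read,", len(moving_rows), "rows kept")
```

Rows with a missing or malformed speed are dropped here as well; whether that is appropriate depends on how the downstream analysis treats gaps.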
published: 2017-02-23
 
GBS data from diverse sorghum lines. Project funded by DOE, ARPA-E, and startup funds to PJ Brown.
published: 2017-05-01
 
Indianapolis Int'l Airport to Urbana:
Sampling Rate: 2 Hz
Total Travel Time: 5901534 ms or 98.4 minutes
Number of Data Points: 11805
Distance Traveled: 124 miles via I-74
Device used: Samsung Galaxy S6
Date Recorded: 2016-11-27
Parameters Recorded:
* ACCELEROMETER X (m/s²)
* ACCELEROMETER Y (m/s²)
* ACCELEROMETER Z (m/s²)
* GRAVITY X (m/s²)
* GRAVITY Y (m/s²)
* GRAVITY Z (m/s²)
* LINEAR ACCELERATION X (m/s²)
* LINEAR ACCELERATION Y (m/s²)
* LINEAR ACCELERATION Z (m/s²)
* GYROSCOPE X (rad/s)
* GYROSCOPE Y (rad/s)
* GYROSCOPE Z (rad/s)
* LIGHT (lux)
* MAGNETIC FIELD X (microT)
* MAGNETIC FIELD Y (microT)
* MAGNETIC FIELD Z (microT)
* ORIENTATION Z (azimuth °)
* ORIENTATION X (pitch °)
* ORIENTATION Y (roll °)
* PROXIMITY (i)
* ATMOSPHERIC PRESSURE (hPa)
* SOUND LEVEL (dB)
* LOCATION Latitude
* LOCATION Longitude
* LOCATION Altitude (m)
* LOCATION Altitude-google (m)
* LOCATION Altitude-atmospheric pressure (m)
* LOCATION Speed (kph)
* LOCATION Accuracy (m)
* LOCATION ORIENTATION (°)
* Satellites in range
* GPS NMEA
* Time since start in ms
* Current time in YYYY-MO-DD HH-MI-SS_SSS format
Quality Notes: There are some things to note about the quality of this data set that you may want to consider during preprocessing. This dataset was recorded continuously as a single trip; no stop was made for gas along the way, making this a very long continuous dataset. It starts in the parking lot of the Indianapolis International Airport and continues directly to a gas station on Lincoln Avenue in Urbana, IL. In a couple of parts of the trip, the phone's orientation had to be changed because my navigation cut out. These times are easy to account for based on changes in Orientation X/Y/Z. I would also advise cutting out the first couple hundred points, or the points leading up to highway speed. The phone was mounted in the cupholder in the front seat of the car.
keywords: smartphone; sensor; driving; accelerometer; gyroscope; magnetometer; gps; nmea; barometer; satellite
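The quality notes for this entry advise cutting out the points leading up to highway speed. One way to sketch that trimming step is to find the first sample at or above a speed threshold and keep everything from there on. The 90 kph threshold is an assumption chosen for illustration, not a value given by the dataset, and the speeds below are made up.

```python
def first_highway_index(speeds_kph, threshold_kph=90.0):
    """Return the index of the first sample at or above the threshold.

    Returns len(speeds_kph) if the threshold is never reached, so that
    slicing from the result yields an empty list.
    """
    for index, speed in enumerate(speeds_kph):
        if speed >= threshold_kph:
            return index
    return len(speeds_kph)


# Made-up speed samples (kph) standing in for the recorded speed column.
speeds = [0.0, 12.0, 35.5, 60.0, 92.3, 104.8, 88.0]
start = first_highway_index(speeds)
trimmed = speeds[start:]
print("dropped", start, "leading samples; kept", len(trimmed))
```

Only the leading samples are removed; later dips below the threshold are kept, since the notes single out the start of the trip.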
published: 2017-02-21
 
GBS data from biparental sorghum populations provided by Dr. Bill Rooney, TAMU. Data produced and analyzed by Pradeep Hirannaiah to study recombination in sorghum. Funding for this study was provided by the Sorghum Checkoff.
published: 2017-02-21
 
GBS data from diverse sorghum lines. Project funded by DOE, ARPA-E, and startup funds to PJ Brown.
published: 2017-06-01
 
List of Chinese Students Receiving a Ph.D. in Chemistry between 1905 and 1964. Based on two books compiling doctoral dissertations by Chinese students in the United States. Includes discipline, university, advisor, year degree awarded, birth and/or death date, and dissertation title. Accompanies Chapter 5: History of the Modern Chemistry Doctoral Program in Mainland China by Vera V. Mainz, published in "Igniting the Chemical Ring of Fire: Historical Evolution of the Chemical Communities in the Countries of the Pacific Rim", Seth Rasmussen, Editor. Published by World Scientific. Expected publication 2017.
keywords: Chinese; graduate student; dissertation; university; advisor; chemistry; engineering; materials science
published: 2017-01-06
 
This dataset includes early embryogenesis and post-embryonic development of Soybean cyst nematode.
keywords: Soybean cyst nematode; Embryogenesis; Post-embryonic development
published: 2017-06-01
 
List of Chinese Students Receiving a Ph.D. in Chemistry between 1905 and 1964. Based on two books compiling doctoral dissertations by Chinese students in the United States. Includes discipline, university, advisor, year degree awarded, birth and/or death date, and dissertation title. Accompanies Chapter 5: History of the Modern Chemistry Doctoral Program in Mainland China by Vera V. Mainz, published in "Igniting the Chemical Ring of Fire: Historical Evolution of the Chemical Communities in the Countries of the Pacific Rim", Seth Rasmussen, Editor. Published by World Scientific. Expected publication 2017.
keywords: Chinese; graduate student; dissertation; university; advisor; chemistry; engineering; materials science
published: 2016-12-20
 
Scripts and example data for AIDData (aiddata.org) processing in support of the forthcoming Nakamura dissertation. This dataset includes two sets of scripts and example data files from an aiddata.org data dump. Fuller documentation of the functionality of these scripts is in the readme file. Additional background information and a description of usage will be in the forthcoming Nakamura dissertation (link will be added when available). Data originally supplied by Nakamura. Python code and the readme file created by Wickes. Data included within this deposit are examples to demonstrate execution. Roughly, there are two Python scripts: keyword_search.py, designed to assist in finding records matching specific keywords, and matching_tool.ipynb, designed to assist in detecting which records are and are not contained within a keyword results file and an aiddata project data file.
keywords: aiddata; natural resources
published: 2016-12-19
 
Files in this dataset represent an investigation into use of the Library mobile app Minrva from May 2015 through December 2015. During this time interval, 45,975 API hits were recorded by the Minrva web server. The dataset includes the following: 1) a delineation of API hits to mobile app modules used in the Minrva app, by month, 2) a general analysis of Minrva app downloads relative to module use, and 3) the annotated data file providing associations from API hits to specific modules used, organized by month (May 2015 through December 2015).
keywords: API analysis; log analysis; Minrva Mobile App
published: 2016-12-18
 
This dataset contains the numerical simulation data from a computational study of cold front-related hydrodynamics in the Wax Lake delta. The numerical model used is ECOM-si.
keywords: Wax Lake delta; Hydrodynamics; Cold front
published: 2016-12-13
 
BAM files for founding strain (MG1655-motile) as well as evolved strains from replicate motility selection experiments in low-viscosity agar plates containing either rich medium (LB) or minimal medium (M63+0.18mM galactose)
published: 2016-12-02
 
This dataset enumerates the number of geocoded tweets captured in geographic rectangular bounding boxes around the metropolitan statistical areas (MSAs) defined for 49 American cities, during a four-week period in 2012 (between April and June), through the Twitter Streaming API. More information on MSA definitions: https://www.census.gov/population/metro/
keywords: human dynamics; social media; urban informatics; pace of life; Twitter; ecological correlation; individual behavior
planned publication date: 2017-12-12
 
This dataset includes both meteorology and oceanography data collected at stations (CSI03, CSI06, and CSI09) near the Gulf of Mexico from the LSU WAVCIS (Waves-Current-Surge Information System) lab. The associated data analysis visualization is also saved in separate directories.
keywords: WAVCIS; Gulf of Mexico; Meteorology; Oceanography
published: 2016-12-12
 
This dataset contains field measurements of currents at two stations (Big Hogs Bayou and Delta1) in the Wax Lake delta in November 2012 and February 2013.
keywords: Wax Lake delta; Currents
published: 2016-12-12
 
This dataset includes data on the Wax Lake delta from four public agencies: NGDC, USGS, NDBC, and NOAA CO-OPS. In addition to the original data, the processed data associated with the analyzed figures are also shared.
keywords: Wax Lake delta; NOAA CO-OPS; NGDC; USGS; NDBC
published: 2016-12-12
 
This dataset contains field measurements of water depth at the Wax Lake delta taken on 2012-12-01.
keywords: Wax Lake delta; Bathymetry
published: 2016-12-12
 
This dataset contains field measurements of water depth at the Wax Lake delta conducted in late 2012.
keywords: Wax Lake delta; Bathymetry
published: 2016-12-12
 
This dataset contains a topographic LIDAR survey (saved in “waxlake-lidar.img”) conducted over the Wax Lake delta, between longitudes −91.5848 and −91.292 degrees and latitudes 29.3647 and 29.6466 degrees. Unlike other elevation data, positive values in the LIDAR data indicate land elevation, while zero values indicate riverbed without specifying a water depth.
keywords: LIDAR; Wax Lake delta
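The value convention described for the LIDAR raster (positive means land elevation, zero means riverbed with no depth information) can be made explicit with a small classifier. This is a sketch only: it works on plain Python numbers rather than reading “waxlake-lidar.img”, the grid values are made up, and the label for negative values is an assumption, since the description does not say how they should be treated.

```python
# Survey extent as stated in the dataset description (degrees).
LON_MIN, LON_MAX = -91.5848, -91.292
LAT_MIN, LAT_MAX = 29.3647, 29.6466


def classify_cell(value):
    """Label one raster cell using the convention in the description."""
    if value > 0:
        return "land"
    if value == 0:
        return "riverbed"
    return "unspecified"  # negative values are not covered by the description


def in_survey_extent(lon, lat):
    """Return True if a point lies inside the stated survey bounding box."""
    return LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX


# Made-up 2x2 grid standing in for raster values.
grid = [[0.0, 0.4], [1.2, 0.0]]
labels = [[classify_cell(value) for value in row] for row in grid]
print(labels)
print(in_survey_extent(-91.4, 29.5))
```

Because zero marks riverbed rather than sea level, these cells should not be averaged with bathymetric depths from the other Wax Lake delta datasets without first masking them.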
published: 2016-11-30
 
This is the dataset used in the BioScience publication of the same name. More information about this dataset: Interested parties can request data from the Critical Trends Assessment Program, which was the source for the data on natural areas in this study. More information on the program and data requests can be obtained by visiting the program webpage: Critical Trends Assessment Program, Illinois Natural History Survey. http://wwx.inhs.illinois.edu/research/ctap/
These spatial datasets were used for analyses:
* Illinois Natural History Survey. 2003. Illinois GAP analysis land cover classification 1999-2000, 1:100 000 Scale, Raster Digital Data, Version 2.0. Champaign, IL, USA.
* Illinois State Geological Survey. 1995. Illinois Landcover Thematic Map Coverage Map 1991-1995. Champaign, IL, USA.
* Illinois State Geological Survey. 2001. Illinois Landcover Thematic Map Coverage Map 1999-2000. Champaign, IL, USA.
* USDA National Agricultural Statistics Service Cropland Data Layer. 1999-2015. Published crop-specific data layer [Online]. Available at https://nassgeodata.gmu.edu/CropScape/. USDA-NASS, Washington, DC.
Information on agricultural practices and landcover changes was derived from the following U.S. Department of Agriculture (USDA) resources:
* USDA Economic Research Service. 2016. Adoption of Genetically Engineered Crops in the U.S. Available at http://www.ers.usda.gov/data-products/. USDA-ERS, Washington, DC.
* USDA Natural Resources Conservation Service. 2015. Summary Report: 2012 National Resources Inventory. https://www.nrcs.usda.gov/Internet/FSE_DOCUMENTS/nrcseprd396218.pdf. USDA-NRCS, Washington, DC, and Center for Survey Statistics and Methodology, Iowa State University, Ames, Iowa.
keywords: Milkweed; Monarch Butterfly; CTAP Critical Trends Assessment Program; BioScience
published: 2016-11-28
 
These show the topography and relief of the Precambrian surface of the Cratonic Platform of the United States.
keywords: precambrian; geology; relief; elevation
published: 2016-06-06
 
These datasets represent first-time collaborations between first and last authors (with mutually exclusive publication histories) on papers with 2 to 5 authors in years [1988,2009] in PubMed. Each record of each dataset captures aspects of the similarity, nearness, and complementarity between two authors about the paper marking the formation of their collaboration.
published: 2016-08-16
 
This archive contains all the alignments and trees used in the HIPPI paper [1]. The pfam.tar archive contains the PFAM families used to build the HMMs and BLAST databases. The file structure is: ./X/Y/initial.fasttree ./X/Y/initial.fasta where X is a Pfam family and Y is the cross-fold set (0, 1, 2, or 3). Each folder contains two files: initial.fasta, which is the Pfam reference alignment with 1/4 of the seed alignment removed, and initial.fasttree, the FastTree-2 ML tree estimated on initial.fasta. The query.tar archive contains the query sequences for each cross-fold set. The query sequences associated with cross-fold set Y are labeled query.Y.Z.fas, where Z is the fragment length (1, 0.5, or 0.25). The query files are found in the splits directory. [1] Nguyen, Nam-Phuong D, Mike Nute, Siavash Mirarab, and Tandy Warnow. (2016) HIPPI: Highly Accurate Protein Family Classification with Ensembles of HMMs. To appear in BMC Genomics.
keywords: HIPPI dataset; ensembles of profile Hidden Markov models; Pfam
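The file layout described for this archive can be turned into path names programmatically. The sketch below builds the expected reference and query file names for one family and cross-fold set. The family name `PF00001` is a hypothetical placeholder, and the assumption that the fragment length appears in the file name exactly as written in the description (`1`, `0.5`, `0.25`) should be checked against the extracted archive.

```python
from pathlib import PurePosixPath

FOLDS = (0, 1, 2, 3)
# Assumed to appear verbatim in file names; verify against the archive.
FRAGMENT_LENGTHS = ("1", "0.5", "0.25")


def reference_paths(family, fold):
    """Return the (alignment, tree) paths for Pfam family X and fold Y."""
    base = PurePosixPath(family) / str(fold)
    return base / "initial.fasta", base / "initial.fasttree"


def query_name(fold, fragment_length):
    """Return the query file name following the query.Y.Z.fas pattern."""
    return f"query.{fold}.{fragment_length}.fas"


# "PF00001" is a hypothetical family name used only for illustration.
alignment, tree = reference_paths("PF00001", 2)
print(alignment)
print(tree)
for length in FRAGMENT_LENGTHS:
    print(query_name(2, length))
```

Looping over `FOLDS` and `FRAGMENT_LENGTHS` in this way gives twelve query file names per run, which is a quick check that an extraction is complete.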
published: 2016-08-18
 
Copyright Review Management System renewals by year, data from Table 2 of the article "How Large is the ‘Public Domain’? A comparative Analysis of Ringer’s 1961 Copyright Renewal Study and HathiTrust CRMS Data."
keywords: copyright; copyright renewals; HathiTrust
published: 2016-08-02
 
These data are the result of a multi-step process aimed at enriching BIBFRAME RDF with linked data. The process takes in an initial MARC XML file, transforms it to BIBFRAME RDF/XML, and then runs four separate Python scripts, corresponding to the BIBFRAME 1.0 model (Work, Instance, Annotation, and Authority), over the BIBFRAME RDF/XML output. The inputs and outputs of each step are included in this data set. Input file types include CSV, MARC XML, and Master RDF/XML files. The CSV files contain bibliographic identifiers for e-books. From the CSV files, a set of MARC XML files is generated. The MARC XML files are used to produce the Master RDF file set. The enrichment code produces BIBFRAME linked data as its major outputs: Annotation RDF, Instance RDF, Work RDF, and Authority RDF.
keywords: BIBFRAME; Schema.org; linked data; discovery; MARC; MARCXML; RDF
published: 2016-07-22
 
Datasets and R scripts relating to the manuscript "Ecological characteristics and in situ genetic associations for yield-component traits of wild Miscanthus from eastern Russia" published in Annals of Botany, 10.1093/aob/mcw137. Field data, including collection locations, physical and ecological information for each location, and plant phenotypes relating to biomass, are included. Genetic data in this repository include single nucleotide polymorphisms (SNPs) derived from restriction site-associated DNA sequencing (RAD-seq), as well as plastid microsatellites. A file is also included listing the DNA sequences of all RAD-seq markers generated to date by the Sacks lab, including those from this publication.
keywords: Miscanthus sacchariflorus; Miscanthus sinensis; Russia; germplasm; RAD-seq; SNP
published: 2016-05-16
 
This dataset contains the protein sequences and trees used to compare NRPS condensation domains in the AMB gene cluster and was used to create figure S1 in Rojas et al. 2015. Instead of having to collect representative sequences independently, this set of condensation domain sequences may serve as a quick reference set for coarse classification of condensation domains.
keywords: condensation domain; NRPS; biosynthetic gene cluster; antimetabolite; Pseudomonas; oxyvinylglycine; secondary metabolite; thiotemplate; toxin
published: 2016-06-23
 
This dataset was extracted from a set of metadata files harvested from the DataCite metadata store (http://search.datacite.org/ui) during December 2015. Metadata records for items with a resourceType of dataset were collected. 1,647,949 total records were collected. This dataset contains four files: 1) readme.txt: a readme file. 2) language-results.csv: A CSV file containing three columns: DOI, DOI prefix, and language text contents 3) language-counts.csv: A CSV file containing counts for unique language text content values. 4) language-grouped-counts.txt: A text file containing the results of manually grouping these language codes.
keywords: datacite; metadata; language codes; repository data