published: 2020-01-20
These datasets provide the basis of our analysis in the paper "Revising the Ozone Depletion Potentials for Short-Lived Chemicals such as CF3I and CH3I." All datasets here are model output (CAM4-chem). All simulations (background and perturbation) were run to steady state, and only the last-year outputs used in the analysis are archived here.
keywords: Illinois Data Bank; NetCDF; Ozone Depletion Potential; CF3I and CH3I
planned publication date: 2020-12-12
Dataset associated with Jones et al FE-2019-01175 submission: Does the size and developmental stage of traits at fledging reflect juvenile flight ability among songbirds? Excel CSV files with all of the data used in analyses and file with descriptions of each column. The flight ability variable in this dataset was derived from fledgling drop tests, examples of which can be found in the related dataset: Jones, Todd M.; Benson, Thomas J.; Ward, Michael P. (2019): Flight Ability of Juvenile Songbirds at Fledgling: Examples of Fledgling Drop Tests. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2044905_V1.
keywords: body condition; fledgling; flight ability; locomotor ability; post-fledging; songbirds; wing development; wing emergence
published: 2020-01-08
These are abundance dynamics data and simulations for the paper "Higher-order interaction between species inhibits bacterial invasion of a phototroph-predator microbial community."
keywords: Microbial community; Higher order interaction; Invasion; Algae; Bacteria; Ciliate
planned publication date: 2020-06-03
This dataset provides files for use in analysis of human land preference across Australasia, and in a localized analysis of land preference in Laos and Vietnam. All files can be imported into ArcGIS for visualization, and re-analyzed using the open source Maxent species distribution modeling program. CSV files contain known human presence sites for model validation. ASC files contain geographically coded environmental data for mean annual temperature and mean annual precipitation during the Last Glacial Maximum, as well as downward slope data. All ASC files are in the WGS 1984 Mercator map projection for visualization in ArcGIS and can be opened as text files in text editors supporting large file sizes.
keywords: human dispersal; ecological niche modeling; Australasia; Late Pleistocene; land preference
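Because the ASC files in the entry above are plain text, their grid header can be inspected without ArcGIS. The sketch below assumes the files follow the common Esri ASCII grid layout (header lines such as ncols, nrows, xllcorner, yllcorner, cellsize, and an optional NODATA_value, followed by rows of cell values). The function name read_asc and the sample grid are illustrative and are not taken from the dataset.

```python
import io

def read_asc(stream):
    """Parse an Esri ASCII grid from a text stream.

    Returns (header, rows): header maps lower-cased keys to floats,
    rows is a list of lists with NODATA cells replaced by None.
    """
    header = {}
    rows = []
    for line in stream:
        parts = line.split()
        if not parts:
            continue
        if parts[0][0].isalpha():
            # Header line, e.g. "ncols 3"
            header[parts[0].lower()] = float(parts[1])
        else:
            # Data line: compare each cell with the declared NODATA value
            nodata = header.get("nodata_value")
            rows.append([None if float(v) == nodata else float(v) for v in parts])
    return header, rows

# Tiny made-up grid, used only to exercise the parser.
SAMPLE = """ncols 3
nrows 2
xllcorner 100.0
yllcorner 10.0
cellsize 0.5
NODATA_value -9999
21.5 22.0 -9999
20.1 -9999 19.8
"""

header, rows = read_asc(io.StringIO(SAMPLE))
print(header["ncols"], header["nrows"], rows[0])
```

Reading line by line keeps memory use low for header inspection, which matters because the description notes that the files are large.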
planned publication date: 2020-02-01
These data describe habitat use, availability, landscape-level influences, and daily movement of dabbling ducks in the Wabash River Valley of southeastern Illinois and southwestern Indiana. The dataset contains triangulated locations of individual ducks, associated habitat assignments of those locations, flood survey data to determine water availability, and randomly generated points to assess landscape-level questions.
keywords: waterfowl; ducks; dabbling; mallard; teal; habitat
published: 2019-12-22
Dataset providing calculation of a Competition Index (CI) for Late Pleistocene carnivore guilds in Laos and Vietnam and their relationship to humans. Prey mass spectra, prey focus masses, and prey class raw data can be used to calculate the CI following Hemmer (2004). Mass estimates were calculated for each species following Van Valkenburgh (1990). Full citations to methodological papers are included as relationships with other resources.
keywords: competition; Southeast Asia; carnivores; humans
published: 2019-12-20
This dynamic photosynthesis model of a soybean canopy was developed by Yu Wang (yuwangcn@illinois.edu), IGB, University of Illinois. For more details, please see the following publication: Yu Wang, Steven J. Burgess, Elsa de Becker, Stephen P. Long. Photosynthesis in the fleeting shadows: An overlooked opportunity for increasing crop productivity? The Plant Journal.
keywords: Matlab; Soybean canopy; photosynthesis model
published: 2019-12-17
This dataset provides the raw data, code, and related figures for the paper "Channel Activation of CHSH Nonlocality."
keywords: Super-activation; Non-locality breaking channel
planned publication date: 2020-12-01
This is the data set from the published manuscript 'Vertebrate scavenger guild composition and utilization of carrion in an East Asian temperate forest' by Inagaki et al.
keywords: Japan; Sika Deer
published: 2019-12-12
This dataset contains gamma-ray spectra templates for a source interdiction and uranium enrichment measurement task. This dataset also contains Keras machine learning models trained using datasets created using these templates.
keywords: gamma-ray spectroscopy; neural networks; machine learning; isotope identification; uranium enrichment; sodium iodide; NaI(Tl)
published: 2019-12-10
The dataset consists of two types of data: an estimate of land productivity (the maximum productivity, MP) and an estimate of land that has low productivity for any major crop planted in the Contiguous United States (CONUS) and may therefore be available for growing bioenergy crops (the marginal land, ML). All data items are in GeoTIFF format, in the World Geodetic System (WGS) 84 projection, with a resolution of 0.0020810045 degree (~250 m). The MP values are calculated from yields of major crops in the CONUS estimated by a machine learning model, and are provided as an expected value (MP_mean.tif) and an associated uncertainty (MP_IDP.tif). The ML availability data have two versions: a deterministic version and a version with uncertainty. The deterministic MLs are the land pixels whose expected MP values fall in the range defined by the criteria below; the MLs with uncertainty give the probability that the MP value of a land pixel falls in the range defined by the same criteria:
Criteria    Description
S1          Current crop and pasture land with MP <= P50
S2          Current crop and pasture land with MP <= P25
S3          S1 + current grass and shrub land with P25 < MP < P50
S4          S2 + current grass and shrub land with P10 < MP < P25
Economic    Current crop and pasture land with potential profitability < 0
Here P10, P25 and P50 are the 10th, 25th and 50th percentiles of crop MP values.
keywords: Land productivity; marginal land; land use
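The deterministic S1 to S4 criteria in the entry above can be read as a per-pixel rule. The following is a minimal sketch of that reading, not the authors' code: the function name, the land-cover labels "crop_pasture" and "grass_shrub", and the threshold values in the example are all hypothetical, and the Economic criterion is left out because it needs profitability data that the description does not specify.

```python
def marginal_land_classes(mp, land_cover, p10, p25, p50):
    """Return the set of criteria (S1-S4) that a single pixel satisfies.

    mp          expected maximum productivity of the pixel
    land_cover  "crop_pasture" or "grass_shrub" (hypothetical labels)
    p10/p25/p50 percentiles of crop MP values, supplied by the caller
    """
    classes = set()
    if land_cover == "crop_pasture":
        if mp <= p50:
            classes.add("S1")
        if mp <= p25:
            classes.add("S2")
    # S3 and S4 extend S1 and S2 with grass and shrub land in a given MP band
    if "S1" in classes or (land_cover == "grass_shrub" and p25 < mp < p50):
        classes.add("S3")
    if "S2" in classes or (land_cover == "grass_shrub" and p10 < mp < p25):
        classes.add("S4")
    return classes

# Illustrative thresholds only; real values come from the crop MP distribution.
P10, P25, P50 = 1.0, 2.0, 4.0
print(sorted(marginal_land_classes(3.0, "crop_pasture", P10, P25, P50)))
print(sorted(marginal_land_classes(1.5, "grass_shrub", P10, P25, P50)))
```

Passing the percentiles in as arguments avoids committing to a particular percentile estimator, which the description does not name.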
planned publication date: 2020-06-01
Dataset associated with Hoover et al AUK-19-093 submission: Local conspecific density does not influence reproductive output in a secondary cavity-nesting songbird. Excel CSV with all of the data used in analyses.
Description of variables:
YEARS: year
ORDINAL_DATE: number for what day of the year it is with 1 January = 1, ..., 30 December = 365
SITE: acronym for each study site
BOX: unique nest box identifier on each study site
TREAT: designates whether nest box was in a high- or low- nest box density area within each study site
ACTUAL_NO_NEIGHBORS: number of pairs of warblers using a nest box within 200 m of a given pair’s nest box
CLUTCH_SIZE: number of warbler eggs in nest at the onset of incubation
PROWN: number of warbler nestlings once eggs have hatched
PROWF: number of warbler nestlings that fledged out of the nest box
HATCH_SUCCESS: proportion of eggs in the nest that hatched
FLEDG_SUCCESS: proportion of the nestlings that fledged from the nest box
HATCH_SUCCESS2: binary category where “0” indicates there was some, and “1” indicates there was no hatching failure
FLEDG_SUCCESS2: binary category where “0” indicates there was some, and “1” indicates there was no nestling failure (i.e. nestling death)
BHCO_PARASIT2: binary category where “0” indicates no cowbird parasitism, and “1” indicates there was cowbird parasitism
BHCOE: number of cowbird eggs in clutch
BHCOF: number of cowbird nestlings that fledged from the nest
PAIRID: unique number that identifies a male and female warbler that are together at a nest box; this number is the same in a subsequent nesting attempt or year if the same male and female are together again
FEMALE_ID: unique identifier for each female which represents her leg band combination. Each letter represents a band with letters preceding the hyphen being on the right leg and after the hyphen the left leg
FEM_AGE: binary category where “0” indicates a 1-year-old bird and “1” indicates a >1-year-old bird
FEMALE_BREEDING_ATTEMPT: “1” indicates first, “2” indicates second, ... breeding attempt within a given year
SECOND_ATTEMPT: for any female that fledged a brood in a given year, binary category where “0” represents that they did not, and “1” indicates that they did attempt a second brood that year
F_TOT_PROWF: total reproductive output (number of warbler fledglings produced) for a given female in a given year
MALE_ID: unique identifier for each male which represents his leg band combination. Each letter represents a band with letters preceding the hyphen being on the right leg and after the hyphen the left leg
MALE_AGE2: binary category where “0” indicates a 1-year-old bird and “1” indicates a >1-year-old bird
Provisioning_rate: total number of food provisions per nestling per hour by male and female warbler combined
BROOD_MASS: average nestling mass (g) for the brood
BROOD_TARSUS: average nestling tarsus length (mm) for the brood
Brood_condition: unit-less index of nestling condition that uses the residuals of the BROOD_MASS/BROOD_TARSUS relationship
A period (“.”) represents where data were not collected, not available, or because individual nest or female did not qualify for consideration of a category assignment. An empty cell represents no data available for this particular cell.
keywords: conspecific density; density dependence; food limitation; hatching success; nestling body condition; nestling provisioning; Prothonotary Warbler; reproductive output
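The missing-value conventions in the entry above (a period or an empty cell) need explicit handling when the CSV is read. The sketch below shows one way to do that with the Python standard library. The sample rows and BOX values are invented, and computing a hatching proportion as PROWN divided by CLUTCH_SIZE is an assumption about how such a proportion could be derived, not a statement of how HATCH_SUCCESS was calculated by the authors.

```python
import csv
import io

MISSING = {".", ""}  # both markers described for this dataset

def to_number(cell):
    """Convert a cell to float, mapping '.' and empty cells to None."""
    cell = cell.strip()
    return None if cell in MISSING else float(cell)

def hatch_proportion(row):
    """PROWN / CLUTCH_SIZE, or None when a value is missing or the clutch is empty."""
    eggs = to_number(row["CLUTCH_SIZE"])
    nestlings = to_number(row["PROWN"])
    if eggs is None or nestlings is None or eggs == 0:
        return None
    return nestlings / eggs

# Invented rows, used only to exercise the missing-value handling.
SAMPLE = """BOX,CLUTCH_SIZE,PROWN
A1,5,4
A2,.,3
A3,4,
"""

proportions = [hatch_proportion(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
print(proportions)
```

Returning None instead of zero keeps nests with missing counts out of any later averages.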
published: 2019-12-10
The Cline Center Historical Phoenix Event Data covers the period 1945-2019 and includes several million events extracted from 18.9 million news stories. This data was produced using the state-of-the-art PETRARCH-2 software to analyze content from the New York Times (1945-2018), the BBC Monitoring's Summary of World Broadcasts (1979-2019) and the Central Intelligence Agency’s Foreign Broadcast Information Service (1995-2004). It documents the agents, locations, and issues at stake in a wide variety of conflict, cooperation and communicative events in the CAMEO ontology. The Cline Center produced this data with the generous support of Linowes Fellow and Faculty Affiliate Prof. Dov Cohen and help from our academic and private sector collaborators in the Open Event Data Alliance (OEDA).
keywords: OEDA; Open Event Data Alliance (OEDA); Cline Center; Cline Center for Advanced Social Research; civil unrest; petrarch; phoenix event data; violence; protest; political; social; political science
published: 2019-12-10
Data files associated with the assembly of mitochondrial minicircles from five species of parasitic lice. This includes data from four species in the genus Columbicola and from the human louse (Pediculus humanus). The files include FASTA sequences for all five species, reference sequences for read mapping approaches, resulting contigs produced by various assembly approaches, and alignments of human louse minicircles mapped to published sequences of the same species.
keywords: mitochondria; FASTA; nucleotide sequences; alignment; Columbicola; Pediculus
published: 2019-12-03
This is the data set associated with the manuscript titled "Extensive host-switching of avian feather lice following the Cretaceous-Paleogene mass extinction event." Included are the gene alignments used for phylogenetic analyses and the cophylogenetic input files.
keywords: phylogenomics; cophylogenetics; feather lice; birds
published: 2019-12-03
These are the alignments of transcriptome data used for the analysis of members of Heteroptera. This dataset is analyzed in "Deep instability in the phylogenetic backbone of Heteroptera is only partly overcome by transcriptome-based phylogenomics" published in Insect Systematics and Diversity.
keywords: Heteroptera; Hemiptera; Phylogenomics; transcriptome
planned publication date: 2020-04-22
Nest survival and fledgling production data for Bell's Vireo and Willow Flycatcher nests.
keywords: Bell's Vireo; Willow Flycatcher; habitat selection; fitness
published: 2019-11-18
VCF files used to analyze a novel filtering tool VEF, presented in the article "VEF: a Variant Filtering tool based on Ensemble methods".
keywords: VCF files; filtering; VEF
published: 2019-11-11
This repository includes scripts and datasets for the paper, "Polynomial-Time Statistical Estimation of Species Trees under Gene Duplication and Loss."
keywords: Species tree estimation; gene duplication and loss; identifiability; statistical consistency; ASTRAL
published: 2019-11-11
This repository includes scripts and datasets for the paper, "FastMulRFS: Statistically consistent polynomial time species tree estimation under gene duplication."
keywords: Species tree estimation; gene duplication and loss; statistical consistency; MulRF; FastRFS
published: 2019-10-27
This dataset accompanies the paper "STREETS: A Novel Camera Network Dataset for Traffic Flow" at Neural Information Processing Systems (NeurIPS) 2019. Included are:
* Over four million still images from publicly accessible cameras in Lake County, IL. The images were collected across 2.5 months in 2018 and 2019.
* Directed graphs describing the camera network structure in two communities in Lake County.
* Documented non-recurring traffic incidents in Lake County coinciding with the 2018 data.
* Traffic counts for each day of images in the dataset. These counts track the volume of traffic in each community.
* Other annotations and files useful for computer vision systems.
Refer to the accompanying "readme.txt" or "readme.pdf" for further details.
keywords: camera network; suburban vehicular traffic; roadways; computer vision
published: 2019-10-16
Human annotations of randomly selected judged documents from the AP 88-89, Robust 2004, WT10g, and GOV2 TREC collections. Seven annotators were asked to read documents in their entirety and then select up to ten terms they felt best represented the main topic(s) of the document. Terms were chosen from among a set sampled from the document in question and from related documents.
keywords: TREC; information retrieval; document topicality; document description
published: 2019-10-05
This dataset contains collected and aggregated network information from NCSA’s Blue Waters system, which comprises 27,648 nodes connected via a Cray Gemini 3D torus (dimension 24x24x24) interconnect, from Jan/01/2017 to May/31/2017. Network performance counters for links are exposed via Cray's gpcdr kernel module (https://github.com/ovis-hpc/ovis/wiki/gpcdr-kernel-module). The Lightweight Distributed Metric Service (LDMS, https://github.com/ovis-hpc/ovis) is used to sample the performance counters at 60-second intervals. Please read the "README.md" file. Acknowledgement: This dataset was collected as a part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation and the state of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications.
keywords: HPC; Interconnect; Network; Congestion; Blue Waters; Dataset
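In a 24x24x24 3D torus like the one described in the entry above, every router position has six neighbors, and links wrap around at the edges of each dimension. The sketch below illustrates that wraparound. The (x, y, z) coordinate convention and the function name are assumptions made for illustration and do not come from the dataset's README. The node count is consistent with two nodes sharing each router position, since 2 x 24^3 = 27,648.

```python
DIMS = (24, 24, 24)  # torus dimensions stated in the dataset description

def torus_neighbors(coord, dims=DIMS):
    """Return the six neighbors of coord in a 3D torus with wraparound."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            # Modulo arithmetic wraps index -1 to dims-1 and dims to 0
            n[axis] = (n[axis] + step) % dims[axis]
            neighbors.append(tuple(n))
    return neighbors

router_count = DIMS[0] * DIMS[1] * DIMS[2]
print(router_count, 2 * router_count)
print(torus_neighbors((0, 0, 23)))
```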
published: 2019-10-30
Data used for network analyses.
deduplicated nodes.csv - List of all nodes in the network. Each node represents a paper.
Column headings: ID, Authors, Title, Year
Descriptions of column headings:
ID - ID for the paper/node. IDs follow these conventions:
  - A00 represents the retracted Matsuyama paper.
  - F### represents a first-generation citation that directly cites the retracted Matsuyama paper.
  - F###S### represents a second-generation citation that does not cite the retracted Matsuyama paper but that cites some first-generation citation (where F### is one of the first-generation articles it cites).
Authors - Authors of the paper
Title - Title of the paper (some in Unicode)
Year - Year of publication of the paper; either a 4-digit year or NA (which indicates that the first data source we got it from either did not provide a year, or provided a year that we deemed unreliable)
NOTE: Authors/Title/Year were taken primarily from Google Scholar (since it had the larger number of items) with unique items from Web of Science added.
-------
deduplicated edges.csv - List of all edges in the network. Each edge represents a citation between two papers.
Column headings: from, to
Descriptions of column headings:
from - ID for the cited paper. This is what the citation goes FROM.
to - ID for the citing paper. This is what the citation goes TO.
NOTE: All IDs are from deduplicated nodes.csv and follow the conventions above.
-------
nodesFG.txt - List of the IDs for the 135 first-generation citations, from 2005 (when Matsuyama was published) through 2018.
-------
nodesSGnotFG.txt - List of the IDs for the 2559 second-generation citations that are not first-generation citations, from 2005 (when Matsuyama was published) through 2018.
keywords: citations; retraction; network analysis; Web of Science; Google Scholar; indirect citation
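The ID conventions in the entry above (A00, F###, F###S###) make it possible to recover a node's generation from its ID alone. The following is a small sketch under the assumption that each # stands for a decimal digit; the function names and the example IDs are illustrative and are not taken from the data files.

```python
import re

# Patterns follow the conventions in the description; \d+ is used instead of
# exactly three digits as a lenient assumption.
PATTERNS = [
    ("retracted", re.compile(r"A00")),
    ("first_generation", re.compile(r"F\d+")),
    ("second_generation", re.compile(r"F\d+S\d+")),
]

def classify_node(node_id):
    """Return the generation label for a node ID, or 'unknown'."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(node_id):
            return label
    return "unknown"

def first_generation_parent(node_id):
    """For an F###S### ID, return the embedded F### part; otherwise None."""
    match = re.fullmatch(r"(F\d+)S\d+", node_id)
    return match.group(1) if match else None

for example in ("A00", "F012", "F012S345"):
    print(example, classify_node(example), first_generation_parent(example))
```

Using fullmatch rather than match prevents an F###S### ID from being mistaken for a first-generation ID by its prefix.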
published: 2019-11-12
We are sharing the tweet IDs for four social movements: #BlackLivesMatter, #WhiteLivesMatter, #AllLivesMatter, and #BlueLivesMatter. The tweets were collected between May 1, 2015 and May 30, 2017. We limited the location to the United States and extracted only original tweets, excluding retweets. Recommended citations for the data: Rezapour, R. (2019). Data for: How do Moral Values Differ in Tweets on Social Movements?. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9614170_V1 and Rezapour, R., Ferronato, P., and Diesner, J. (2019). How do moral values differ in tweets on social movements?. In 2019 Computer Supported Cooperative Work and Social Computing Companion Publication (CSCW’19 Companion), Austin, TX.
keywords: Twitter; social movements; black lives matter; blue lives matter; all lives matter; white lives matter