Illinois Data Bank Dataset Search Results


published: 2020-08-22
 
We are releasing the tracing dataset of four microservice benchmarks deployed on our dedicated Kubernetes cluster of 15 heterogeneous nodes. The dataset is not sampled and covers selected request types in each benchmark: compose-posts in the social network application, compose-reviews in the media service application, book-rooms in the hotel reservation application, and reserve-tickets in the train ticket booking application. The four microservice applications come from [DeathStarBench](https://github.com/delimitrou/DeathStarBench) and [Train-Ticket](https://github.com/FudanSELab/train-ticket). The performance anomaly injector is from [FIRM](https://gitlab.engr.illinois.edu/DEPEND/firm.git). The dataset was preprocessed from the raw data generated by FIRM's tracing system. It is split by the microservice component on which the performance anomaly is located (as the file names indicate). Each file is in CSV format, with fields separated by commas. Each line consists of the tracing ID and the duration (in 10^(-3) ms) of each component. Execution paths are specified in `execution_paths.txt` in each directory.
keywords: Microservices; Tracing; Performance
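A minimal sketch of how one of these CSV files might be read, assuming each row holds the tracing ID in the first field and one duration per component in the remaining fields. Whether the files carry a header row is not stated, so the `has_header` flag, the function name `read_traces`, and the sample rows are all hypothetical. Durations are converted from 10^(-3) ms to ms.

```python
import csv
import io


def read_traces(handle, has_header=False):
    """Yield (trace_id, [duration_ms, ...]) for each row of a trace CSV.

    Assumption: first field is the tracing ID, remaining fields are
    per-component durations in units of 10^-3 ms.
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    for row in reader:
        if not row:
            continue  # ignore blank lines
        trace_id, raw_durations = row[0], row[1:]
        # Divide by 1000 to convert from 10^-3 ms to ms.
        yield trace_id, [float(value) / 1000.0 for value in raw_durations]


# Hypothetical sample rows, not taken from the dataset.
sample = io.StringIO("trace-a,1500,250,4000\ntrace-b,1200,300,3900\n")
traces = list(read_traces(sample))
```

For a real file, the same function can be called on `open(path, newline="")` in place of the in-memory sample.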
published: 2020-10-16
 
Video footage of an Eastern Box Turtle (Terrapene carolina carolina) partially predating a Field Sparrow (Spizella pusilla) nest at 0845 h on 31 May 2020. Please note that the date on the video footage is incorrect due to user error, but the time is correct.
keywords: nest predation; turtle; songbird; nest camera; Terrapene carolina carolina; Spizella pusilla
published: 2020-12-30
 
High-speed X-ray videos of four E. abruptus specimens recorded at the Advanced Photon Source (Argonne National Laboratory) in the summer of 2018 and corresponding position data of landmarks tracked during the motion. See the readme file for more details.
published: 2020-12-31
 
This dataset contains the amino acid and nucleotide alignments corresponding to the phylogenetic analyses of South et al. 2020 in Systematic Entomology. This dataset also includes the gene trees that were used as input for coalescent analysis in ASTRAL.
keywords: Plecoptera; stoneflies; phylogeny; insects
published: 2020-11-25
 
Video recorded by Louise Barker using a Canon PowerShot camera documents late-season combat behavior in Agkistrodon contortrix. Recorded in Beaufort County, North Carolina, 11.1 km SE of downtown Washington on 21 October 2020.
keywords: Agkistrodon contortrix; combat; mating; reproduction; copperhead; pit viper; Viperidae
published: 2020-12-15
 
The dataset consists of results and various input data that are used in the GAMS model for the publication "Repeal of the Clean Power Plan: Social Cost and Distributional Implications". All the data are either Excel files or files in the .inc format, which can be read within GAMS or Notepad. Main data sources include agriculture, transportation, and electricity data. Model details can be found in the paper and the GAMS model package.
keywords: carbon abatement; welfare cost; electricity sector; partial equilibrium model
published: 2021-01-23
 
Data sets from "Comparing Methods for Species Tree Estimation With Gene Duplication and Loss." They contain data simulated with gene duplication and loss under a variety of conditions.
keywords: gene duplication and loss; species-tree inference
published: 2019-10-23
 
Raw MD simulation trajectory, input and configuration files, SEM current data, and experimental raw data accompanying the publication, "Electrical recognition of the twenty proteinogenic amino acids using an aerolysin nanopore". README.md contains a description of all associated files.
keywords: molecular dynamics; protein sequencing; aerolysin; nanopore sequencing
published: 2019-10-05
 
This dataset contains collected and aggregated network information from NCSA’s Blue Waters system, which comprises 27,648 nodes connected via a Cray Gemini 3D torus (dimension 24x24x24) interconnect, from Jan/01/2017 to May/31/2017. Network performance counters for links are exposed via Cray's gpcdr (<a href="https://github.com/ovis-hpc/ovis/wiki/gpcdr-kernel-module">https://github.com/ovis-hpc/ovis/wiki/gpcdr-kernel-module</a>) kernel module. Lightweight Distributed Metric Service (LDMS, <a href="https://github.com/ovis-hpc/ovis">https://github.com/ovis-hpc/ovis</a>) is used to sample the performance counters at 60-second intervals. Please read the "README.md" file. <b>Acknowledgement:</b> This dataset is collected as a part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation and the state of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications.
keywords: HPC; Interconnect; Network; Congestion; Blue Waters; Dataset
published: 2021-11-19
 
This is a general description of the datasets included in this upload; details of each dataset can be found in the README.txt inside each compressed folder. We have:
1. ROSE-HF.tar.gz
2. ROSE-LF.tar.gz
HF (high fragmentary): 50% of the sequences are made fragmentary, with the fragments averaging 25% of the original sequence length and a standard deviation of 60 bp.
LF (low fragmentary): 25% of the sequences are made fragmentary, with the fragments averaging 50% of the original sequence length and a standard deviation of 60 bp.
The seven ROSE datasets made fragmentary are: 1000L1, 1000L3, 1000L4, 1000M3, 1000S1, 1000S2 and 1000S4. "ROSE-HF.tar.gz" contains HF versions of the seven ROSE datasets. "ROSE-LF.tar.gz" contains LF versions of the seven ROSE datasets.
keywords: ROSE; simulation; fragmentary
published: 2022-03-20
 
Data for "Generic character of charge and spin density waves in superconducting cuprates".
- Neutron scattering data for SDW
- RSXS scans of CDW of LESCO x=0.10, 0.125, 0.15, 0.17, 0.20 at various temperatures
- Temperature dependence of CDW peak intensity, correlation length, Qcdw (Lorentzian fit, S(q,T) fit, Landau-Ginzburg fit)
- XAS data of LESCO x=0.10, 0.125, 0.15, 0.17, 0.20
published: 2020-09-18
 
Restriction site-associated DNA sequencing (RAD-seq) data from 643 Miscanthus accessions from a diversity panel, including 613 Miscanthus sacchariflorus, three M. sinensis, and 27 M. xgiganteus. DNA was digested with PstI and MspI, and single-end Illumina sequencing was performed adjacent to the PstI site. Variant and genotype calling was performed with TASSEL-GBSv2, using the Miscanthus sinensis v7.1 reference genome from Phytozome 12 (https://phytozome.jgi.doe.gov). Additional ploidy-aware genotype calling was performed by polyRAD v1.1.
keywords: variant call format (VCF); genotyping-by-sequencing (GBS); single nucleotide polymorphism (SNP); grass; genetic diversity; biomass
published: 2020-08-01
 
The Empoascini_morph_data.nex text file contains the original data used in the phylogenetic analyses of Xu et al. (Systematic Entomology, in review). The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages, and it will be parsed automatically by programs that recognize NEXUS as a standard bioinformatics file format. The first nine lines of the file indicate the file type (Nexus), that 110 taxa were analyzed, that a total of 99 characters were analyzed, the format of the data, and the symbols used in the dataset to indicate different character states. For species that have more than one state for a particular character, the states are enclosed in square brackets. Question marks represent missing data. The PDF file Appendix1.pdf describes the morphological characters and character states that were scored in the dataset. The data analyses are described in the cited original paper.
keywords: Hemiptera; Cicadellidae; morphology; biogeography; evolution
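As an illustration of the coding conventions described above (square brackets for taxa with more than one state, question marks for missing data), the sketch below decodes one row of character states. It assumes single-character state symbols; the function name `parse_states` and the example row are hypothetical and not taken from the file. A dedicated NEXUS parser is the better choice for real analyses.

```python
def parse_states(row):
    """Decode one taxon's character string into a list of state sets.

    Each entry is a set of state symbols; None marks missing data ('?').
    Assumption: every state is a single character, and multi-state
    scores are written as e.g. [12].
    """
    states = []
    i = 0
    while i < len(row):
        symbol = row[i]
        if symbol == "[":
            end = row.index("]", i)  # raises ValueError if the bracket is unclosed
            states.append(set(row[i + 1:end]))
            i = end + 1
        elif symbol == "?":
            states.append(None)
            i += 1
        else:
            states.append({symbol})
            i += 1
    return states


# Hypothetical row: five characters, the third scored as states 1 and 2.
decoded = parse_states("01[12]?0")
```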
published: 2021-02-28
 
This dataset contains the RegCM4 simulations used in the article "Implementation of dynamic ageing of carbonaceous aerosols in regional climate model RegCM". It was used to investigate the impact of a new ageing parameterisation scheme implemented in the regional climate model RegCM4. The dataset contains two sets of simulations, Expt_fix and Expt_dyn, and consists of the seasonal mean and daily mean values of the variables that were used to create the visualizations of this study. Expt_fix and Expt_dyn contain 34 and 38 NetCDF files, respectively. The CERES_vs_2expts_new.mat file compares the CERES shortwave downward flux at the surface with the same model outputs from the two experiments for clear-sky and all-sky conditions.
--------------------------------------------------
The following information about the dataset was generated on 2021-01-08 by SUDIPTA GHOSH
<b>GENERAL INFORMATION</b>
<i>1. Date of data collection (single date, range, approximate date):</i> 2019-01-01 to 2019-12-31
<i>2. Geographic location of data collection:</i> Urbana-Champaign, Illinois, USA
<i>3. Information about funding sources that supported the collection of the data:</i> This work is supported by the MoEFCC under the NCAP-COALESCE project [Grant No. 14/10/2014-CC]. The first author acknowledges DST-INSPIRE fellowship [IF150055] and Fulbright-Kalam Climate Doctoral fellowship. N. R. acknowledges funding from NSF AGS-1254428 and DOE grant DE-SC0019192. Department of Science and Technology, Funds for Improvement of Science and Technology infrastructure in universities and higher educational institutions (DST-FIST) grant (SR/FST/ESII-016/2014) are acknowledged for the computing support.
<b>DATA & FILE OVERVIEW</b>
<i>1. File List:</i> Expt_fix and Expt_dyn datasets contain the analysed seasonal means and daily means of the variables that have been used to create the visualizations of this study. The Expt_fix and Expt_dyn datasets contain 34 and 38 NetCDF files, respectively.
<i>2. Relationship between files, if important:</i> NA
<i>3. Additional related data collected that was not included in the current data package:</i> No
<b>METHODOLOGICAL INFORMATION</b>
<i>1. Description of methods used for collection/generation of data:</i> The model RegCM4 code is freely available online from <a href="http://gforge.ictp.it/gf/project/regcm/">http://gforge.ictp.it/gf/project/regcm/</a>. The anthropogenic aerosol emissions considered for the simulations are taken from the IIASA inventory. The data used can be accessed online at the <a href="http://clima-dods.ictp.it/regcm4/">http://clima-dods.ictp.it/regcm4/</a> website. TRMM observed precipitation data can be accessed from the <a href="https://giovanni.gsfc.nasa.gov/giovanni/">https://giovanni.gsfc.nasa.gov/giovanni/</a> website. CRU temperature data are available at <a href="https://crudata.uea.ac.uk/cru/data/hrg/">https://crudata.uea.ac.uk/cru/data/hrg/</a>. CERES satellite surface shortwave downward fluxes are available at the <a href="https://ceres.larc.nasa.gov/data/">https://ceres.larc.nasa.gov/data/</a> website. Input files for the RegCM4 model are archived at the <a href="http://clima-dods.ictp.it/regcm4/">http://clima-dods.ictp.it/regcm4/</a> website. This dataset contains the RegCM4 simulations used in the article "Implementation of dynamic ageing of carbonaceous aerosols in regional climate model RegCM". The output data consist of two sets of simulations: Expt_fix and Expt_dyn. This dataset only contains the analysed seasonal mean and daily mean of the variables that have been used to create the visualizations of this study. Expt_fix and Expt_dyn contain 34 and 38 NetCDF files, respectively. This dataset was used to investigate the impact of a new ageing parameterisation scheme implemented in the regional climate model RegCM4.
<i>2. Methods for processing the data:</i> Seasonal mean and daily average values were extracted from 6-hourly model output.
<i>3. Instrument- or software-specific information needed to interpret the data:</i> CDO-1.7.1, Grads-2.0.a9, Matlab2016b
<i>4. Standards and calibration information, if appropriate:</i> NA
<i>5. Environmental/experimental conditions:</i> NA
<i>6. Describe any quality-assurance procedures performed on the data:</i> NA
<i>7. People involved with sample collection, processing, analysis and/or submission:</i> Sudipta Ghosh, Nicole Riemer, Graziano Giuliani, Filippo Giorgi, Dilip Ganguly, Sagnik Dey
<b>DATA-SPECIFIC INFORMATION FOR: Expt_fix_data.tar.gz</b>
<i>1. Number of variables:</i> 29
<i>2. Number of cases/rows:</i> NA
<i>3. Variable List:</i> Mass concentration (Kg m-3) of BC, BC_HB, BC_HL, OC, OC_HB, OC_HL; Columnar burden (mg m-2) of BC, BC_HL, BC_HB, OC; Dry deposition flux (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to washout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to rainout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; AOD (unit less); precipitation (Kg m-2 s-1); temperature (K); v-wind (m s-1); u-wind (m s-1); Surface shortwave downward flux (W m-2); Shortwave radiative forcing at the surface and top of atmosphere (W m-2)
<b>DATA-SPECIFIC INFORMATION FOR: Expt_dyn_data.tar.gz</b>
<i>1. Number of variables:</i> 30
<i>2. Number of cases/rows:</i> NA
<i>3. Variable List:</i> Mass concentration (Kg m-3) of BC, BC_HB, BC_HL, OC, OC_HB, OC_HL; Columnar burden (mg m-2) of BC, BC_HL, BC_HB, OC; Dry deposition flux (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to washout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to rainout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; AOD (unit less); precipitation (Kg m-2 s-1); temperature (K); v-wind (m s-1); u-wind (m s-1); Surface shortwave downward flux (W m-2); Shortwave radiative forcing at the surface and top of atmosphere (W m-2); ageingscale (s-1)
<b>DATA-SPECIFIC INFORMATION FOR: CERES_vs_2expts_new.mat</b>
<i>1. Number of variables:</i> 12
<i>2. Number of cases/rows:</i> NA
<i>3. Variable List:</i> Surface shortwave downward flux for clear sky (W m-2) for CERES, Expt_fix, Expt_dyn (for winter JF and monsoon JJAS seasons); Surface shortwave downward flux for all sky conditions (W m-2) for CERES, Expt_fix, Expt_dyn (for winter JF and monsoon JJAS seasons).
<b>NOTE:</b> The following information applies to all three (3) files:
<i>Missing data codes:</i> NA
<i>Specialized formats or other abbreviations used:</i> NA
keywords: Carbonaceous aerosols; ageing parameterisation scheme; regional climate model; NetCDF
published: 2021-08-05
 
This geodatabase serves two purposes: 1) to provide State of Illinois agencies with a fast resource for the preparation of maps and figures that require the use of shape or line files from federal agencies, the State of Illinois, or the City of Chicago, and 2) as a start for social scientists interested in exploring how geographic information systems (whether this is data visualization or geographically weighted regression) can bring new meaning to the interpretation of their data. All layer files included are relevant to the State of Illinois. Sources for this geodatabase include the U.S. Census Bureau, U.S. Geological Survey, City of Chicago, Chicago Public Schools, Chicago Transit Authority, Regional Transportation Authority, and Bureau of Transportation Statistics.
keywords: State of Illinois; City of Chicago; Chicago Public Schools; GIS; Statistical tabulation areas; hydrography
published: 2021-03-08
 
In a set of field studies across four years, the effect of self-shading on photosynthetic performance in lower canopy sorghum leaves was studied at sites in Champaign County, IL. Photosynthetic parameters in upper and lower canopy leaves (carbon assimilation, electron transport, stomatal conductance, and activity of three C4-specific photosynthetic enzymes) were compared within a genetically diverse range of accessions varying widely in canopy architecture and thereby in the degree of self-shading. Accessions with erect leaves and high light transmission through the canopy are henceforth referred to as ‘erectophile’ and those with low leaf erectness, ‘planophile’. In the final year of the study, bundle sheath leakiness in erectophile and planophile accessions was also compared.
keywords: Sorghum; Photosynthetic Performance; Leaf Inclination
published: 2019-09-17
 
Trained models for multi-task multi-dataset learning for text classification as well as sequence tagging in tweets. Classification tasks include sentiment prediction, abusive content, sarcasm, and veridictality. Sequence tagging tasks include POS, NER, Chunking, and SuperSenseTagging. Models were trained using: <a href="https://github.com/socialmediaie/SocialMediaIE/blob/master/SocialMediaIE/scripts/multitask_multidataset_classification_tagging.py">https://github.com/socialmediaie/SocialMediaIE/blob/master/SocialMediaIE/scripts/multitask_multidataset_classification_tagging.py</a>. See <a href="https://github.com/socialmediaie/SocialMediaIE">https://github.com/socialmediaie/SocialMediaIE</a> and <a href="https://socialmediaie.github.io">https://socialmediaie.github.io</a> for details. If you are using this data, please also cite the related article: Shubhanshu Mishra. 2019. Multi-dataset-multi-task Neural Sequence Tagging for Information Extraction from Tweets. In Proceedings of the 30th ACM Conference on Hypertext and Social Media (HT '19). ACM, New York, NY, USA, 283-284. DOI: https://doi.org/10.1145/3342220.3344929
keywords: twitter; deep learning; machine learning; trained models; multi-task learning; multi-dataset learning; classification; sequence tagging
published: 2020-08-19
 
This data set is a matrix of values. The element in row "i" and column "j" denotes the influence of a hexagonal pyramidal distribution at node "i" on node "j". The size of the matrix is 16641x16641, which corresponds to a 129x129 grid. The influence coefficient matrix for a smaller grid can be obtained by appropriately choosing elements from the bigger matrix.
keywords: Influence coefficients
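One way the selection of elements for a smaller grid might look is sketched below. It rests on two assumptions that the description does not confirm: that nodes are numbered row by row (row-major) on the 129x129 grid, and that the smaller grid is a contiguous corner block with the same node spacing. The function names are hypothetical, and a tiny 5x5 grid stands in for the full matrix to keep the example light.

```python
def subgrid_indices(n, m):
    """Flat indices of the top-left m x m block of an n x n grid.

    Assumption: node (row, col) has flat index row * n + col.
    """
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= n")
    return [row * n + col for row in range(m) for col in range(m)]


def submatrix(full, n, m):
    """Pick the rows and columns of `full` that belong to the m x m block."""
    idx = subgrid_indices(n, m)
    return [[full[i][j] for j in idx] for i in idx]


# Stand-in data: a 25x25 matrix for a 5x5 grid, reduced to a 3x3 grid.
n, m = 5, 3
full = [[float(i * n * n + j) for j in range(n * n)] for i in range(n * n)]
small = submatrix(full, n, m)
```

With n = 129 the full matrix has 129 * 129 = 16641 rows and columns, matching the stated size; at that scale an array library would be a more practical container than nested lists.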
published: 2024-03-01
 
This dataset contains model output from the Community Earth System Model, Version 1 (CESM1; Hurrell et al., 2013) and variables from the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5; Hersbach et al., 2020). These data were used for analysis in “The location of large-scale soil moisture anomalies affects moisture transport and precipitation over southeastern South America”, published in Geophysical Research Letters. Acknowledgments: This work was supported by NSF Award AGS-1852709. We acknowledge high-performance computing support from Cheyenne (doi:10.5065/D6RX99HX) provided by NCAR's Computational and Information Systems Laboratory, sponsored by the NSF. We thank Dr. Haiyan Teng for providing guidance on setting up the CESM experiments and offering valuable advice. References: Hersbach H, Bell B, Berrisford P, et al. The ERA5 global reanalysis. Q J R Meteorol Soc. 2020; 146: 1999–2049. https://doi.org/10.1002/qj.3803 Hurrell, J. W., and Coauthors, 2013: The Community Earth System Model: A Framework for Collaborative Research. Bull. Amer. Meteor. Soc., 94, 1339–1360, https://doi.org/10.1175/BAMS-D-12-00121.1
keywords: atmospheric sciences; climate modeling; land-atmosphere interactions; soil moisture; regional atmospheric circulation; southeastern South America
published: 2020-07-15
 
This repository includes scripts and datasets for the paper, "Polynomial-Time Statistical Estimation of Species Trees under Gene Duplication and Loss."
keywords: Species tree estimation; gene duplication and loss; identifiability; statistical consistency; quartets; ASTRAL
published: 2020-05-31
 
This repository includes a simulated dataset and related scripts used for the paper "Moss: Accurate Single-Nucleotide Variant Calling from Multiple Bulk DNA Tumor Samples".
keywords: Somatic Mutations; Bulk DNA Sequencing; Cancer Genomics
published: 2020-04-20
 
Supplemental data sets for the manuscript entitled "Contribution of fungal and invertebrate communities to mass loss and wood depolymerization in tropical terrestrial and aquatic habitats".
keywords: Coiba Island; wood decomposition; cellulose; hemicellulose; lignin breakdown; aquatic fungi
published: 2020-06-19
 
This dataset includes data pulled from the World Bank (2009), the World Values Survey wave 6, and Transparency International (2009). The data were used to measure perceptions of expertise from individuals in nations that are recipients of development aid as measured by the World Bank.
keywords: World Values Survey; World Bank; expertise; development