Displaying 401 - 425 of 694 in total
Subject Area
Funder
Publication Year
License
Illinois Data Bank Dataset Search Results

Dataset Search Results

published: 2023-03-27
 
This dataset contains the full data used in the paper titled "Enabling High Precision Gradient Index Control in Subsurface Multiphoton Lithography," available at https://doi.org/10.1021/acsphotonics.2c01950 . The data used for Table 1 can be found in the dataset for the related Figure 8. Some supplemental figures' data can be found in the main figures data: Figure S2's data is contained in Figure 6. Figure S4 and Table S1 data is derived from Figure 6. Figure S9 is derived from Figure 7. Figure S10 is contained in Figure 7. Figure S12 is derived from Figure 6 and the Python code prism-fringe-analysis. Figures without a data file named after them do not have any data affiliated with them and are purely graphical representations.
published: 2020-05-31
 
This repository includes a simulated dataset and related scripts used for the paper "Moss: Accurate Single-Nucleotide Variant Calling from Multiple Bulk DNA Tumor Samples".
keywords: Somatic Mutations; Bulk DNA Sequencing; Cancer Genomics
published: 2020-04-20
 
Supplemental data sets for the Manuscript entitled "Contribution of fungal and invertebrate communities to mass loss and wood depolymerization in tropical terrestrial and aquatic habitats"
keywords: Coiba Island; wood decomposition; cellulose; hemicellulose; lignin breakdown; aquatic fungi
published: 2020-06-19
 
This dataset include data pulled from the World Bank 2009, the World Values Survey wave 6, Transparency International from 2009. The data were used to measure perceptions of expertise from individuals in nations that are recipients of development aid as measured by the World Bank.
keywords: World Values Survey; World Bank; expertise; development
published: 2024-03-09
 
Hype - PubMed dataset Prepared by Apratim Mishra This dataset captures ‘Hype’ within biomedical abstracts sourced from PubMed. The selection chosen is ‘journal articles’ written in English, published between 1975 and 2019, totaling ~5.2 million. The classification relies on the presence of specific candidate ‘hype words’ and their abstract location. Therefore, each article might have multiple instances in the dataset due to the presence of multiple hype words in different abstract sentences. The candidate hype words are 36 in count: 'major', 'novel', 'central', 'critical', 'essential', 'strongly', 'unique', 'promising', 'markedly', 'excellent', 'crucial', 'robust', 'importantly', 'prominent', 'dramatically', 'favorable', 'vital', 'surprisingly', 'remarkably', 'remarkable', 'definitive', 'pivotal', 'innovative', 'supportive', 'encouraging', 'unprecedented', 'bright', 'enormous', 'exceptional', 'outstanding', 'noteworthy', 'creative', 'assuring', 'reassuring', 'spectacular', and 'hopeful'. File 1: hype_dataset.csv Primary dataset. It has the following columns: 1. PMID: represents unique article ID in PubMed 2. Hype_word: Candidate hype word, such as ‘novel.’ 3. Sentence: Sentence in abstract containing the hype word. 4. Abstract_length: Length of article abstract. 5. Hype_percentile: Abstract relative position of hype word. 6. Hype_value: Propensity of hype based on the hype word, the sentence, and the abstract location. 7. Introduction: The ‘I’ component of the hype word based on IMRaD 8. Methods: The ‘M’ component of the hype word based on IMRaD 9. Results: The ‘R’ component of the hype word based on IMRaD 10. Discussion: The ‘D’ component of the hype word based on IMRaD File 2: hype_removed_phrases.csv Secondary dataset with same columns as File 1. Hype in the primary dataset is based on excluding certain phrases that are rarely hype. The phrases that were removed are included in File 2 and modeled separately. Removed phrases: 1. Major: histocompatibility, component, protein, metabolite, complex, surgery 2. Novel: assay, mutation, antagonist, inhibitor, algorithm, technique, series, method, hybrid 3. Central: catheters, system, design, composite, catheter, pressure, thickness, compartment 4. Critical: compartment, micelle, temperature, incident, solution, ischemia, concentration 5. Essential: medium, features, properties, opportunities 6. Unique: model, amino 7. Robust: regression 8. Vital: capacity, signs, organs, status, structures, staining, rates, cells, information 9. Outstanding: questions, issues, question, challenge, problems, problem, remains 10. Remarkable: properties 11. Definite: radiotherapy, surgery 12. Bright: field
keywords: Hype; PubMed; Abstracts; Biomedicine
published: 2020-05-17
 
Models and predictions for submission to TRAC - 2020 Second Workshop on Trolling, Aggression and Cyberbullying Our approach is described in our paper titled: Mishra, Sudhanshu, Shivangi Prasad, and Shubhanshu Mishra. 2020. “Multilingual Joint Fine-Tuning of Transformer Models for Identifying Trolling, Aggression and Cyberbullying at TRAC 2020.” In Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying (TRAC-2020). The source code for training this model and more details can be found on our code repository: https://github.com/socialmediaie/TRAC2020 NOTE: These models are retrained for uploading here after our submission so the evaluation measures may be slightly different from the ones reported in the paper.
keywords: Social Media; Trolling; Aggression; Cyberbullying; text classification; natural language processing; deep learning; open source;
published: 2023-06-01
 
This dataset contains four real-world sub-datasets with data embedded into Poincare ball models, including Olsson's single-cell RNA expression data, CIFAR10, Fashion-MNIST and mini-ImageNet. Each sub-dataset has two corresponding files: one is the data file, the other one is the pre-computed reference points for each class in the sub-dataset. Please refer to our paper (https://arxiv.org/pdf/2109.03781.pdf) and codes (https://github.com/thupchnsky/PoincareLinearClassification) for more details.
keywords: Hyperbolic space; Machine learning; Poincare ball models; Perceptron algorithm; Support vector machine
published: 2022-08-31
 
This dataset includes data on soil properties, soil N pools, and soil N fluxes presented in the manuscript, "Refining the role of nitrogen mineralization in mycorrhizal nutrient syndromes". Please refer to that publication for details about methodologies used to generate these data and for the experimental design. For this verison 2, we added specific gross nitrogen mineralization rates (ugN/gOM/d), microbial biomass carbon (ugC/gdw), microbial biomass nitrogen (ugN/gdw) and microbial biomass C:N ratios to the newest version of the data set. Additionally, we updated values for gross nitrogen mineralization, microbial NO3 assimilation and microbial NH4 assimilation to reflect slight changes in data processing. Those changes are reflected in "220829_All data_repository.csv". "220829_nitrogen_mineralization_readme.txt " is updated readme for the new file. The other 2 files begin with “220426_” are older version and same as in V1.
keywords: Nitrogen cycling; Ectomycorrhizal fungi; Arbuscular mycorrhizal fungi; Nitrogen fertilization; Gross mineralization
published: 2023-07-01
 
This is the data used in the paper "Assessment of spatiotemporal flood risk due to compound precipitation extremes across the contiguous United States". Code from the Github repository https://github.com/adtonks/precip_extremes can be used with the data here to reproduce the paper's results. v1.0.0 of the code is also archived at https://doi.org/10.5281/zenodo.8104252 This dataset is derived from NOAA-CIRES-DOE 20th Century Reanalysis V3. The NOAA-CIRES-DOE Twentieth Century Reanalysis Project version 3 used resources of the National Energy Research Scientific Computing Center managed by Lawrence Berkeley National Laboratory which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 and used resources of NOAA's Remotely Deployed High Performance Computing Systems.
keywords: spatiotemporal; CONUS; United States; precipitation; extremes; flooding
published: 2022-05-20
 
This dataset includes images and annotated counts for 150 airborne pollen samples from the Center for Tropical Forest Science 50 ha forest dynamics plot on Barro Colorado Island, Panama. Samples were collected once a year from April 1994 to June 2010.
keywords: aerial pollen traps; automated pollen identification; Barro Colorado Island; convolutional neural networks; Neotropics; palynology; phenology
published: 2020-08-01
 
This data set shows how density effects have an important influence on mixing at a small river confluence. The data consist of results of simulations using a detached eddy simulation model.
keywords: confluence; flow dynamics; density effects
published: 2011-09-20
 
This page provides the data for SuperFine, DACTAL, and BeeTLe publications. - Swenson, M. Shel, et al. "SuperFine: fast and accurate supertree estimation." Systematic biology 61.2 (2012): 214. - Nguyen, Nam, Siavash Mirarab, and Tandy Warnow. "MRL and SuperFine+ MRL: new supertree methods." Algorithms for Molecular Biology 7 (2012): 1-13. - Neves, Diogo Telmo, et al. "Parallelizing superfine." Proceedings of the 27th Annual ACM Symposium on Applied Computing. 2012. - Nelesen, Serita, et al. "DACTAL: divide-and-conquer trees (almost) without alignments." Bioinformatics 28.12 (2012): i274-i282. - Liu, Kevin, and Tandy Warnow. "Treelength optimization for phylogeny estimation." PLoS One 7.3 (2012): e33104.
published: 2019-12-20
 
This dynamic photosynthesis model of soybean canopy is developed by Yu Wang (yuwangcn@illinois.edu), IGB, University of Illinois. If you want to know more details, please check the following publication Yu Wang, Steven J. Burgess, Elsa de Becker, Stephen P. Long. Photosynthesis in the fleeting shadows: An overlooked opportunity for increasing crop productivity? The Plant Journal.
keywords: Matlab; Soybean canopy; photosynthesis model
published: 2020-03-13
 
Data files associated with the assembly of mitochondrial minicircles from five species of parasitic lice. This includes data from four species in the genus Columbicola and from the human louse (Pediculus humanus). The files include FASTA sequences for all five species, reference sequences for read mapping approaches, resulting contigs produced by various assembly approaches, and alignments of human louse minicircles mapped to published sequences of the same species.
keywords: mitochondria; FASTA; nucleotide sequences; alignment; Columbicola; Pediculus
published: 2021-09-06
 
Airglow images and Meteor radar data used in the paper "Mesospheric gravity wave activity estimated via airglow imagery, multistatic meteor radar, and SABER data taken during the SIMONe–2018 campaign".
keywords: airglow; meteor radar; gravity waves; momentum flux;
published: 2021-10-15
 
This is the 5 states 5000 cells synthetic expression file we used for validation of SimiC, a single cell gene regulatory network inference method with similarity constraints. Ground truth GRNs are stored in Numpy array format, and expression profiles of all states combined are stored in Pandas DataFrame in format of Pickle files.
keywords: Numpy array; GRNs; Pandas DataFrame;
published: 2016-05-16
 
This dataset contains the protein sequences and trees used to compare Non-Ribosomal Peptide Synthetase (NRPS) condensation domains in the AMB gene cluster and was used to create figure S1 in Rojas et al. 2015. Instead of having to collect representative sequences independently, this set of condensation domain sequences may serve as a quick reference set for coarse classification of condensation domains.
keywords: NRPS; biosynthetic gene cluster; antimetabolite; Pseudomonas; oxyvinylglycine; secondary metabolite; thiotemplate; toxin
published: 2020-08-25
 
The Allan Lab has published a Fluidigm pipeline online. This is the url: https://github.com/HPCBio/allan-fluidigm-pipeline. This url includes a tutorial for running the pipeline. However it does not have test datasets yet. This tarball hosted at the Illinois Data Bank is the dataset that completes the github tutorial. It includes inputs (custom database of tick pathogens and fluidigm raw reads) and output files (tables of samples with taxonomic classifications).
keywords: custom database of tick pathogens; fluidigm pipeline; fluidigm paired reads; fluidigm tutorial
published: 2019-09-17
 
BAM files for evolved strains from migration rate selection experiments conducted in low viscosity (0.2% w/v) agar plates containing M63 minimal medium with 1mM of mannose, melibiose, N-acetylglucosamine or galactose
published: 2023-01-12
 
These processing and Pearson correlational scripts were developed to support the study that examined the correlational relationships between local journal authorship, local and external citation counts, full-text downloads, link-resolver clicks, and four global journal impact factor indices within an all-disciplines journal collection of 12,200 titles and six subject subsets at the University of Illinois at Urbana-Champaign (UIUC) Library. This study shows strong correlations in the all-disciplines set and most subject subsets. Special processing scripts and web site dashboards were created, including Pearson correlational analysis scripts for reading values from relational databases and displaying tabular results. The raw data used in this analysis, in the form of relational database tables with multiple columns, is available at <a href="https://doi.org/10.13012/B2IDB-6810203_V1">https://doi.org/10.13012/B2IDB-6810203_V1</a>.
keywords: Pearson Correlation Analysis Scripts; Journal Publication; Citation and Usage Data; University of Illinois at Urbana-Champaign Scholarly Communication
published: 2022-09-29
 
3DIFICE: 3-dimensional Damage Imposed on Frame structures for Investigating Computer vision-based Evaluation methods This dataset contains 1,396 synthetic images and label maps with various types of earthquake damage imposed on reinforced concrete frame structures. Damage includes: cracking, spalling, exposed transverse rebar, and exposed longitudinal rebar. Each image has an associated label map that can be used for training machine learning algorithms to recognize the various types of damage.
keywords: computer vision; earthquake engineering; structural health monitoring; civil engineering; structural engineering;
published: 2019-09-25
 
<sup>12</sup>CO and <sup>13</sup>CO maps for six molecular clouds in the Large Magellanic Cloud, obtained with the Atacama Large Millimeter/submillimeter Array (ALMA). See the associated article in the Astrophysical Journal, and README files within each ZIP archive. Please cite the article if you use these data.
keywords: Radio astronomy
published: 2018-06-20
 
The dataset includes the data used in the study of Classical Topological Order in the Kinetics of Artificial Spin Ice. This includes the photoemission electron microscopy intensity measurement of artificial spin ice at different temperatures as a function of time. The data includes the raw data, the metadata, and the data cookbook. Please refer to the data cookbook for more information. Note: vertex_population.xlsx file in the meta_data_code folder can be disregarded.
keywords: artificial spin ice; PEEM; topological order
published: 2019-05-20
 
This is the experimental data of tetris artificial spin ice. The islands are made of Permalloy materials with size of 170 nm by 470 nm by 2.5 nm. The systems are measured at a temperature where the islands are fluctuating around room temperature. The data is recorded as photoemission electron microscopy intensity. More details about the dataset can be found in the file Note.txt and Tetris_data_list.xlsx Note: 2 files name bl11_teris600_033 and bl11_tetris600_2_135 are not recorded in the excel sheet because they are corrupted during the measurement. Any data that is not recorded in the excel sheet is either corrupted or of low quality. From files *_028 to *_049, tetris is spelled with “t” while in the raw data folder without “t”. This is a typo. Throughout the dataset, tetris and teris are supposed to have the same meaning.
keywords: artificial spin ice