Displaying 1 - 25 of 713 in total
Subject Area
Funder
Publication Year
License
Illinois Data Bank Dataset Search Results

Dataset Search Results

published: 2023-07-14
 
This dataset includes a total of 300 images of 45 extant species of Podocarpus (Podocarpaceae) and nine images of fossil specimens of the morphogenus Podocarpidites. The goal of this dataset is to capture the diversity of morphology within the genus and create an image database for training machine learning models. The images were taken using Airyscan confocal superresolution microscopy at 630x magnification (63x/NA 1.4 oil DIC). The images are in the CZI file format. They can be opened using Zeiss propriety software (Zen, Zen lite) or open microscopy software, such as ImageJ. More information on how to open CZI files can be found here: [https://www.zeiss.com/microscopy/us/products/software/zeiss-zen/czi-image-file-format.html] Please cite this dataset and listed publications when using these images.
keywords: optical superresolution microscopy; Zeiss Airyscan; CZI images; conifer; saccate pollen; Podocarpus; Podocarpidites; Smithsonian Tropical Research Institute
published: 2025-01-17
 
This is the data set for a publication titled, "Coupling carbon dioxide gas within a bubble curtain enhances its effectiveness to deter fish." The current study sought to quantify whether adding carbon dioxide gas (CO2) to a bubble curtain would enhance its efficacy to block fish. For this, a choice tank was outfitted with bubble curtains infused with either compressed air alone, or with two different concentrations of CO2 [30 or 100 mg/L]. Passage rates and position of common carp (an invasive Cyprinid) and black bullhead (a native Ictalurid) exposed to these treatments were compared. The data set consists of data from each of the experiments performed during the study.
keywords: invasive species; multimodal barriers; deterrents; biodiversity; species range; distribution
published: 2024-02-27
 
Coups d'Ètat are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup d’État Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e., realized, unrealized, or conspiracy) the type of actor(s) who initiated the coup (i.e., military, rebels, etc.), as well as the fate of the deposed leader. Version 2.1.3 adds 19 additional coup events to the data set, corrects the date of a coup in Tunisia, and reclassifies an attempted coup in Brazil in December 2022 to a conspiracy. Version 2.1.2 added 6 additional coup events that occurred in 2022 and updated the coding of an attempted coup event in Kazakhstan in January 2022. Version 2.1.1 corrected a mistake in version 2.1.0, where the designation of “dissident coup” had been dropped in error for coup_id: 00201062021. Version 2.1.1 fixed this omission by marking the case as both a dissident coup and an auto-coup. Version 2.1.0 added 36 cases to the data set and removed two cases from the v2.0.0 data. This update also added actor coding for 46 coup events and added executive outcomes to 18 events from version 2.0.0. A few other changes were made to correct inconsistencies in the coup ID variable and the date of the event. Version 2.0.0 improved several aspects of the previous version (v1.0.0) and incorporated additional source material to include: • Reconciling missing event data • Removing events with irreconcilable event dates • Removing events with insufficient sourcing (each event needs at least two sources) • Removing events that were inaccurately coded as coup events • Removing variables that fell below the threshold of inter-coder reliability required by the project • Removing the spreadsheet ‘CoupInventory.xls’ because of inadequate attribution and citations in the event summaries • Extending the period covered from 1945-2005 to 1945-2019 • Adding events from Powell and Thyne’s Coup Data (Powell and Thyne, 2011) <br> <b>Items in this Dataset</b> 1. <i>Cline Center Coup d'État Codebook v.2.1.3 Codebook.pdf</i> - This 15-page document describes the Cline Center Coup d’État Project dataset. The first section of this codebook provides a summary of the different versions of the data. The second section provides a succinct definition of a coup d’état used by the Coup d'État Project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The third section describes the methodology used to produce the data. <i>Revised February 2024</i> 2. <i>Coup Data v2.1.3.csv</i> - This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup d’État Project. It contains 29 variables and 1000 observations. <i>Revised February 2024</i> 3. <i>Source Document v2.1.3.pdf</i> - This 325-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify that particular event. <i>Revised February 2024</i> 4. <i>README.md</i> - This file contains useful information for the user about the dataset. It is a text file written in markdown language. <i>Revised February 2024</i> <br> <b> Citation Guidelines</b> 1. To cite the codebook (or any other documentation associated with the Cline Center Coup d’État Project Dataset) please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Scott Althaus. 2024. “Cline Center Coup d’État Project Dataset Codebook”. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.3. February 27. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V7 2. To cite data from the Cline Center Coup d’État Project Dataset please use the following citation (filling in the correct date of access): Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Emilio Soto. 2024. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.3. February 27. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V7
published: 2025-01-06
 
The complete data for the publication "RNA helicase MOV10 suppresses fear memory and dendritic arborization and regulates microtubule dynamics in hippocampal neurons," excluding sequencing data deposited in GEO, is provided here.
keywords: MOV10; NUMA1; hippocampal neurons; behavior; cytoskeleton; tiff; czi; dv; mp4; mpg; ndpi; csv; xlsx; R
published: 2025-01-15
 
Data was generated from acoustic transmitters implanted in tournament caught and non-angled control largemouth bass across multiple seasons. This data was used to quantify post-release movement, behavior, and mortality in response to angling tournaments at different times of year and varying water temperatures.
published: 2023-05-02
 
Tab-separated value (TSV) file. 14745 data rows. Each data row represents publication metadata as retrieved from Crossref (http://crossref.org) 2023-04-05 when searching for retracted publications. Each row has the following columns: Index - Our index, starting with 0. DOI - Digital Object Identifier (DOI) for the publication Year - Publication year associated with the DOI. URL - Web location associated with the DOI. Title - Title associated with the DOI. May be blank. Author - Author(s) associated with the DOI. Journal - Publication venue (journal, conference, ...) associated with the DOI RetractionYear - Retraction Year associated with the DOI. May be blank. Category - One or more categories associated with the DOI. May be blank. Our search was via the Crossref REST API and searched for: Update_type=( 'retraction', 'Retraction', 'retracion', 'retration', 'partial_retraction', 'withdrawal','removal')
keywords: retraction; metadata; Crossref; RISRS
published: 2020-04-22
 
Data on Croatian restaurant allergen disclosures on restaurant websites, on-line menus and social media comments
keywords: restaurant; allergen; disclosure; tourism
published: 2016-05-19
 
This dataset contains records of four years of taxi operations in New York City and includes 697,622,444 trips. Each trip records the pickup and drop-off dates, times, and coordinates, as well as the metered distance reported by the taximeter. The trip data also includes fields such as the taxi medallion number, fare amount, and tip amount. The dataset was obtained through a Freedom of Information Law request from the New York City Taxi and Limousine Commission. The files in this dataset are optimized for use with the ‘decompress.py’ script included in this dataset. This file has additional documentation and contact information that may be of help if you run into trouble accessing the content of the zip files.
keywords: taxi;transportation;New York City;GPS
published: 2024-11-12
 
This is the data set for the article entitled "Pollinator seed mixes are phenologically dissimilar to prairie remnants," a manuscript pending publication in Restoration Ecology. This represents the core phenology data of prairie remnant and pollinator seed mixes that were used for the main analyses. Note that additional data associated with the manuscript are intended to be published as a supplement in the journal.
keywords: native plants; ecological restoration; tallgrass prairie; native plant materials
published: 2024-08-16
 
Dataset used for the paper entitled "Morphological differences between wild and game-farm Mallards in North America". Large-scale releases of domesticated, game-farm Mallards to supplement wild populations have resulted in wide-spread introgressive hybridization that changed the genetic constitution of wild populations in eastern North America. The resulting gene flow is well-documented between game-farm and wild Mallards, but the mechanistic consequences from such interactions remain unknown in North America. We provide the first study to characterize and investigate potential differences in morphology between genetically known, wild and game-farm Mallards in North America. We used nine morphological measurements to discriminate between wild and game-farm Mallards with 96% accuracy. Compared to their wild counterparts, game-farm Mallards had longer bodies and tarsi, shorter heads and wings, and shorter, wider, and taller bills. The nail on the end of the bill of game-farm Mallards was longer, and game-farm Mallard bills had a greater lamellae:bill length ratio than wild Mallards. Differences in body morphologies between wild and game-farm Mallards are consistent with an artificial, terrestrial life whereby game-farm Mallards are fed pelleted foods resulting in artificial selection for a more “goose-like” bill. We posit that 1) game-farm Mallards have diverged from their wild ancestral traits of flying and filter feeding towards becoming optimized to run and peck for food; 2) game-farm morphological traits optimized over the last 400 years in domestic environments are likely to be maladaptive in the wild; and 3) the introgression of such traits into wild populations is likely to reduce fitness. Understanding effects of game-farm Mallard introgression requires analysis of various game-farm × wild hybrid generations to determine how domestically-derived traits persist or diminish with each generation.
keywords: Mallard; Game Farm; Morphology; Waterfowl; Duck
published: 2023-07-01
 
This is the data used in the paper "Assessment of spatiotemporal flood risk due to compound precipitation extremes across the contiguous United States". Code from the Github repository https://github.com/adtonks/precip_extremes can be used with the data here to reproduce the paper's results. v1.0.0 of the code is also archived at https://doi.org/10.5281/zenodo.8104252 This dataset is derived from NOAA-CIRES-DOE 20th Century Reanalysis V3. The NOAA-CIRES-DOE Twentieth Century Reanalysis Project version 3 used resources of the National Energy Research Scientific Computing Center managed by Lawrence Berkeley National Laboratory which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 and used resources of NOAA's Remotely Deployed High Performance Computing Systems.
keywords: spatiotemporal; CONUS; United States; precipitation; extremes; flooding
published: 2023-07-05
 
This dataset contains all data used in the paper "Impact of genotype-calling methodologies on genome-wide association and genomic prediction in polyploids". The dataset includes genotypes and phenotypic data from two autotetraploid species Miscanthus sacchariflorus and Vaccinium corymbosum that was used used for genome wide association studies and genomic prediction and the scripts used in the analysis. In this V2, 2 files have the raw data are added: "Miscanthus_sacchariflorus_RADSeq.vcf" is the VCF file with the raw SNP calls of the Miscanthus sacchariflorus data used for genotype calling using the 6 genotype calling methods. "Blueberry_data_read_depths.RData" is the a RData file with the read depth data that was used for genotype calling in the Blueberry dataset.
keywords: Polyploid; allelic dosage; Bayesian genotype-calling; Genome-wide association; Genomic prediction
published: 2023-07-11
 
The dissertation_demo.zip contains the base code and demonstration purpose for the dissertation: A Conceptual Model for Transparent, Reusable, and Collaborative Data Cleaning. Each chapter has a demo folder for demonstrating provenance queries or tools. The Airbnb dataset for demonstration and simulation is not included in this demo but is available to access directly from the reference website. Any updates on demonstration and examples can be found online at: https://github.com/nikolausn/dissertation_demo
published: 2023-10-22
 
HGT+ILS datasets from Davidson, R., Vachaspati, P., Mirarab, S., & Warnow, T. (2015). Phylogenomic species tree estimation in the presence of incomplete lineage sorting and horizontal gene transfer. BMC genomics, 16(10), 1-12. Contains model species trees, true and estimated gene trees, and simulated alignments.
keywords: evolution; computational biology; bioinformatics; phylogenetics
published: 2024-04-05
 
The following files include specimen information, DNA sequence data, and additional information on the analyses used to reconstruct the phylogeny of the leafhopper genus Neoaliturus as described in the Methods section of the original paper: 1. Taxon_sampling.csv: contains data on the individual specimens from which DNA was extracted, including sample code, taxon name, collection data (locality, date and name of collector) and museum unique identifier. 2. Alignments.zip: a ZIP archive containing 432 separate FASTA files representing the aligned nucleotide sequences of individual gene loci used in the analysis. 3. Concatenated_Matrix.fa: is a FASTA file containing the concatenated individual gene alignments used for the maximum likelihood analysis in IQ-TREE. 4. Genes_and_Loci.rtf: identifies the individual genes and loci used in the analysis. The partition name is the same as the name of the individual alignment file in the zipped Alignments folder. 5. Partitions_best_scheme.nex: is a text file in the standard NEXUS format that indicates the names of the individual data partitions and their locations in the concatenated matrix, and also indicates the substitution model for each partition. 6. (New in this version 2) Scripts & Description.zip includes 8 custom shell or perl scripts used to assemble the DNA sequence data by perform reciprocal blast searches between the reference sequences and assemblies for each sample, extract the best sequences based on the blast searches, screen the hits for each locus and keep only the best result, and generate the nucleotide sequence dataset for the predicted orthologues (see the file description.txt for details). 7. (New in this version 2) Full_genetic_distances_matrix.csv shows the genetic distances between pairs of samples in the datset (proportion of nucleotides that differ between samples).
keywords: leafhopper; phylogeny; anchored-hybrid-enrichment; DNA sequence; insect
planned publication date: 2025-04-24
 
These are the datasets underlying the figures in the manuscript "Methods of active surveillance for hard ticks and associated tick-borne pathogens of public health importance in the contiguous United States: A Comprehensive Systematic Review". The review considered only publications reporting on active tick or tick-borne pathogen surveillance in the contiguous United States published between 1944 and 2018. For the purposes of this review, we were only concerned with studies of Ixodidae (hard ticks) and/or studies of tick-borne pathogens (in humans, animals, or hard ticks) of public health importance to humans. Study designs included cross-sectional, serological, epidemiological, ecological, or observational studies. Only peer-reviewed publications published in the English language were included. Studies were excluded if they focused on a tick that is not a vector of a human pathogen or on a pathogen that does not cause disease in humans, if the tick or tick-borne pathogen findings were incidental, or if they did not include quantitative surveillance data. For the purpose of this study, we defined surveillance data as information on ticks or pathogens provided through active sampling in natural areas; it should be noted that this does not match the strict definition used by the CDC, which requires sustained sampling efforts across time. Studies were also excluded if they: explored regions other than the contiguous US; focused on treatment, vaccine, or therapeutics development and/or diagnostics of human disease; focused on tick or pathogen genetics; focused on experimental studies with ticks or hosts; were tick control and/or management studies; performed only passive surveillance; were review articles; were not peer reviewed; were in a language other than English; the full text was not available; and if the disease was not a risk to the general public. In addition, for articles which reported data that had previously been published, we only included previously unreported information collected by the authors, and we referenced the specific period of collection for these data to ensure we were not double-recording data. Due to publication delays, we also performed a non-systematic review of the literature of articles published between 2019 – 2023 on tick and tickborne pathogen surveillance methods conducted in the contiguous United States. Keyword search was performed in PubMed Central and Web of Science Core Collection databases. The search algorithm keywords included tick(s), Amblyomma, Dermacentor, Ixodes, Rhipicephalus, Acari Ixodidea, tick host(s), Lyme disease, Rocky Mountain Spotted Fever, Spotted Fever Group, Rickettsiosis, Ehrlichiosis, Anaplasmosis, Borreliosis, Tularemia, Babesiosis, tick-borne pathogen, Powassan, Heartland, Bourbon, Colorado tick fever, Pacific Coast tick fever, tick surveillance, surveillance, (sero)epidemiology, prevalence, distribution, ecology, United States. The search algorithm utilized is provided as follows: TI= ((ticks OR Ixodes OR Amblyomma OR Dermacentor OR Rhipicephalus OR "Acari Ixodidi" OR "tick hosts" OR "tick host") OR ("Lyme Disease" OR "Rocky Mountain Spotted Fever" OR "Spotted Fever Group" OR Rickettsiosis OR Rickettsial OR Ehrlichiosis OR Anaplasmosis OR Borreliosis OR Tularemia OR Babesiosis OR Borrelia OR Ehrlichia OR Anaplasma OR Rickettsia OR Babesia OR "tick-borne pathogen" OR "tick borne pathogen")) AND TS= ("tick surveillance" OR surveillance OR epidemiology OR seroepidemiology OR ecology) AND CU=("United States of America" OR "USA" OR "United States" OR United-States). These datasets are the collated data underlying the figures in the manuscript. For more details, please see the publication. The following are explanations for variables used in all the CSV files: Tick: Species of tick collected Tick_Method: Method of collecting ticks Pathogen: Species of pathogen tested for Path_Method: Method of testing for pathogens Decade: Decade of publication n: Number of publications STATE: state in which study was conducted COUNTY: county in which study was conducted 1944 - 2018 (Was surveillance performed?): was there at least one publication included with a publication date within the 1944-2018 period in this geographic region? 2019 - 2023 (Was surveillance performed?): was there at least one publication included with a publication date within the 2019-2023 period in this geographic region?
keywords: ticks; systematic review; surveillance
published: 2024-10-12
 
Simulation data used to generate plots in the associated paper ("Strain rate controls alignment in growing bacterial monolayers").
published: 2024-11-19
 
This project investigates retraction indexing agreement among data sources: Crossref, Retraction Watch, Scopus, and Web of Science. As of July 2024, this reassesses the April 2023 union list of Schneider et al. (2023): https://doi.org/10.55835/6441e5cae04dbe5586d06a5f. As of April 2023, over 1 in 5 DOIs had discrepancies in retraction indexing among the 49,924 DOIs indexed as retracted in at least one of Crossref, Retraction Watch, Scopus, and Web of Science (Schneider et al., 2023). Here, we determine what changed in 15 months. Pipeline code to get the results files can be found in the GitHub repository https://github.com/infoqualitylab/retraction-indexing-agreement in the iPython notebook 'MET-STI2024_Reassessment_of_retraction_indexing_agreement.ipynb' Some files have been redacted to remove proprietary data, as noted in README.txt. Among our sources, data is openly available only for Crossref and Retraction Watch. FILE FORMATS: 1) unionlist_completed_2023-09-03-crws-ressess.csv - UTF-8 CSV file 2) unionlist_completed-ria_2024-07-09-crws-ressess.csv - UTF-8 CSV file 3) unionlist-15months-period_sankey.png - Portable Network Graphics (PNG) file 4) unionlist_ria_proportion_comparison.png - Portable Network Graphics (PNG) file 5) README.txt - text file FILE DESCRIPTION: Description of the files can be found in README.txt
keywords: retraction status; data quality; indexing; retraction indexing; metadata; meta-science; RISRS
published: 2023-01-05
 
This is the data used in the paper "Forecasting West Nile Virus with Graph Neural Networks: Harnessing Spatial Dependence in Irregularly Sampled Geospatial Data". A preprint may be found at https://doi.org/10.48550/arXiv.2212.11367 Code from the Github repository https://github.com/adtonks/mosquito_GNN can be used with the data here to reproduce the paper's results. v1.0.0 of the code is also archived at https://doi.org/10.5281/zenodo.7897830
keywords: west nile virus; machine learning; gnn; mosquito; trap; graph neural network; illinois; geospatial
published: 2023-04-12
 
The XSEDE program manages the database of allocation awards for the portfolio of advanced research computing resources funded by the National Science Foundation (NSF). The database holds data for allocation awards dating to the start of the TeraGrid program in 2004 through the XSEDE operational period, which ended August 31, 2022. The project data include lead researcher and affiliation, title and abstract, field of science, and the start and end dates. Along with the project information, the data set includes resource allocation and usage data for each award associated with the project. The data show the transition of resources over a fifteen year span along with the evolution of researchers, fields of science, and institutional representation. Because the XSEDE program has ended, the allocation_award_history file includes all allocations activity initiated via XSEDE processes through August 31, 2022. The Resource Providers and successor program to XSEDE agreed to honor all project allocations made during XSEDE. Thus, allocation awards that extend beyond the end of XSEDE may not reflect all activity that may ultimately be part of the project award. Similarly, allocation usage data only reflects usage reported through August 31, 2022, and may not reflect all activity that may ultimately be conducted by projects that were active beyond XSEDE.
keywords: allocations; cyberinfrastructure; XSEDE
published: 2023-05-30
 
Primary occurrence data for Clem, Hart, & McElrath. 2023. A century of Illinois hover flies (Diptera: Syrphidae): Museum and citizen science data reveal recent range expansions, contractions, and species of potential conservation significance. Included are a license.txt file, the cleaned occurrences from each of the six merged datasets, and a cleaned, merged dataset containing all occurrence records in one spreadsheet, formatted according to Darwin Core standards, with a few extra fields such as GBIF identifiers that were included in some of the original downloads.
keywords: csv; occurrences; syrphidae; hover flies; flies; biodiversity; darwin core; darwin-core; GBIF; citizen science; iNaturalist
published: 2023-06-06
 
This dataset is derived from the COCI, the OpenCitations Index of Crossref open DOI-to-DOI references (opencitations.net). Silvio Peroni, David Shotton (2020). OpenCitations, an infrastructure organization for open scholarship. Quantitative Science Studies, 1(1): 428-444. https://doi.org/10.1162/qss_a_00023 We have curated it to remove duplicates, self-loops, and parallel edges. These data were copied from the Open Citations website on May 6, 2023 and subsequently processed to produce a node list and an edge-list. Integer_ids have been assigned to the DOIs to reduce memory and storage needs when working with these data. As noted on the Open Citation website, each record is a citing-cited pair that uses DOIs as persistent identifiers.
keywords: open citations; bibliometrics; citation network; scientometrics
published: 2024-02-08
 
This dataset contains transcribed entries from the "Prairie Directory of North America" (Adelman and Schwartz 2013) for the Tallgrass, Mixed Grass, and Shortgrass prairie regions of the united states. We identified the historical spatial extent of the Tallgrass, Mixed Grass, and Shortgrass prairie regions using Ricketts et al. (1999), Olson et al. (2001), and Dixon et al. (2014) and selected the counties entirely or partially within these boundaries from the USDA Forest Service (2022) file. The resulting lists of counties are included as separate files. The dataset contains information on publicly accessible grasslands and prairies in these regions including acreage and amenities like hunting access, restrooms, parking, and trails.
keywords: grasslands; prairies; prairie directory of north america; site amenities; site attributes
published: 2024-09-17
 
The following seven zip files are compressed folders containing the input datasets/trees, main output files and the scripts of the related analyses performed in this study. I. ancestral_microhabitat_reconstruction.zip: contains four files, including two input files (microhabitats.csv, timetree.tre) and a script (simmap_microhabitat.R) for ancestral states reconstruction of microhabitat by make.simmap implemented in the R package phytools v1.5, as well as the main output file (ancestral_microhabitats.csv). 1. ancestral_microhabitats.csv: reconstructed ancestral microhabitats for each node. 2. microhabitats.csv: microhabitats of the studies species. 3. simmap_microhabitat.R: the R script of make.simmap for ancestral microhabitat reconstruction 4. timetree.tre: dated tree used for ancestral state reconstruction for microhabitat and morphological characters II. ancestral_morphology_reconstruction.zip: contains six files, including an input file (morphology.csv) and a script (simmap_morphology.R) for ancestral states reconstruction of morphology by make.simmap implemented in the R package phytools v1.5, as well as four main output files(forewing_ancestral_state.csv, frontal_sutures_ancestral_state.csv, hind_wing_ancestral_state.csv, ocellus_ancestral_state.csv). 1. forewing_ancestral_state.csv: reconstructed ancestral states of the development of the forewing for each node. 2. frontal_sutures_ancestral_state.csv: reconstructed ancestral states of the development of frontal sutures for each node. 3. hind_wing_ancestral_state.csv: reconstructed ancestral states of the development of the hind wing for each node. 4. morphology.csv: the states of the development of ocellus, forewing, hing wing and frontal sutures for each studies species. 5. ocellus_ancestral_state.csv: reconstructed ancestral states of the development of the ocellus for each node. 6. simmap_morphology.R: the R script of make.simmap for ancestral state reconstruction of morphology III. biogeographic_reconstruction.zip: contains four files, including three input files (dispersal_probablity.txt, distributions.csv, timetree_noOutgroup.tre) used for a stratified biogeographic analysis by BioGeoBEARS in RASP v4.2 and the main output file (DIVELIKE_result.txt). 1. dispersal_probablity.txt: relative dispersal probabilities among biogeographical regions at different geological epochs. 2. distributions.csv: current distributions of the studied species. 3. DIVELIKE_result.txt: BioGeoBEARS result of ancestral areas based on the DIVELIKE model. 4. timetree_noOutgroup.tre: the dated tree with the outgroup lineage (Eurymelinae) excluded. IV. coalescent_analysis.zip: contains a folder and two files, including a folder (individual_gene_alignment) of input files used to construct gene trees, an input file (MLtree_BS70.tre) used for the multi-species coalescent analysis by ASTRAL v 4.10.5 and the main output file (coalescent_species_tree.tre). 1. coalescent_species_tree.tre: the species tree generated by the multi-species coalescent analysis with the quartet support, effective number of genes and the local posterior probability indicated. 2. individual_gene_alignment: a folder containing 427 FASTA files, each one represents the nucleotide alignment for a gene. Hyphens are used to represent gaps. These files were used to construct gene trees using IQ-TREE v1.6.12. 3. MLtree_BS70.tre: 165 gene trees with the average SH-aLRT and ultrafast bootstrap values of ≥ 70%. This file was used to estimate the species tree by ASTRAL v 4.10.5. V. divergence_time_estimation.zip: contains five files, including two input files (treefile_rooted_noBranchLength.tre, treefile_rooted.tre) and two control files (baseml.ctl, mcmctree.ctl) used for divergence time estimation by BASEML and MCMCTREE in PAML v4.9, as well as the main output file (timetree_with95%HPD.tre). 1. baseml.ctl: the control file used for the estimation of substitution rates by BASEML in PAML v4.9. 2. mcmctree.ctl: the control file used for the estimation of divergence times by MCMCTREE in PAML v4.9. 3. timetree_with95%HPD.tre: dated tree with the 95% highest posterior density confidence intervals indicated. 4. treefile_rooted_noBranchLength.tre: the maximum likelihood tree based on the concatenated nucleotide dataset with calibrations for the crown and internal nodes. Branch length and support values were not indicated. 5. treefile_rooted.tre: the maximum likelihood tree based on the concatenated nucleotide dataset with a secondary calibration on the root age. Branch support values were not indicated. VI. maximum_likelihood_analysis_aa.zip: contains three files, including two input files (concatenated_aa_partition.nex, concatenated_aa.phy) used for the maximum likelihood analysis by IQ-TREE v1.6.12 and the main output file (MLtree_aa.tre). 1. concatenated_aa_partition.nex: the partitioning schemes for the maximum likelihood analysis using concatenated_aa.phy. This file partitions the 52,024 amino acid positions into 427 character sets. 2. concatenated_aa.phy: a concatenated amino acid dataset with 52,024 amino acid positions. Hyphens are used to represent gaps. This dataset was used for the maximum likelihood analysis. 3. MLtree_aa.tre: the maximum likelihood tree based on the concatenated amino acid dataset, with SH-aLRT values and ultrafast bootstrap values indicated. VII. maximum_likelihood_analysis_nt.zip: contains three files, including two input files (concatenated_nt_partition.nex, concatenated_nt.phy) used for the maximum likelihood analysis by IQ-TREE v1.6.12 and the main output file (MLtree_nt.tre). 1. concatenated_nt_partition.nex: the partitioning schemes for the maximum likelihood analysis using concatenated_nt.phy. This file partitions the 156,072 nucleotide positions into 427 character sets. 2. concatenated_nt.phy: a concatenated nucleotide dataset with 156,072 nucleotide positions. Hyphens are used to represent gaps. This dataset was used for the maximum likelihood analysis as well as divergence time estimation. 3. MLtree_nt.tre: the maximum likelihood tree based on the concatenated nucleotide dataset, with SH-aLRT values and ultrafast bootstrap values indicated. VIII. Taxon_sampling.csv: contains the sample IDs (1st column) which were used in the alignments and the taxonomic information (2nd to 6th columns).
keywords: Anchored Hybrid Enrichment, Biogeography, Cicadellidae, Phylogenomics, Treehoppers
published: 2018-05-21
 
This dataset contains bonding networks and tolerance ranges for geometric magnetic dimensionality. The data can be searched in the html frontend above, code obtained at the GitHub repository, or the raw data can be downloaded as csv below. The csv data contains the results of 42520 compounds (unique icsd_code) from ICSD FindIt v3.5.0. The csv is semicolon-delimited since some fields contain multiple comma-separated values.
keywords: materials science; physics; magnetism; crystallography