Illinois Data Bank - Dataset

Version | DOI | Comment | Publication Date
3 | 10.13012/B2IDB-1475719_V3 | files added and updated to include more input files of analyses, related scripts, and the main outputs | 2024-09-17
2 | 10.13012/B2IDB-1475719_V2 | We updated our data matrices by excluding two samples and adding one new sample. | 2024-08-16
1 | 10.13012/B2IDB-1475719_V1 | | 2025-01-01

10.8 KB File
5.56 KB File
4.69 KB File
5.06 KB File
8.83 KB File
1.95 MB File
7.01 KB File
1.92 MB File
2.94 MB File

Contact the Research Data Service for help interpreting this log.

Dataset update: {"publication_state"=>["version candidate under curator review", "released"], "release_date"=>[nil, Tue, 17 Sep 2024]} 2024-09-17T12:53:35Z
RelatedMaterial update: {"note"=>[nil, ""]} 2024-09-13T15:28:23Z
Dataset update: {"title"=>["Datasets for \"Phylogeny, Biogeography and Morphological Evolution of the Treehopper-Like Leafhoppers (Hemiptera: Cicadellidae) Megophthalminae and Ulopinae\"", "Dataset title: Datasets, scripts and main output files for \"Phylogeny, Biogeography and Morphological Evolution of the Treehopper-Like Leafhoppers (Hemiptera: Cicadellidae) Megophthalminae and Ulopinae\""], "description"=>["The following 6 files were used to reconstruct the phylogeny of the Megophthalminae and Ulopinae.\r\n1. Taxon_sampling.csv: contains the sample IDs (1st column) which were used in the alignments and the taxonomic information (2nd to 6th columns).\r\n2. concatenated_aa_partition.nex: the partitioning schemes for the maximum likelihood analysis using concatenated_aa.phy. This file partitions the 52,024 amino acid positions into 427 character sets.\r\n3. concatenated_aa_.phy: a concatenated amino acid dataset with 52,024 amino acid positions. This dataset was used for the maximum likelihood analysis by IQ-TREE v1.6.12. Hyphens are used to represent gaps.\r\n4. concatenated_nt_partition.nex: the partitioning schemes for the maximum likelihood analysis using concatenated_nt.phy. This file partitions the 156,072 nucleotide positions into 427 character sets.\r\n5. concatenated_nt_.phy: a concatenated nucleotide dataset with 156,072 nucleotide positions. This dataset was used for the maximum likelihood analysis by IQ-TREE v1.6.12. Hyphens are used to represent gaps.\r\n6. Individual_gene_alignment.zip: contains 427 FASTA files, each one represents the nucleotide alignment for a gene. Hyphens are used to represent gaps. 
These files were used to construct gene trees using IQ-TREE v1.6.12, followed by multispecies coalescent analysis using ASTRAL v 4.10.5 based on the maximum likelihood trees with the average SH-aLRT and ultrafast bootstrap values of ≥ 70.\r\n\r\n", "The following seven zip files are compressed folders containing the input datasets/trees, main output files and the scripts of the related analyses performed in this study.\r\n\r\nI. ancestral_microhabitat_reconstruction.zip: contains four files, including two input files (microhabitats.csv, timetree.tre) and a script (simmap_microhabitat.R) for ancestral state reconstruction of microhabitat by make.simmap implemented in the R package phytools v1.5, as well as the main output file (ancestral_microhabitats.csv).\r\n\r\n\t1. ancestral_microhabitats.csv: reconstructed ancestral microhabitats for each node.\r\n\r\n\t2. microhabitats.csv: microhabitats of the studied species.\r\n\t\r\n\t3. simmap_microhabitat.R: the R script of make.simmap for ancestral microhabitat reconstruction\r\n\t\r\n\t4. timetree.tre: dated tree used for ancestral state reconstruction for microhabitat and morphological characters\r\n\t\r\nII. ancestral_morphology_reconstruction.zip: contains six files, including an input file (morphology.csv) and a script (simmap_morphology.R) for ancestral state reconstruction of morphology by make.simmap implemented in the R package phytools v1.5, as well as four main output files (forewing_ancestral_state.csv, frontal_sutures_ancestral_state.csv, hind_wing_ancestral_state.csv, ocellus_ancestral_state.csv).\r\n\r\n\t1. forewing_ancestral_state.csv: reconstructed ancestral states of the development of the forewing for each node.\r\n\t\r\n\t2. frontal_sutures_ancestral_state.csv: reconstructed ancestral states of the development of frontal sutures for each node.\r\n\t\r\n\t3. hind_wing_ancestral_state.csv: reconstructed ancestral states of the development of the hind wing for each node.\r\n\t\r\n\t4. 
morphology.csv: the states of the development of ocellus, forewing, hind wing and frontal sutures for each studied species.\r\n\t\r\n\t5. ocellus_ancestral_state.csv: reconstructed ancestral states of the development of the ocellus for each node.\r\n\t\r\n\t6. simmap_morphology.R: the R script of make.simmap for ancestral state reconstruction of morphology\r\n\t\r\nIII. biogeographic_reconstruction.zip: contains four files, including three input files (dispersal_probablity.txt, distributions.csv, timetree_noOutgroup.tre) used for a stratified biogeographic analysis by BioGeoBEARS in RASP v4.2 and the main output file (DIVELIKE_result.txt).\r\n\r\n\t1. dispersal_probablity.txt: relative dispersal probabilities among biogeographical regions at different geological epochs.\r\n\t\r\n\t2. distributions.csv: current distributions of the studied species.\r\n\t\r\n\t3. DIVELIKE_result.txt: BioGeoBEARS result of ancestral areas based on the DIVELIKE model.\r\n\t\r\n\t4. timetree_noOutgroup.tre: the dated tree with the outgroup lineage (Eurymelinae) excluded.\r\n\t\r\nIV. coalescent_analysis.zip: contains a folder and two files, including a folder (individual_gene_alignment) of input files used to construct gene trees, an input file (MLtree_BS70.tre) used for the multi-species coalescent analysis by ASTRAL v 4.10.5 and the main output file (coalescent_species_tree.tre).\r\n\r\n\t1. coalescent_species_tree.tre: the species tree generated by the multi-species coalescent analysis with the quartet support, effective number of genes and the local posterior probability indicated.\r\n\t\r\n\t2. individual_gene_alignment: a folder containing 427 FASTA files, each one represents the nucleotide alignment for a gene. Hyphens are used to represent gaps. These files were used to construct gene trees using IQ-TREE v1.6.12.\r\n\r\n\t3. MLtree_BS70.tre: 165 gene trees with the average SH-aLRT and ultrafast bootstrap values of ≥ 70%. 
This file was used to estimate the species tree by ASTRAL v 4.10.5.\r\n\t\r\nV. divergence_time_estimation.zip: contains five files, including two input files (treefile_rooted_noBranchLength.tre, treefile_rooted.tre) and two control files (baseml.ctl, mcmctree.ctl) used for divergence time estimation by BASEML and MCMCTREE in PAML v4.9, as well as the main output file (timetree_with95%HPD.tre).\r\n\r\n\t1. baseml.ctl: the control file used for the estimation of substitution rates by BASEML in PAML v4.9.\r\n\t\r\n\t2. mcmctree.ctl: the control file used for the estimation of divergence times by MCMCTREE in PAML v4.9.\r\n\t\r\n\t3. timetree_with95%HPD.tre: dated tree with the 95% highest posterior density confidence intervals indicated.\r\n\r\n\t4. treefile_rooted_noBranchLength.tre: the maximum likelihood tree based on the concatenated nucleotide dataset with calibrations for the crown and internal nodes. Branch length and support values were not indicated.\r\n\r\n\t5. treefile_rooted.tre: the maximum likelihood tree based on the concatenated nucleotide dataset with a secondary calibration on the root age. Branch support values were not indicated.\r\n\t\r\nVI. maximum_likelihood_analysis_aa.zip: contains three files, including two input files (concatenated_aa_partition.nex, concatenated_aa.phy) used for the maximum likelihood analysis by IQ-TREE v1.6.12 and the main output file (MLtree_aa.tre).\r\n\r\n\t1. concatenated_aa_partition.nex: the partitioning schemes for the maximum likelihood analysis using concatenated_aa.phy. This file partitions the 52,024 amino acid positions into 427 character sets.\r\n\r\n\t2. concatenated_aa.phy: a concatenated amino acid dataset with 52,024 amino acid positions. Hyphens are used to represent gaps. This dataset was used for the maximum likelihood analysis. \r\n\r\n\t3. MLtree_aa.tre: the maximum likelihood tree based on the concatenated amino acid dataset, with SH-aLRT values and ultrafast bootstrap values indicated.\r\n\r\nVII. 
maximum_likelihood_analysis_nt.zip: contains three files, including two input files (concatenated_nt_partition.nex, concatenated_nt.phy) used for the maximum likelihood analysis by IQ-TREE v1.6.12 and the main output file (MLtree_nt.tre).\r\n\r\n\t1. concatenated_nt_partition.nex: the partitioning schemes for the maximum likelihood analysis using concatenated_nt.phy. This file partitions the 156,072 nucleotide positions into 427 character sets.\r\n\r\n\t2. concatenated_nt.phy: a concatenated nucleotide dataset with 156,072 nucleotide positions. Hyphens are used to represent gaps. This dataset was used for the maximum likelihood analysis as well as divergence time estimation.\r\n\r\n\t3. MLtree_nt.tre: the maximum likelihood tree based on the concatenated nucleotide dataset, with SH-aLRT values and ultrafast bootstrap values indicated.\r\n\r\nVIII. Taxon_sampling.csv: contains the sample IDs (1st column) which were used in the alignments and the taxonomic information (2nd to 6th columns).\r\n"], "keywords"=>["Cicadellidae; Classification; Phylogenomics; Megophthalminae; Ulopinae", "Anchored Hybrid Enrichment, Biogeography, Cicadellidae, Phylogenomics, Treehoppers"]} 2024-09-13T01:21:38Z
Dataset update: {"hold_state"=>["version candidate under curator review", "none"]} 2024-09-12T14:50:33Z
Dataset update: {"version_comment"=>[nil, "I would like to upload more files to this dataset, including more input files of our analyses and related scripts, as well as the main outputs."]} 2024-09-12T11:51:06Z
RelatedMaterial create: {"material_type"=>"Dataset", "availability"=>nil, "link"=>"https://doi.org/10.13012/B2IDB-1475719_V2", "uri"=>"10.13012/B2IDB-1475719_V2", "uri_type"=>"DOI", "citation"=>"Cao, Yanghui; Dietrich, Christopher H.; Dmitriev, Dmitry A.; Kits, Joel H.; Xue, Qingquan; Zhang, Yalin (2024): Datasets for \"Phylogeny, Biogeography and Morphological Evolution of the Treehopper-Like Leafhoppers (Hemiptera: Cicadellidae) Megophthalminae and Ulopinae\". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1475719_V2", "dataset_id"=>2785, "selected_type"=>"Dataset", "datacite_list"=>"IsNewVersionOf", "note"=>nil, "feature"=>nil} 2024-09-12T11:48:35Z
RelatedMaterial create: {"material_type"=>"Dataset", "availability"=>nil, "link"=>"https://doi.org/10.13012/B2IDB-8842653_V2", "uri"=>"", "uri_type"=>"", "citation"=>"Cao, Y., Dietrich, C.H., Zahniser, J.N. & Dmitriev, D.A. (2022) Dense sampling of taxa and characters improves phylogenetic resolution among deltocephaline leafhoppers (Hemiptera: Cicadellidae: Deltocephalinae). Systematic Entomology, 47: 430–444.", "dataset_id"=>2785, "selected_type"=>"Dataset", "datacite_list"=>"", "note"=>"", "feature"=>nil} 2024-09-12T11:48:35Z
RelatedMaterial create: {"material_type"=>"Dataset", "availability"=>nil, "link"=>"https://doi.org/10.13012/B2IDB-8636195_V1", "uri"=>"", "uri_type"=>"", "citation"=>"Cao, Y., Dietrich, C.H., Kits, J.H., Dmitriev, D.A., Richter, R., Eyres, J., Dettman, J.R., Xu, Y. & Huang, M. (2023). Phylogenomics of microleafhoppers (Hemiptera: Cicadellidae: Typhlocybinae): morphological evolution, divergence times and biogeography. Insect Systematics and Diversity, 7(4): 1–19.", "dataset_id"=>2785, "selected_type"=>"Dataset", "datacite_list"=>"", "note"=>"", "feature"=>nil} 2024-09-12T11:48:35Z
Funder create: {"name"=>"Agriculture and Agri- Food Canada", "identifier"=>"", "identifier_scheme"=>"", "grant"=>"J-001564, J-002279", "dataset_id"=>2785, "code"=>"other"} 2024-09-12T11:48:35Z
Funder create: {"name"=>"National Natural Science Foundation of China", "identifier"=>"", "identifier_scheme"=>"", "grant"=>"32370493, 32370495, 32170472", "dataset_id"=>2785, "code"=>"other"} 2024-09-12T11:48:35Z
Funder create: {"name"=>"U.S. National Science Foundation (NSF)", "identifier"=>"10.13039/100000001", "identifier_scheme"=>"DOI", "grant"=>"DEB-1639601", "dataset_id"=>2785, "code"=>"NSF"} 2024-09-12T11:48:35Z
Creator create: {"family_name"=>"Zhang", "given_name"=>"Yalin", "identifier"=>"", "email"=>"yalinzh@nwsuaf.edu.cn", "is_contact"=>false, "row_position"=>6} 2024-09-12T11:48:35Z
Creator create: {"family_name"=>"Xue", "given_name"=>"Qingquan", "identifier"=>"", "email"=>"xueqingquan_123@nwsuaf.edu.cn", "is_contact"=>false, "row_position"=>5} 2024-09-12T11:48:35Z
Creator create: {"family_name"=>"Kits", "given_name"=>"Joel H.", "identifier"=>"", "email"=>"joel.kits@AGR.GC.CA", "is_contact"=>false, "row_position"=>4} 2024-09-12T11:48:35Z
Creator create: {"family_name"=>"Dmitriev", "given_name"=>"Dmitry A.", "identifier"=>"", "email"=>"arboridia@gmail.com", "is_contact"=>false, "row_position"=>3} 2024-09-12T11:48:35Z
Creator create: {"family_name"=>"Dietrich", "given_name"=>"Christopher H.", "identifier"=>"", "email"=>"chdietri@illinois.edu", "is_contact"=>false, "row_position"=>2} 2024-09-12T11:48:35Z
Creator create: {"family_name"=>"Cao", "given_name"=>"Yanghui", "identifier"=>"", "email"=>"caoyh@illinois.edu", "is_contact"=>true, "row_position"=>1} 2024-09-12T11:48:35Z
Dataset update: {"corresponding_creator_name"=>[nil, "Yanghui Cao"], "corresponding_creator_email"=>[nil, "caoyh@illinois.edu"]} 2024-09-12T11:48:35Z
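
A minimal sketch for reading the entries above, assuming every entry has the shape `<Model> <action>: {<payload>} <UTC timestamp>`, as the entries in this log appear to. The pattern is inferred from this page only and is not a documented Illinois Data Bank format. The names `ENTRY` and `split_entry` are hypothetical, and the payload is left as raw text rather than parsed.

```python
import re
from datetime import datetime, timezone

# Assumed entry shape, inferred from the log above (not an official format):
#   "<Model> <action>: {<payload>} <YYYY-MM-DDTHH:MM:SSZ>"
ENTRY = re.compile(
    r"^(?P<model>\w+) (?P<action>\w+): (?P<payload>\{.*\}) "
    r"(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)$",
    re.DOTALL,  # payloads such as the long description may span several lines
)


def split_entry(entry):
    """Split one log entry into (model, action, raw payload, UTC datetime)."""
    m = ENTRY.match(entry.strip())
    if m is None:
        raise ValueError("text does not look like a log entry")
    stamp = datetime.strptime(m["stamp"], "%Y-%m-%dT%H:%M:%SZ")
    return m["model"], m["action"], m["payload"], stamp.replace(tzinfo=timezone.utc)


# One entry copied from the log above.
example = (
    'Dataset update: {"hold_state"=>'
    '["version candidate under curator review", "none"]} 2024-09-12T14:50:33Z'
)
model, action, payload, stamp = split_entry(example)
print(model, action, stamp.isoformat())
```

In `update` entries each field appears to map to an `[old, new]` pair, while `create` entries list the new record's fields directly.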