Illinois Data Bank - Dataset

Version: 1
DOI: 10.13012/B2IDB-1424746_V1
Publication Date: 2018-07-29

Contact the Research Data Service for help interpreting this log.

Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.\r\n\r\nIn Supplementary Table S2 (njmerge-supplementary-materials.pdf), the alpha parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene, a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. 
Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of the uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn README.txt, lines 37 and 38 should read:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\nNote that the file names (fasttree-exon.tre and fasttree-intron.tre) are swapped.\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. 
Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.\r\n\r\nIn njmerge-supplementary-materials.pdf, the alpha parameter shown in Supplementary Table S2 is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene, a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of the uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns."]} 2019-04-21T03:23:26Z
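The alpha computation described in the entry above (draw X uniformly between 0 and 1, then set alpha = -log(X) / D) can be sketched as a short simulation. This is only an illustrative sketch, not the authors' simulation code: the function names, the number of simulated genes, and the seed are assumptions made here, and the natural logarithm is assumed because -log(0.5) / 1.0 = 0.69 in the log text.

```python
import math
import random
import statistics

# Divisor D per locus type, as listed in the log entry above (Table S2).
DIVISORS = {"exon": 4.2, "UCE": 1.0, "intron": 0.4}


def draw_alpha(divisor, rng):
    """Draw one alpha value as -log(X) / D, with X uniform on (0, 1]."""
    # rng.random() returns a value in [0, 1); shift it to (0, 1] so log is defined.
    x = 1.0 - rng.random()
    return -math.log(x) / divisor


def summarize(divisor, n_genes=100_000, seed=0):
    """Return (sample mean, sample median) of alpha over n_genes simulated genes.

    n_genes and seed are hypothetical defaults chosen for this sketch.
    """
    rng = random.Random(seed)
    draws = [draw_alpha(divisor, rng) for _ in range(n_genes)]
    return statistics.fmean(draws), statistics.median(draws)


if __name__ == "__main__":
    for locus, d in DIVISORS.items():
        mean, median = summarize(d)
        print(
            f"{locus}: -log(0.5)/D = {-math.log(0.5) / d:.2f}, "
            f"sample mean = {mean:.2f}, sample median = {median:.2f}"
        )
```

Under these assumptions, -log(X) follows an exponential distribution with rate 1, so alpha is exponential with rate D. The quantity -log(0.5) / D quoted in the log (0.16, 0.69, 1.73) therefore coincides with the median of alpha, while the expected value of alpha is 1 / D (about 0.24 for exons, 1.00 for UCEs, and 2.50 for introns). The simulation makes it easy to check both summaries.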
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene, a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. 
Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of the uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. 
Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.\r\n\r\nIn Supplementary Table S2 (njmerge-supplementary-materials.pdf), the alpha parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene, a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of the uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns."]} 2019-04-21T03:18:43Z
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. 
Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. 
Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene, a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of the uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns."]} 2019-04-21T03:16:43Z
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. 
Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. 
Note that the normalized symmetric difference is always greater than or equal to the normalized Robinson-Foulds distance, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns."]} 2019-04-21T03:15:14Z
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. 
Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This could impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. 
Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns."]} 2019-04-21T03:14:18Z
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. 
Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the \"symmetric difference error rate\" as the \"Robinson-Foulds error rate\". Because the normalized symmetric difference and the normalized Robinson-Foulds distance are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. 
Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns."]} 2019-04-21T03:13:42Z
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution .\r\n2. 
Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. 
Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution.\r\n2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns."]} 2019-04-20T16:20:56Z
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. 
The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.\r\n\r\nFinally, in the supplement, we refer to alpha in Table S2; however, this parameter is actually the divisor D, which is used to compute alpha for each gene as follows.\r\n1. For each gene a random value X between 0 and 1 is drawn from a uniform distribution .\r\n2. Alpha is computed as -log(X) / D, where D is 4.2 for exons, 1.0 for UCEs, and 0.4 for introns (as stated in Table S2).\r\nNote that because the mean of a uniform distribution (between 0 and 1) is 0.5, the mean alpha value is -log(0.5) / 4.2 = 0.16 for exons, -log(0.5) / 1.0 = 0.69 for UCEs, and -log(0.5) / 0.4 = 1.73 for introns."]} 2019-04-20T16:14:57Z
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py script incorrectly refers to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. 
The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py and the compare_tree_lists.py scripts incorrectly refer to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative."]} 2019-03-12T15:10:42Z
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py script incorrectly refers to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. This might impact the gene tree error rates reported in the study, as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative.", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. 
The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py script incorrectly refers to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. This can impact the gene tree error rates reported in the study (see data-gene-trees.csv in data.zip), as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative."]} 2019-03-11T12:41:29Z
Dataset update: {"description"=>["This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge. When downloading datasets, please note that there is an error in the README; the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge.\r\n\r\n***When downloading datasets, please note that the following errors.***\r\n\r\nIn the README, the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre\r\n\r\nIn tools.zip, the compare_trees.py script incorrectly refers to the symmetric difference rate as the Robinson-Foulds error rate. Because the symmetric difference rate and the Robinson-Foulds error rate are equal for binary trees, this does not impact the species tree error rates reported in the study. 
This might impact the gene tree error rates reported in the study, as FastTree-2 returns trees with polytomies whenever 3 or more sequences in the input alignment are identical. Note that the symmetric difference rate is always greater than or equal to the Robinson-Foulds error rate, so the gene tree error rates reported in the study are more conservative."]} 2019-03-11T12:37:29Z
Funder create: {"name"=>"U.S. National Science Foundation (NSF)", "identifier"=>"10.13039/100000001", "identifier_scheme"=>"DOI", "grant"=>"DGE-1144245", "dataset_id"=>628, "code"=>"NSF"} 2019-02-07T17:47:44Z
Funder create: {"name"=>"U.S. National Science Foundation (NSF)", "identifier"=>"10.13039/100000001", "identifier_scheme"=>"DOI", "grant"=>"CCF-1535977", "dataset_id"=>628, "code"=>"NSF"} 2019-02-07T17:47:44Z
RelatedMaterial create: {"material_type"=>"Conference paper", "availability"=>nil, "link"=>"https://doi.org/10.1007/978-3-030-00834-5_15", "uri"=>"10.1007/978-3-030-00834-5_15", "uri_type"=>"DOI", "citation"=>"Molloy E.K., Warnow T. (2018) NJMerge: A Generic Technique for Scaling Phylogeny Estimation Methods and Its Application to Species Trees. In: Blanchette M., Ouangraoua A. (eds) Comparative Genomics. RECOMB-CG 2018. Lecture Notes in Computer Science, vol 11183. Springer, Cham", "dataset_id"=>628, "selected_type"=>"Other", "datacite_list"=>"IsSupplementTo"} 2018-10-09T18:23:17Z
Dataset update: {"keywords"=>["phylogenomics, species trees, incomplete lineage sorting, divide-and-conquer", "phylogenomics; species trees; incomplete lineage sorting; divide-and-conquer"], "version_comment"=>[nil, ""], "subject"=>[nil, "Life Sciences"]} 2018-07-30T21:44:20Z
Dataset update: {"description"=>["", "This repository includes scripts, datasets, and supplementary materials for the study, \"NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees\", presented at RECOMB-CG 2018. The supplementary figures and tables referenced in the main paper can be found in njmerge-supplementary-materials.pdf. The latest version of NJMerge can be downloaded from Github: https://github.com/ekmolloy/njmerge. When downloading datasets, please note that there is an error in the README; the file names on lines 37/38 should be switched so that the README reads:\r\n + fasttree-exon.tre contains lines 1-25, 1-100, or 1-1000 of fasttree-total.tre\r\n + fasttree-intron.tre contains lines 26-50, 101-200, or 1001-2000 of fasttree-total.tre"]} 2018-07-30T17:05:25Z