Version DOI Comment Publication Date
1 10.13012/B2IDB-5060298_V1 2018-04-23
542 MB File
3.49 KB File
35.4 MB File
8.35 GB File

Contact the Research Data Service for help interpreting this log.

update: {"description"=>["Conceptual novelty analysis data based on PubMed Medical Subject Headings\r\n----------------------------------------------------------------------\r\nCreated by Shubhanshu Mishra, and Vetle I. Torvik on April 16th, 2018\r\n\r\n## Introduction\r\n\r\nThis is a dataset created as part of the publication titled: Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : the magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra.\r\nIt contains final data generated as part of our experiments based on MEDLINE 2015 baseline and MeSH tree from 2015.\r\nThe dataset is distributed in the form of the following tab separated text files: \r\n\r\n* PubMed2015_NoveltyData.tsv - Novelty scores for each paper in PubMed. The file contains 22,349,417 rows and 6 columns, as follow:\r\n\t- PMID: PubMed ID\r\n\t- Year: year of publication\r\n\t- TimeNovelty: time novelty score of the paper based on individual concepts (see paper)\r\n\t- VolumeNovelty: volume novelty score of the paper based on individual concepts (see paper)\r\n\t- PairTimeNovelty: time novelty score of the paper based on pair of concepts (see paper)\r\n\t- PairVolumeNovelty: volume novelty score of the paper based on pair of concepts (see paper)\r\n\r\n* mesh_scores.tsv - Temporal profiles for each MeSH term for all years. 
The file contains 1,102,831 rows and 5 columns, as follow:\r\n\t- MeshTerm: Name of the MeSH term\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH term in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH term in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH term in the given year\r\n\r\n* meshpair_scores.txt.gz (36 GB uncompressed) - Temporal profiles for each MeSH term for all years\r\n\t- Mesh1: Name of the first MeSH term (alphabetically sorted)\r\n\t- Mesh2: Name of the second MeSH term (alphabetically sorted)\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH pair in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH pair in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH pair in the given year\r\n\r\n* README.txt file\r\n\r\n## Dataset creation\r\n\r\nThis dataset was constructed using multiple datasets described in the following locations:\r\n* MEDLINE 2015 baseline: < a href=\"https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html\">https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html</a>\r\n* MeSH tree 2015: <a href=\"ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/\">ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/</a>\r\n* Source code provided at: <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n\r\n\r\nNote: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck <a href=\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">here </a>for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions:\r\n\r\nAdditional data related updates can be found at: <a href=\"http://abel.ischool.illinois.edu\">Torvik Research Group</a>\r\n\r\n## Acknowledgments\r\n\r\nThis work 
was made possible in part with funding to VIT from <a href=\"https://projectreporter.nih.gov/project_info_description.cfm?aid=8475017&icde=18058490\">NIH grant P01AG039347 </a> and <a href=\"http://www.nsf.gov/awardsearch/showAward?AWD_ID=1348742\">NSF grant 1348742 </a>. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\r\n\r\n## License\r\n\r\nConceptual novelty analysis data based on PubMed Medical Subject Headings by Shubhanshu Mishra, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License.\r\nPermissions beyond the scope of this license may be available at <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n", "Conceptual novelty analysis data based on PubMed Medical Subject Headings\r\n----------------------------------------------------------------------\r\nCreated by Shubhanshu Mishra, and Vetle I. Torvik on April 16th, 2018\r\n\r\n## Introduction\r\n\r\nThis is a dataset created as part of the publication titled: Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : the magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra.\r\nIt contains final data generated as part of our experiments based on MEDLINE 2015 baseline and MeSH tree from 2015.\r\nThe dataset is distributed in the form of the following tab separated text files: \r\n\r\n* PubMed2015_NoveltyData.tsv - Novelty scores for each paper in PubMed. 
The file contains 22,349,417 rows and 6 columns, as follow:\r\n\t- PMID: PubMed ID\r\n\t- Year: year of publication\r\n\t- TimeNovelty: time novelty score of the paper based on individual concepts (see paper)\r\n\t- VolumeNovelty: volume novelty score of the paper based on individual concepts (see paper)\r\n\t- PairTimeNovelty: time novelty score of the paper based on pair of concepts (see paper)\r\n\t- PairVolumeNovelty: volume novelty score of the paper based on pair of concepts (see paper)\r\n\r\n* mesh_scores.tsv - Temporal profiles for each MeSH term for all years. The file contains 1,102,831 rows and 5 columns, as follow:\r\n\t- MeshTerm: Name of the MeSH term\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH term in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH term in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH term in the given year\r\n\r\n* meshpair_scores.txt.gz (36 GB uncompressed) - Temporal profiles for each MeSH term for all years\r\n\t- Mesh1: Name of the first MeSH term (alphabetically sorted)\r\n\t- Mesh2: Name of the second MeSH term (alphabetically sorted)\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH pair in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH pair in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH pair in the given year\r\n\r\n* README.txt file\r\n\r\n## Dataset creation\r\n\r\nThis dataset was constructed using multiple datasets described in the following locations:\r\n* MEDLINE 2015 baseline: <a href=\"https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html\">https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html</a>\r\n* MeSH tree 2015: <a href=\"ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/\">ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/</a>\r\n* Source code provided at: <a 
href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n\r\n\r\nNote: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck <a href=\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">here </a>for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions:\r\n\r\nAdditional data related updates can be found at: <a href=\"http://abel.ischool.illinois.edu\">Torvik Research Group</a>\r\n\r\n## Acknowledgments\r\n\r\nThis work was made possible in part with funding to VIT from <a href=\"https://projectreporter.nih.gov/project_info_description.cfm?aid=8475017&icde=18058490\">NIH grant P01AG039347 </a> and <a href=\"http://www.nsf.gov/awardsearch/showAward?AWD_ID=1348742\">NSF grant 1348742 </a>. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\r\n\r\n## License\r\n\r\nConceptual novelty analysis data based on PubMed Medical Subject Headings by Shubhanshu Mishra, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License.\r\nPermissions beyond the scope of this license may be available at <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n"]} 2018-04-27T17:28:22Z
update: {"description"=>["Conceptual novelty analysis data based on PubMed Medical Subject Headings\r\n----------------------------------------------------------------------\r\nCreated by Shubhanshu Mishra, and Vetle I. Torvik on April 16th, 2018\r\n\r\n## Introduction\r\n\r\nThis is a dataset created as part of the publication titled: Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : the magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra.\r\nIt contains final data generated as part of our experiments based on MEDLINE 2015 baseline and MeSH tree from 2015.\r\nThe dataset is distributed in the form of the following tab separated text files: \r\n\r\n* PubMed2015_NoveltyData.tsv - Novelty scores for each paper in PubMed. The file contains 22,349,417 rows and 6 columns, as follow:\r\n\t- PMID: PubMed ID\r\n\t- Year: year of publication\r\n\t- TimeNovelty: time novelty score of the paper based on individual concepts (see paper)\r\n\t- VolumeNovelty: volume novelty score of the paper based on individual concepts (see paper)\r\n\t- PairTimeNovelty: time novelty score of the paper based on pair of concepts (see paper)\r\n\t- PairVolumeNovelty: volume novelty score of the paper based on pair of concepts (see paper)\r\n\r\n* mesh_scores.tsv - Temporal profiles for each MeSH term for all years. 
The file contains 1,102,831 rows and 5 columns, as follow:\r\n\t- MeshTerm: Name of the MeSH term\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH term in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH term in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH term in the given year\r\n\r\n* meshpair_scores.txt.gz (36 GB uncompressed) - Temporal profiles for each MeSH term for all years\r\n\t- Mesh1: Name of the first MeSH term (alphabetically sorted)\r\n\t- Mesh2: Name of the second MeSH term (alphabetically sorted)\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH pair in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH pair in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH pair in the given year\r\n\r\n* README.txt file\r\n\r\n## Dataset creation\r\n\r\nThis dataset was constructed using multiple datasets described in the following locations:\r\n* MEDLINE 2015 baseline: < a herf=\"https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html\">https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html</a>\r\n* MeSH tree 2015: <a href=\"ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/\">ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/</a>\r\n* Source code provided at: <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n\r\n\r\nNote: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck <a href=\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">here </a>for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions:\r\n\r\nAdditional data related updates can be found at: <a href=\"http://abel.ischool.illinois.edu\">Torvik Research Group</a>\r\n\r\n## Acknowledgments\r\n\r\nThis work 
was made possible in part with funding to VIT from <a href=\"https://projectreporter.nih.gov/project_info_description.cfm?aid=8475017&icde=18058490\">NIH grant P01AG039347 </a> and <a href=\"http://www.nsf.gov/awardsearch/showAward?AWD_ID=1348742\">NSF grant 1348742 </a>. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\r\n\r\n## License\r\n\r\nConceptual novelty analysis data based on PubMed Medical Subject Headings by Shubhanshu Mishra, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License.\r\nPermissions beyond the scope of this license may be available at <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n", "Conceptual novelty analysis data based on PubMed Medical Subject Headings\r\n----------------------------------------------------------------------\r\nCreated by Shubhanshu Mishra, and Vetle I. Torvik on April 16th, 2018\r\n\r\n## Introduction\r\n\r\nThis is a dataset created as part of the publication titled: Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : the magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra.\r\nIt contains final data generated as part of our experiments based on MEDLINE 2015 baseline and MeSH tree from 2015.\r\nThe dataset is distributed in the form of the following tab separated text files: \r\n\r\n* PubMed2015_NoveltyData.tsv - Novelty scores for each paper in PubMed. 
The file contains 22,349,417 rows and 6 columns, as follow:\r\n\t- PMID: PubMed ID\r\n\t- Year: year of publication\r\n\t- TimeNovelty: time novelty score of the paper based on individual concepts (see paper)\r\n\t- VolumeNovelty: volume novelty score of the paper based on individual concepts (see paper)\r\n\t- PairTimeNovelty: time novelty score of the paper based on pair of concepts (see paper)\r\n\t- PairVolumeNovelty: volume novelty score of the paper based on pair of concepts (see paper)\r\n\r\n* mesh_scores.tsv - Temporal profiles for each MeSH term for all years. The file contains 1,102,831 rows and 5 columns, as follow:\r\n\t- MeshTerm: Name of the MeSH term\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH term in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH term in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH term in the given year\r\n\r\n* meshpair_scores.txt.gz (36 GB uncompressed) - Temporal profiles for each MeSH term for all years\r\n\t- Mesh1: Name of the first MeSH term (alphabetically sorted)\r\n\t- Mesh2: Name of the second MeSH term (alphabetically sorted)\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH pair in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH pair in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH pair in the given year\r\n\r\n* README.txt file\r\n\r\n## Dataset creation\r\n\r\nThis dataset was constructed using multiple datasets described in the following locations:\r\n* MEDLINE 2015 baseline: < a href=\"https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html\">https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html</a>\r\n* MeSH tree 2015: <a href=\"ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/\">ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/</a>\r\n* Source code provided at: <a 
href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n\r\n\r\nNote: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck <a href=\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">here </a>for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions:\r\n\r\nAdditional data related updates can be found at: <a href=\"http://abel.ischool.illinois.edu\">Torvik Research Group</a>\r\n\r\n## Acknowledgments\r\n\r\nThis work was made possible in part with funding to VIT from <a href=\"https://projectreporter.nih.gov/project_info_description.cfm?aid=8475017&icde=18058490\">NIH grant P01AG039347 </a> and <a href=\"http://www.nsf.gov/awardsearch/showAward?AWD_ID=1348742\">NSF grant 1348742 </a>. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\r\n\r\n## License\r\n\r\nConceptual novelty analysis data based on PubMed Medical Subject Headings by Shubhanshu Mishra, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License.\r\nPermissions beyond the scope of this license may be available at <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n"]} 2018-04-27T17:27:03Z
update: {"description"=>["Conceptual novelty analysis data based on PubMed Medical Subject Headings\r\n----------------------------------------------------------------------\r\nCreated by Shubhanshu Mishra, and Vetle I. Torvik on April 16th, 2018\r\n\r\n## Introduction\r\n\r\nThis is a dataset created as part of the publication titled: Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : the magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra.\r\nIt contains final data generated as part of our experiments based on MEDLINE 2015 baseline and MeSH tree from 2015.\r\nThe dataset is distributed in the form of the following tab separated text files: \r\n\r\n* PubMed2015_NoveltyData.tsv - Novelty scores for each paper in PubMed. The file contains 22,349,417 rows and 6 columns, as follow:\r\n\t- PMID: PubMed ID\r\n\t- Year: year of publication\r\n\t- TimeNovelty: time novelty score of the paper based on individual concepts (see paper)\r\n\t- VolumeNovelty: volume novelty score of the paper based on individual concepts (see paper)\r\n\t- PairTimeNovelty: time novelty score of the paper based on pair of concepts (see paper)\r\n\t- PairVolumeNovelty: volume novelty score of the paper based on pair of concepts (see paper)\r\n\r\n* mesh_scores.tsv - Temporal profiles for each MeSH term for all years. 
The file contains 1,102,831 rows and 5 columns, as follow:\r\n\t- MeshTerm: Name of the MeSH term\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH term in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH term in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH term in the given year\r\n\r\n* meshpair_scores.txt.gz (36 GB uncompressed) - Temporal profiles for each MeSH term for all years\r\n\t- Mesh1: Name of the first MeSH term (alphabetically sorted)\r\n\t- Mesh2: Name of the second MeSH term (alphabetically sorted)\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH pair in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH pair in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH pair in the given year\r\n\r\n* README.txt file\r\n\r\n## Dataset creation\r\n\r\nThis dataset was constructed using multiple datasets described in the following locations:\r\n* MEDLINE 2015 baseline: < a herf=\"https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html>https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html</a>\r\n* MeSH tree 2015: <a href=\"ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/\">ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/</a>\r\n* Source code provided at: <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n\r\n\r\nNote: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck <a href=\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">here </a>for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions:\r\n\r\nAdditional data related updates can be found at: <a href=\"http://abel.ischool.illinois.edu\">Torvik Research Group</a>\r\n\r\n## Acknowledgments\r\n\r\nThis work 
was made possible in part with funding to VIT from <a href=\"https://projectreporter.nih.gov/project_info_description.cfm?aid=8475017&icde=18058490\">NIH grant P01AG039347 </a> and <a href=\"http://www.nsf.gov/awardsearch/showAward?AWD_ID=1348742\">NSF grant 1348742 </a>. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\r\n\r\n## License\r\n\r\nConceptual novelty analysis data based on PubMed Medical Subject Headings by Shubhanshu Mishra, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License.\r\nPermissions beyond the scope of this license may be available at <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n", "Conceptual novelty analysis data based on PubMed Medical Subject Headings\r\n----------------------------------------------------------------------\r\nCreated by Shubhanshu Mishra, and Vetle I. Torvik on April 16th, 2018\r\n\r\n## Introduction\r\n\r\nThis is a dataset created as part of the publication titled: Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : the magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra.\r\nIt contains final data generated as part of our experiments based on MEDLINE 2015 baseline and MeSH tree from 2015.\r\nThe dataset is distributed in the form of the following tab separated text files: \r\n\r\n* PubMed2015_NoveltyData.tsv - Novelty scores for each paper in PubMed. 
The file contains 22,349,417 rows and 6 columns, as follow:\r\n\t- PMID: PubMed ID\r\n\t- Year: year of publication\r\n\t- TimeNovelty: time novelty score of the paper based on individual concepts (see paper)\r\n\t- VolumeNovelty: volume novelty score of the paper based on individual concepts (see paper)\r\n\t- PairTimeNovelty: time novelty score of the paper based on pair of concepts (see paper)\r\n\t- PairVolumeNovelty: volume novelty score of the paper based on pair of concepts (see paper)\r\n\r\n* mesh_scores.tsv - Temporal profiles for each MeSH term for all years. The file contains 1,102,831 rows and 5 columns, as follow:\r\n\t- MeshTerm: Name of the MeSH term\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH term in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH term in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH term in the given year\r\n\r\n* meshpair_scores.txt.gz (36 GB uncompressed) - Temporal profiles for each MeSH term for all years\r\n\t- Mesh1: Name of the first MeSH term (alphabetically sorted)\r\n\t- Mesh2: Name of the second MeSH term (alphabetically sorted)\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH pair in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH pair in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH pair in the given year\r\n\r\n* README.txt file\r\n\r\n## Dataset creation\r\n\r\nThis dataset was constructed using multiple datasets described in the following locations:\r\n* MEDLINE 2015 baseline: < a herf=\"https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html\">https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html</a>\r\n* MeSH tree 2015: <a href=\"ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/\">ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/</a>\r\n* Source code provided at: <a 
href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n\r\n\r\nNote: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck <a href=\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">here </a>for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions:\r\n\r\nAdditional data related updates can be found at: <a href=\"http://abel.ischool.illinois.edu\">Torvik Research Group</a>\r\n\r\n## Acknowledgments\r\n\r\nThis work was made possible in part with funding to VIT from <a href=\"https://projectreporter.nih.gov/project_info_description.cfm?aid=8475017&icde=18058490\">NIH grant P01AG039347 </a> and <a href=\"http://www.nsf.gov/awardsearch/showAward?AWD_ID=1348742\">NSF grant 1348742 </a>. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\r\n\r\n## License\r\n\r\nConceptual novelty analysis data based on PubMed Medical Subject Headings by Shubhanshu Mishra, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License.\r\nPermissions beyond the scope of this license may be available at <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n"]} 2018-04-27T17:26:37Z
create: {"material_type"=>"Article", "availability"=>nil, "link"=>"https://doi.org/10.1045/september2016-mishra", "uri"=>"10.1045/september2016-mishra", "uri_type"=>"DOI", "citation"=>"Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : The Magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra", "dataset_id"=>528, "selected_type"=>"Article", "datacite_list"=>"IsSupplementTo"} 2018-04-27T17:25:53Z
update: {"description"=>["Conceptual novelty analysis data based on PubMed Medical Subject Headings\r\n----------------------------------------------------------------------\r\nCreated by Shubhanshu Mishra, and Vetle I. Torvik on April 16th, 2018\r\n\r\n\r\n## Introduction\r\n\r\nThis is a dataset created as part of the publication titled: Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : the magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra.\r\nIt contains final data generated as part of our experiments based on MEDLINE 2015 baseline and MeSH tree from 2015.\r\nThe dataset is distributed in the form of the following tab separated text files: \r\n\r\n* PubMed2015_NoveltyData.tsv - Novelty scores for each paper in PubMed. The columns are as follows:\r\n\t- PMID: PubMed ID\r\n\t- Year: year of publication\r\n\t- TimeNovelty: time novelty score of the paper based on individual concepts (see paper)\r\n\t- VolumeNovelty: volume novelty score of the paper based on individual concepts (see paper)\r\n\t- PairTimeNovelty: time novelty score of the paper based on pair of concepts (see paper)\r\n\t- PairVolumeNovelty: volume novelty score of the paper based on pair of concepts (see paper)\r\n\r\n* mesh_scores.tsv - Temporal profiles for each MeSH term for all years\r\n\t- MeshTerm: Name of the MeSH term\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH term in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH term in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH term in the given year\r\n\r\n* meshpair_scores.txt.gz (36 GB uncompressed) - Temporal profiles for each MeSH term for all years\r\n\t- Mesh1: Name of the first MeSH term (alphabetically sorted)\r\n\t- Mesh2: Name of the second MeSH term (alphabetically sorted)\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH pair in the 
given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH pair in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH pair in the given year\r\n\r\n* README.txt file\r\n\r\n## Dataset creation\r\n\r\nThis dataset was constructed using multiple datasets described in the following locations:\r\n* MEDLINE 2015 baseline: https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html\r\n* MeSH tree 2015: ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/\r\n* Source code provided at: https://github.com/napsternxg/Novelty\r\n\r\n\r\nNote: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions:\r\nhttps://www.nlm.nih.gov/databases/download/pubmed_medline.html\r\n\r\n\r\nAdditional data related updates can be found at: http://abel.ischool.illinois.edu\r\n\r\n## Acknowledgments\r\n\r\nThis work was made possible in part with funding to VIT from NIH grant P01AG039347 (https://projectreporter.nih.gov/project_info_description.cfm?aid=8475017&icde=18058490) and NSF grant 1348742 (http://www.nsf.gov/awardsearch/showAward?AWD_ID=1348742). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\r\n\r\n## License\r\n\r\nConceptual novelty analysis data based on PubMed Medical Subject Headings by Shubhanshu Mishra, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License.\r\nPermissions beyond the scope of this license may be available at https://github.com/napsternxg/Novelty\r\n", "Conceptual novelty analysis data based on PubMed Medical Subject Headings\r\n----------------------------------------------------------------------\r\nCreated by Shubhanshu Mishra, and Vetle I. 
Torvik on April 16th, 2018\r\n\r\n## Introduction\r\n\r\nThis is a dataset created as part of the publication titled: Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : the magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra.\r\nIt contains final data generated as part of our experiments based on MEDLINE 2015 baseline and MeSH tree from 2015.\r\nThe dataset is distributed in the form of the following tab separated text files: \r\n\r\n* PubMed2015_NoveltyData.tsv - Novelty scores for each paper in PubMed. The file contains 22,349,417 rows and 6 columns, as follow:\r\n\t- PMID: PubMed ID\r\n\t- Year: year of publication\r\n\t- TimeNovelty: time novelty score of the paper based on individual concepts (see paper)\r\n\t- VolumeNovelty: volume novelty score of the paper based on individual concepts (see paper)\r\n\t- PairTimeNovelty: time novelty score of the paper based on pair of concepts (see paper)\r\n\t- PairVolumeNovelty: volume novelty score of the paper based on pair of concepts (see paper)\r\n\r\n* mesh_scores.tsv - Temporal profiles for each MeSH term for all years. 
The file contains 1,102,831 rows and 5 columns, as follow:\r\n\t- MeshTerm: Name of the MeSH term\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH term in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH term in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH term in the given year\r\n\r\n* meshpair_scores.txt.gz (36 GB uncompressed) - Temporal profiles for each MeSH term for all years\r\n\t- Mesh1: Name of the first MeSH term (alphabetically sorted)\r\n\t- Mesh2: Name of the second MeSH term (alphabetically sorted)\r\n\t- Year: year\r\n\t- AbsVal: Total publications with that MeSH pair in the given year\r\n\t- TimeNovelty: age (in years since first publication) of MeSH pair in the given year\r\n\t- VolumeNovelty: : age (in number of papers since first publication) of MeSH pair in the given year\r\n\r\n* README.txt file\r\n\r\n## Dataset creation\r\n\r\nThis dataset was constructed using multiple datasets described in the following locations:\r\n* MEDLINE 2015 baseline: < a herf=\"https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html>https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html</a>\r\n* MeSH tree 2015: <a href=\"ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/\">ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/</a>\r\n* Source code provided at: <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n\r\n\r\nNote: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck <a href=\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">here </a>for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions:\r\n\r\nAdditional data related updates can be found at: <a href=\"http://abel.ischool.illinois.edu\">Torvik Research Group</a>\r\n\r\n## Acknowledgments\r\n\r\nThis work 
was made possible in part with funding to VIT from <a href=\"https://projectreporter.nih.gov/project_info_description.cfm?aid=8475017&icde=18058490\">NIH grant P01AG039347 </a> and <a href=\"http://www.nsf.gov/awardsearch/showAward?AWD_ID=1348742\">NSF grant 1348742 </a>. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\r\n\r\n## License\r\n\r\nConceptual novelty analysis data based on PubMed Medical Subject Headings by Shubhanshu Mishra, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License.\r\nPermissions beyond the scope of this license may be available at <a href=\"https://github.com/napsternxg/Novelty\">https://github.com/napsternxg/Novelty</a>\r\n"], "version_comment"=>[nil, ""]} 2018-04-27T17:25:53Z
update: {"release_date"=>[nil, Mon, 23 Apr 2018], "publication_state"=>["draft", "released"], "identifier"=>["", "10.13012/B2IDB-5060298_V1"]} 2018-04-23T20:58:23Z
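
The description recorded in the log entries above lists six tab-separated columns for PubMed2015_NoveltyData.tsv (PMID, Year, TimeNovelty, VolumeNovelty, PairTimeNovelty, PairVolumeNovelty). A minimal Python sketch of streaming that file row by row might look like the following. The function name `iter_novelty_rows`, the `has_header` flag, and the sample values are hypothetical and do not come from the dataset. Whether the real file starts with a header row is not stated in the log, so that is left as a parameter, and values are kept as strings because the numeric format of the scores is not documented here.

```python
import csv
import io

# Column names as listed in the dataset description.
NOVELTY_COLUMNS = [
    "PMID",
    "Year",
    "TimeNovelty",
    "VolumeNovelty",
    "PairTimeNovelty",
    "PairVolumeNovelty",
]


def iter_novelty_rows(handle, has_header=True):
    """Yield one dict per row from a tab-separated novelty file.

    `has_header` is an assumption: set it to False if the file
    turns out to have no header line.
    """
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header line if present
    for fields in reader:
        if len(fields) != len(NOVELTY_COLUMNS):
            continue  # skip blank or malformed lines
        yield dict(zip(NOVELTY_COLUMNS, fields))


# Made-up sample data, only to show the call pattern.
sample = io.StringIO(
    "PMID\tYear\tTimeNovelty\tVolumeNovelty\tPairTimeNovelty\tPairVolumeNovelty\n"
    "123\t2001\t0.5\t1.0\t0.2\t3.0\n"
)
rows = list(iter_novelty_rows(sample))
```

For the real file, the handle would come from something like `open("PubMed2015_NoveltyData.tsv", newline="")`. Iterating lazily keeps memory use flat, which matters for a file the description says has 22,349,417 rows.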