Illinois Data Bank - Dataset

Version | DOI | Comment | Publication Date
1 | 10.13012/B2IDB-4354331_V1 | | 2018-04-19

2.98 KB File
3.48 GB File

Contact the Research Data Service for help interpreting this log.

RelatedMaterial create: {"material_type"=>"Article", "availability"=>nil, "link"=>"https://pnigel.com/papers/jiang-inpress-BB4ZDSLK.pdf", "uri"=>"https://pnigel.com/papers/jiang-inpress-BB4ZDSLK.pdf", "uri_type"=>"URL", "citation"=>"Jiang, X., Bosch, N., & Torvik, V.I. 2024. Training a Geographic Entity Recognizer on Biomedical Abstracts with the Aid of Embeddings, Metadata, and Linked Data. ", "dataset_id"=>527, "selected_type"=>"Article", "datacite_list"=>"IsSupplementTo", "note"=>nil, "feature"=>nil} 2024-11-19T20:09:59Z
RelatedMaterial update: {"note"=>[nil, ""]} 2024-11-19T20:09:59Z
RelatedMaterial update: {"note"=>[nil, ""]} 2024-11-19T20:09:59Z
RelatedMaterial update: {"note"=>[nil, ""]} 2024-11-19T20:09:59Z
RelatedMaterial update: {"note"=>[nil, ""]} 2024-11-19T20:09:59Z
RelatedMaterial update: {"note"=>[nil, ""]} 2024-11-19T20:09:59Z
Dataset update: {"publisher"=>["University of Illinois at Urbana-Champaign", "University of Illinois Urbana-Champaign"]} 2024-11-19T20:09:59Z
RelatedMaterial update: {"link"=>["ttps://doi.org/10.1145/ 3383583.3398618", "https://doi.org/10.1145/3383583.3398618"]} 2020-08-10T16:52:57Z
RelatedMaterial create: {"material_type"=>"Article", "availability"=>nil, "link"=>"ttps://doi.org/10.1145/ 3383583.3398618", "uri"=>"10.1145/ 3383583.3398618", "uri_type"=>"DOI", "citation"=>"Xiaoliang Jiang, and Vetle Torvik. 2020. On the Ambiguity and Relevance\r\nof Place Names in Scientific. In Proceedings of JCDL 2020: ACM/IEEE-CS\r\nJoint Conference on Digital Libraries (JCDL '20), August 1-5, 2020, Wuhan,\r\nHuBei, China. ACM, New York, NY, USA. https://doi.org/10.1145/\r\n3383583.3398618", "dataset_id"=>527, "selected_type"=>"Article", "datacite_list"=>"IsSupplementTo"} 2020-08-10T16:52:03Z
RelatedMaterial create: {"material_type"=>"Article", "availability"=>nil, "link"=>"https://arxiv.org/ftp/arxiv/papers/2005/2005.04308.pdf", "uri"=>"https://arxiv.org/ftp/arxiv/papers/2005/2005.04308.pdf", "uri_type"=>"URL", "citation"=>"Xu, J., Kim, S., Song, M., Jeong, M., Kim, D., Kang, J., Rousseau, J.F., Li, X., Xu, W., Torvik, V.I., Bu, Y., Chen, C., Ebeid, I.A., Li, D., & Ding, Y. (2020). Building a PubMed knowledge graph. ArXiv, abs/2005.04308.", "dataset_id"=>527, "selected_type"=>"Article", "datacite_list"=>"IsCitedBy"} 2020-05-18T22:11:17Z
RelatedMaterial create: {"material_type"=>"Article", "availability"=>nil, "link"=>"https://doi.org/10.1371/journal.pone.0195773", "uri"=>"10.1371/journal.pone.0195773", "uri_type"=>"DOI", "citation"=>"Mishra S, Fegley BD, Diesner J, Torvik VI (2018) Self-citation is the hallmark of productive authors, of any gender. PLoS ONE 13(9): e0195773. https://doi.org/10.1371/journal.pone.0195773", "dataset_id"=>527, "selected_type"=>"Article", "datacite_list"=>"IsSupplementTo"} 2018-09-29T15:46:19Z
RelatedMaterial update: {"datacite_list"=>["IsSupplementedBy ", "IsSupplementTo"]} 2018-09-29T15:46:19Z
Dataset update: {"description"=>["MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. Prepared by Vetle Torvik April 5, 2018\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\n&bull; How was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data <a href =\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">Terms and Conditions</a> \r\n\r\n&bull; Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).\r\n\r\n&bull; Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\n&bull; All affiliation strings where processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\n<i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>\r\n\r\n&bull; Look for <a href=\"https://doi.org/10.1186/s41182-017-0073-6\">Fig. 4</a> in the following article for coverage statistics over time:\r\n<i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.</i>\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\n&bull; The code and back-end data is periodically updated and made available for query by PMID at <a href=\"http://abel.ischool.illinois.edu/\">Torvik Research Group</a>\r\n\r\n&bull; What is the format of the dataset?\r\nThe dataset contains 37,406,692 rows. Each row (line) in the file has a unique PMID and author postition (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited. All columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could inlude further subvisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. journal\r\n11. lat: at most 3 decimals (only available when city is not a country or state)\r\n12. lon: at most 3 decimals (only available when city is not a country or state)\r\n13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n", "MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. 
Prepared by Vetle Torvik 2018-04-05\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\n&bull; How was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data <a href =\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">Terms and Conditions</a> \r\n\r\n&bull; Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).\r\n\r\n&bull; Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\n&bull; All affiliation strings where processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\n<i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>\r\n\r\n&bull; Look for <a href=\"https://doi.org/10.1186/s41182-017-0073-6\">Fig. 4</a> in the following article for coverage statistics over time:\r\n<i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.</i>\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\n&bull; The code and back-end data is periodically updated and made available for query by PMID at <a href=\"http://abel.ischool.illinois.edu/\">Torvik Research Group</a>\r\n\r\n&bull; What is the format of the dataset?\r\nThe dataset contains 37,406,692 rows. Each row (line) in the file has a unique PMID and author postition (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited. All columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could inlude further subvisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. journal\r\n11. lat: at most 3 decimals (only available when city is not a country or state)\r\n12. lon: at most 3 decimals (only available when city is not a country or state)\r\n13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n"]} 2018-04-23T19:35:04Z
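The file format described in the entry above (tab-delimited, Latin-1 encoded, thirteen columns, with unresolved city ambiguities joined by '|') can be read with a short Python sketch like the one below. This is only an illustration, not the dataset authors' code. The column identifiers, the helper names parse_line and read_mapaffil, the absence of a header row, and the treatment of missing lat/lon values as empty fields are all assumptions that the description does not confirm.

```python
from typing import Iterator, Optional

# Column order follows the thirteen-column list in the dataset description.
# The identifiers themselves (e.g. "year", "type") are chosen for illustration.
COLUMNS = ["PMID", "au_order", "lastname", "firstname", "year", "type",
           "city", "state", "country", "journal", "lat", "lon", "fips"]


def _to_float(value: str) -> Optional[float]:
    # Assumption: a missing coordinate is stored as an empty field.
    return float(value) if value.strip() else None


def parse_line(line: str) -> dict:
    """Split one tab-delimited line into a dict keyed by COLUMNS."""
    # Strip only the line terminator so trailing empty fields are preserved.
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    row["PMID"] = int(row["PMID"])
    row["au_order"] = int(row["au_order"])
    # Unresolved ambiguities are concatenated by '|' in the city column.
    row["city_candidates"] = row["city"].split("|")
    row["lat"] = _to_float(row["lat"])
    row["lon"] = _to_float(row["lon"])
    return row


def read_mapaffil(path: str) -> Iterator[dict]:
    """Stream rows lazily; the file is described as about 3.5GB uncompressed."""
    # Assumption: the file has no header row.
    with open(path, encoding="latin-1", newline="") as handle:
        for line in handle:
            yield parse_line(line)


# Made-up sample line for demonstration only; not taken from the dataset.
sample = ("10786286\t3\tDoe\tJane\t2000\tEDU\tUrbana, IL, USA\tIL\tUSA\t"
          "Example J\t40.11\t-88.207\t17019\n")
record = parse_line(sample)
print(record["PMID"], record["au_order"], record["city_candidates"])
```

Reading line by line keeps memory use flat regardless of file size. If the file turns out to have a header row or a different marker for missing values, parse_line would need adjusting.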
Dataset update: {"description"=>["MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. Prepared by Vetle Torvik April 5, 2018\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\n&bull; How was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data <a href =\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">Terms and Conditions</a> \r\n\r\n&bull; Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).\r\n\r\n&bull; Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\n&bull; All affiliation strings where processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\n<i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>\r\n\r\n&bull; Look for <a href=\"https://doi.org/10.1186/s41182-017-0073-6\">Fig. 4</a> in the following article for coverage statistics over time:\r\n<i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.</i>\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\n&bull; The code and back-end data is periodically updated and made available for query by PMID at <a href=\"http://abel.ischool.illinois.edu/\">Torvik Research Group</a>\r\n\r\n&bull; What is the format of the dataset?\r\nThe dataset contains 37,406,692 rows. Each row (line) in the file has a unique PMID and author postition (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited.\r\nAll columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could inlude further subvisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. journal\r\n11. lat: at most 3 decimals (only available when city is not a country or state)\r\n12. lon: at most 3 decimals (only available when city is not a country or state)\r\n13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n", "MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. 
Prepared by Vetle Torvik April 5, 2018\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\n&bull; How was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data <a href =\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">Terms and Conditions</a> \r\n\r\n&bull; Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).\r\n\r\n&bull; Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\n&bull; All affiliation strings where processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\n<i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>\r\n\r\n&bull; Look for <a href=\"https://doi.org/10.1186/s41182-017-0073-6\">Fig. 4</a> in the following article for coverage statistics over time:\r\n<i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.</i>\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\n&bull; The code and back-end data is periodically updated and made available for query by PMID at <a href=\"http://abel.ischool.illinois.edu/\">Torvik Research Group</a>\r\n\r\n&bull; What is the format of the dataset?\r\nThe dataset contains 37,406,692 rows. Each row (line) in the file has a unique PMID and author postition (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited. All columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could inlude further subvisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. journal\r\n11. lat: at most 3 decimals (only available when city is not a country or state)\r\n12. lon: at most 3 decimals (only available when city is not a country or state)\r\n13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n"]} 2018-04-23T18:41:57Z
Dataset update: {"description"=>["MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. Prepared by Vetle Torvik April 5, 2018\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\n&bull; How was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data <a href =\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">Terms and Conditions</a> \r\n\r\n&bull; Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).\r\n\r\n&bull; Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\n&bull; All affiliation strings where processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\n<i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>\r\n\r\n&bull; Look for <a href=\"https://doi.org/10.1186/s41182-017-0073-6\">Fig. 4</a> in the following article for coverage statistics over time:\r\n<i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.</i>\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\n&bull; The code and back-end data is periodically updated and made available for query by PMID at <a href=\"http://abel.ischool.illinois.edu/\">Torvik Research Group</a>\r\n\r\n&bull; What is the format of the dataset?\r\nEach row (line) in the file has a unique PMID and author postition (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited.\r\nAll columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could inlude further subvisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. journal\r\n11. lat: at most 3 decimals (only available when city is not a country or state)\r\n12. lon: at most 3 decimals (only available when city is not a country or state)\r\n13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n", "MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. 
Prepared by Vetle Torvik April 5, 2018\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\n&bull; How was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data <a href =\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">Terms and Conditions</a> \r\n\r\n&bull; Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).\r\n\r\n&bull; Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\n&bull; All affiliation strings where processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\n<i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>\r\n\r\n&bull; Look for <a href=\"https://doi.org/10.1186/s41182-017-0073-6\">Fig. 4</a> in the following article for coverage statistics over time:\r\n<i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.</i>\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\n&bull; The code and back-end data is periodically updated and made available for query by PMID at <a href=\"http://abel.ischool.illinois.edu/\">Torvik Research Group</a>\r\n\r\n&bull; What is the format of the dataset?\r\nThe dataset contains 37,406,692 rows. Each row (line) in the file has a unique PMID and author postition (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited.\r\nAll columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could inlude further subvisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. journal\r\n11. lat: at most 3 decimals (only available when city is not a country or state)\r\n12. lon: at most 3 decimals (only available when city is not a country or state)\r\n13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n"]} 2018-04-23T18:41:27Z
Dataset update: {"description"=>["MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. Prepared by Vetle Torvik April 5, 2018\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\n&bull; How was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data <a href =\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">Terms and Conditions</a> \r\n\r\n&bull; Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).\r\n\r\n&bull; Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\n&bull; All affiliation strings where processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\n<i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>\r\n\r\n&bull; Look for <a href=\"https://doi.org/10.1186/s41182-017-0073-6\">Fig. 4</a> in the following article for coverage statistics over time:\r\n<i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.</i>\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\n&bull; The code and back-end data is periodically updated and made available for query by PMID at <a href=\"http://abel.ischool.illinois.edu/\">Torvik Research Group</a>\r\n\r\n&bull; What is the format of the dataset?\r\nEach row (line) in the file has a unique PMID and author postition (e.g., 10786286_3 is the third author name on PMID 10786286), and the following twelve columns, tab-delimited.\r\nAll columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could inlude further subvisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. journal\r\n11. lat: at most 3 decimals (only available when city is not a country or state)\r\n12. lon: at most 3 decimals (only available when city is not a country or state)\r\n13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n", "MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. 
Prepared by Vetle Torvik April 5, 2018\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\n&bull; How was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data <a href =\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">Terms and Conditions</a> \r\n\r\n&bull; Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).\r\n\r\n&bull; Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\n&bull; All affiliation strings where processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\n<i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>\r\n\r\n&bull; Look for <a href=\"https://doi.org/10.1186/s41182-017-0073-6\">Fig. 4</a> in the following article for coverage statistics over time:\r\n<i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.</i>\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\n&bull; The code and back-end data is periodically updated and made available for query by PMID at <a href=\"http://abel.ischool.illinois.edu/\">Torvik Research Group</a>\r\n\r\n&bull; What is the format of the dataset?\r\nEach row (line) in the file has a unique PMID and author postition (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited.\r\nAll columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could inlude further subvisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. journal\r\n11. lat: at most 3 decimals (only available when city is not a country or state)\r\n12. lon: at most 3 decimals (only available when city is not a country or state)\r\n13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n"]} 2018-04-23T18:24:46Z
RelatedMaterial create: {"material_type"=>"Article", "availability"=>nil, "link"=>"https://doi.org/10.1186/s41182-017-0073-6", "uri"=>"10.1186/s41182-017-0073-6", "uri_type"=>"DOI", "citation"=>"Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 2017 Dec;45(1):33", "dataset_id"=>527, "selected_type"=>"Article", "datacite_list"=>"IsSupplementedBy "} 2018-04-23T18:23:30Z
RelatedMaterial create: {"material_type"=>"Article", "availability"=>nil, "link"=>"https://doi.org/10.1045/november2015-torvik", "uri"=>"10.1045/november2015-torvik", "uri_type"=>"DOI", "citation"=>"Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p", "dataset_id"=>527, "selected_type"=>"Article", "datacite_list"=>"IsSupplementedBy"} 2018-04-23T18:23:30Z
Dataset update: {"description"=>["MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide\r\nprepared by Vetle Torvik April 5, 2018\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\nHow was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions:\r\nhttps://www.nlm.nih.gov/databases/download/pubmed_medline.html\r\n\r\nAffiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).\r\n\r\nAffiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\nAll affiliation strings where processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\nTorvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p\r\n\r\nLook for Fig. 4 in the following article for coverage statistics over time:\r\nPalmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\nThe code and back-end data are periodically updated and made available for query by PMID here\r\nhttp://abel.ischool.illinois.edu\r\n\r\nWhat is the format of the dataset?\r\nEach row (line) in the file has a unique PMID and author position (e.g., 10786286_3 is the third author name on PMID 10786286), and the following twelve columns, tab-delimited.\r\nAll columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could include further subdivisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. lat: at most 3 decimals (only available when city is not a country or state)\r\n11. lon: at most 3 decimals (only available when city is not a country or state)\r\n12. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n", "MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. 
Prepared by Vetle Torvik April 5, 2018\r\n\r\nThe dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.\r\n\r\n&bull; How was the dataset created?\r\nThe dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.\r\nCheck here for information on obtaining PubMed/MEDLINE, and NLM's data <a href=\"https://www.nlm.nih.gov/databases/download/pubmed_medline.html\">Terms and Conditions</a> \r\n\r\n&bull; Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only.\r\nHowever, MapAffil 2016 also covers some PubMed records that lack affiliations, using affiliations harvested from elsewhere: PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g., 5833220).\r\n\r\n&bull; Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and HTML) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.\r\n\r\n&bull; All affiliation strings were processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in:\r\n<i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>\r\n\r\n&bull; Look for <a href=\"https://doi.org/10.1186/s41182-017-0073-6\">Fig. 4</a> in the following article for coverage statistics over time:\r\n<i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 
2017 Dec;45(1):33.</i>\r\nExpect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.\r\n\r\n&bull; The code and back-end data are periodically updated and made available for query by PMID at <a href=\"http://abel.ischool.illinois.edu/\">Torvik Research Group</a>\r\n\r\n&bull; What is the format of the dataset?\r\nEach row (line) in the file has a unique PMID and author position (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited.\r\nAll columns are ASCII, except city which contains Latin-1.\r\n\r\n1. PMID: positive non-zero integer; int(10) unsigned\r\n2. au_order: positive non-zero integer; smallint(4)\r\n3. lastname: varchar(80)\r\n4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed\r\n5. year of publication:\r\n6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK\r\n7. city: varchar(200); typically 'city, state, country' but could include further subdivisions; unresolved ambiguities are concatenated by '|'\r\n8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)\r\n9. country\r\n10. journal\r\n11. lat: at most 3 decimals (only available when city is not a country or state)\r\n12. lon: at most 3 decimals (only available when city is not a country or state)\r\n13. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find\r\n"], "keywords"=>["", "PubMed, MEDLINE, Digital Libraries, Bibliographic Databases; Author Affiliations; Geographic Indexing; Place Name Ambiguity; Geoparsing; Geocoding; Toponym Extraction; Toponym Resolution"], "version_comment"=>[nil, ""], "subject"=>["", "Social Sciences"]} 2018-04-23T18:23:30Z
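Note: the updated description in the entry above lists the columns of the tab-delimited, Latin-1 encoded file (the version with the journal column). A minimal Python sketch of reading that layout might look like the following. The names COLUMNS, parse_line, and read_mapaffil are hypothetical, and the sketch assumes things the description does not state: that the file has no header row, that missing lat/lon values appear as empty or non-numeric fields, and that the columns occur in the listed order.

```python
from typing import Dict, Iterator, Optional

# Column names follow the order given in the dataset description;
# the identifiers themselves are illustrative, not part of the dataset.
COLUMNS = [
    "pmid", "au_order", "lastname", "firstname", "year", "type",
    "city", "state", "country", "journal", "lat", "lon", "fips",
]


def _to_float(value: str) -> Optional[float]:
    """Return a float, or None when the field is empty or not numeric (assumed)."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_line(line: str) -> Dict[str, object]:
    """Split one tab-delimited row into a dict keyed by column name."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row: Dict[str, object] = dict(zip(COLUMNS, fields))
    row["pmid"] = int(fields[0])
    row["au_order"] = int(fields[1])
    row["lat"] = _to_float(fields[10])
    row["lon"] = _to_float(fields[11])
    # Unresolved place-name ambiguities are concatenated by '|' in the city column.
    row["city_candidates"] = fields[6].split("|")
    return row


def read_mapaffil(path: str) -> Iterator[Dict[str, object]]:
    """Stream rows from the file; Latin-1 is the encoding stated in the description."""
    with open(path, encoding="latin-1", newline="") as handle:
        for line in handle:
            yield parse_line(line)
```

Streaming line by line avoids loading the roughly 3.5GB file into memory; a caller would iterate over read_mapaffil with the path to the downloaded file.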