Hype - PubMed dataset
Dataset Description
This dataset captures 'hype' in biomedical abstracts sourced from PubMed. The selection comprises journal articles written in English and published between 1975 and 2019, about 5.2 million in total. Classification relies on the presence of specific candidate 'hype words' and their location within the abstract. Each article (PMID) may therefore appear multiple times in the dataset when several hype words occur in different abstract sentences. There are 35 candidate hype words: 'major', 'novel', 'central', 'critical', 'essential', 'strongly', 'unique', 'promising', 'markedly', 'excellent', 'crucial', 'robust', 'importantly', 'prominent', 'dramatically', 'favorable', 'vital', 'surprisingly', 'remarkably', 'remarkable', 'definitive', 'pivotal', 'innovative', 'supportive', 'encouraging', 'unprecedented', 'enormous', 'exceptional', 'outstanding', 'noteworthy', 'creative', 'assuring', 'reassuring', 'spectacular', and 'hopeful'. This is version 2 of the dataset. Changes include: added the "Year" variable.
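The word-and-sentence matching described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the sentence splitter is deliberately naive, and the function names, the output tuple layout `(pmid, word, sentence_number)`, and the example PMID are all assumptions made for illustration.

```python
import re

# The 35 candidate hype words listed in the dataset description.
HYPE_WORDS = [
    "major", "novel", "central", "critical", "essential", "strongly",
    "unique", "promising", "markedly", "excellent", "crucial", "robust",
    "importantly", "prominent", "dramatically", "favorable", "vital",
    "surprisingly", "remarkably", "remarkable", "definitive", "pivotal",
    "innovative", "supportive", "encouraging", "unprecedented", "enormous",
    "exceptional", "outstanding", "noteworthy", "creative", "assuring",
    "reassuring", "spectacular", "hopeful",
]

# Whole-word, case-insensitive match, so "majority" does not count as "major".
_HYPE_RE = re.compile(r"\b(" + "|".join(HYPE_WORDS) + r")\b", re.IGNORECASE)


def split_sentences(abstract):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]


def find_hype_instances(pmid, abstract):
    """Return one (pmid, word, sentence_number) tuple per hype word found."""
    rows = []
    for number, sentence in enumerate(split_sentences(abstract), start=1):
        for match in _HYPE_RE.finditer(sentence):
            rows.append((pmid, match.group(1).lower(), number))
    return rows


# Hypothetical PMID and abstract text, for illustration only.
rows = find_hype_instances(
    "12345",
    "We report a novel method. The results are remarkably robust. Nothing here.",
)
```

Because each match yields its own row, a single PMID produces several rows when an abstract contains several hype words, which mirrors the one-article-to-many-instances structure the description mentions.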
File 1: hype_dataset_final.tsv. Primary dataset, with the following columns:
1. PMID: the unique article ID in PubMed
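As a rough sketch of how the primary file might be inspected, the snippet below counts rows per PMID in a tab-separated file using only the Python standard library. It assumes the file has a header row containing a `PMID` column; the `HypeWord` column in the in-memory sample is hypothetical and stands in for whatever other columns the real file provides.

```python
import csv
import io
from collections import Counter


def count_instances_per_pmid(tsv_file):
    """Count how many rows each PMID has in a tab-separated file object.

    Assumes the first line is a header that includes a 'PMID' column.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return Counter(row["PMID"] for row in reader)


# Tiny in-memory stand-in for the real file; 'HypeWord' is a hypothetical column.
sample = io.StringIO("PMID\tHypeWord\n111\tnovel\n111\trobust\n222\tmajor\n")
counts = count_instances_per_pmid(sample)
```

For the actual file, the same function could be called on `open("hype_dataset_final.tsv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded.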
File 2: hype_removed_phrases_final.tsv. Secondary dataset with the same columns as File 1.
1. Major: histocompatibility, component, protein, metabolite, complex, surgery

| Field | Value |
|---|---|
| Subject | Social Sciences |
| Keywords | Hype; PubMed; Abstracts; Biomedicine |
| License | CC BY |
| Corresponding Creator | Apratim Mishra |
| Downloaded | 175 times |

| Version | DOI | Comment | Publication Date |
|---|---|---|---|
| 3 | 10.13012/B2IDB-0651259_V3 | Include new data | 2025-03-14 |
| 2 | 10.13012/B2IDB-0651259_V2 | The dataset was modified due to revision requirements from the journal of submission. | 2025-01-29 |
| 1 | 10.13012/B2IDB-0651259_V1 | | 2024-03-09 |