TextTransfer: Datasets for Impact Detection
Dataset Description
Impact assessment is an evolving area of research that aims to measure and predict the potential effects of projects or programs. Measuring the impact of scientific research is an active subdomain closely intertwined with impact assessment. A recurring obstacle is the lack of an efficient framework for analyzing lengthy reports and labeling text. To address this, we propose a framework for automatically assessing the impact of scientific research projects by identifying the sections of project reports that indicate potential impacts. We use a mixed-method approach, combining manual annotation with supervised machine learning, to extract these passages from project reports. This repository stores the datasets and code related to this project. If you use the data, please read and cite the following paper:
Maria Becker, Kanyao Han, Antonina Werthmann, Rezvaneh Rezapour, Haejin Lee, and Jana Diesner. 2024. Detecting Impact Relevant Sections in Scientific Research. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 4744–4749, Torino, Italy. ELRA and ICCL. https://aclanthology.org/2024.lrec-main.424
This folder contains the following files:
Data processing code is available at: https://github.com/khan1792/texttransfer
| Field | Value |
|---|---|
| Subject | Social Sciences |
| Keywords | impact detection; project reports; annotation; mixed-methods; machine learning |
| License | CC0 |
| Funder | German Federal Ministry of Education and Research (Grant 01IO1634) |
| Corresponding Creator | Kanyao Han |
| Downloaded | 459 times |
| Version | DOI | Comment | Publication Date |
|---|---|---|---|
| 1 | 10.13012/B2IDB-9934303_V1 | | 2024-03-21 |
Contact the Research Data Service for help interpreting this log.
| RelatedMaterial | create: {"material_type"=>"Conference paper", "availability"=>nil, "link"=>"https://aclanthology.org/2024.lrec-main.424", "uri"=>"https://aclanthology.org/2024.lrec-main.424", "uri_type"=>"URL", "citation"=>"Maria Becker, Kanyao Han, Antonina Werthmann, Rezvaneh Rezapour, Haejin Lee, and Jana Diesner. 2024. Detecting Impact Relevant Sections in Scientific Research. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 4744–4749, Torino, Italy. ELRA and ICCL.", "dataset_id"=>2664, "selected_type"=>"Other", "datacite_list"=>"IsSupplementTo", "note"=>nil, "feature"=>nil} | 2024-05-20T18:12:28Z |
| RelatedMaterial | update: {"uri"=>["", "https://github.com/khan1792/texttransfer"], "uri_type"=>["", "URL"], "citation"=>["", "https://github.com/khan1792/texttransfer"], "datacite_list"=>["", "IsSupplementTo"]} | 2024-05-20T18:12:28Z |
| RelatedMaterial | update: {"uri"=>[nil, ""], "uri_type"=>[nil, ""], "datacite_list"=>[nil, ""], "note"=>[nil, ""], "feature"=>[nil, false]} | 2024-03-22T14:10:01Z |
| Dataset | update: {"subject"=>["", "Social Sciences"]} | 2024-03-22T14:10:01Z |
| RelatedMaterial | create: {"material_type"=>"Code", "availability"=>nil, "link"=>"https://github.com/khan1792/texttransfer", "uri"=>nil, "uri_type"=>nil, "citation"=>"", "dataset_id"=>2664, "selected_type"=>"Code", "datacite_list"=>nil, "note"=>nil, "feature"=>nil} | 2024-03-22T00:07:31Z |
| Funder | update: {"name"=>["Leibniz Institute for the German Language", "German Federal Ministry of Education and Research"], "grant"=>["", "01IO1634"]} | 2024-03-22T00:07:31Z |
| Dataset | update: {"keywords"=>["impact detection, project reports, annotation, mixed-methods, machine learning", "impact detection; project reports; annotation; mixed-methods; machine learning"], "version_comment"=>[nil, ""], "subject"=>[nil, ""]} | 2024-03-21T20:01:32Z |