To generate the bibliographic and survey data to support a data reuse study conducted by several Library faculty and accepted for publication in the Journal of Academic Librarianship, the project team utilized a series of web-based online scripts that employed several different endpoints from the Scopus API. The related dataset: "Data for: An Examination of Data Reuse Practices within Highly Cited Articles of Faculty at a Research University" contains survey design and results.
1) getScopus_API_process_dmp_IDB.asp: used the search API query the Scopus database API for papers by UIUC authors published in 2015 -- limited to one of 9 pre-defined Scopus subject areas -- and retrieve metadata results sorted highest to lowest by the number of times the retrieved articles were cited. The URL for the basic searches took the following form: https://api.elsevier.com/content/search/scopus?query=(AFFIL%28(urbana%20OR%20champaign) AND univ*%29) OR (AF-ID(60000745) OR AF-ID(60005290))&apikey=xxxxxx&start=" & nstart & "&count=25&date=2015&view=COMPLETE&sort=citedby-count&subj=PHYS
Here, the variable nstart was incremented by 25 each iteration and 25 records were retrieved in each pass. The subject area was renamed (e.g. from PHYS to COMP for computer science) in each of the 9 runs. This script does not use the Scopus API cursor but downloads 25 records at a time for up to 28 times -- or 675 maximum bibliographic records. The project team felt that looking at the most 675 cited articles from UIUC faculty in each of the 9 subject areas was sufficient to gather a robust, representative sample of articles from 2015. These downloaded records were stored in a temporary table that was renamed for each of the 9 subject areas.
2) get_citing_from_surveys_IDB.asp: takes a Scopus article ID (eid) from the 49 UIUC author returned surveys and retrieves short citing article references, 200 at a time, into a temporary composite table. These citing records contain only one author, no author affiliations, and no author email addresses. This script uses the Scopus API cursor=* feature and is able to download all the citing references of an article 200 records at a time.
3) put_in_all_authors_affil_IDB.asp: adds important data to the short citing records. The script adds all co-authors and their affiliations, the corresponding author, and author email addresses.
4) process_for_final_IDB.asp: creates a relational database table with author, title, and source journal information for each of the citing articles that can be copied as an Excel file for processing by the Qualtrics survey software. This was initially 4,626 citing articles over the 49 UIUC authored articles, but was reduced to 2,041 entries after checking for available email addresses and eliminating duplicates.