Data for "Modeling the Global Citation Network using the Scalable Agent-based Simulator for Citation Analysis with Recency-emphasized Sampling (SASCA-ReS)"
Dataset Description |
This dataset principally consists of four synthetic citation networks that were generated during the preparation of the manuscript: Park M, Yi H, Warnow T, and Chacko G (2025). Modeling the Global Citation Network using the Scalable Agent-based Simulator for Citation Analysis with Recency-emphasized Sampling (SASCA-ReS). A preprint is available on Zenodo (https://zenodo.org/records/17772113), and the manuscript has been submitted to the MetaRoR platform for review and feedback. The networks contain roughly 14, 76, 161, and 218 million nodes, respectively. For each network, the nodelist (with attributes) and the edgelist are provided as gzipped parquet files, along with a copy of the configuration file that was passed to the SASCA-ReS software to generate it; for example: abm14_config.ini, abm14_edgelist.parquet.gz, and abm14_nodelist.parquet.gz. The software can be accessed at: https://github.com/illinois-or-research-analytics/SASCA-ReS. The column headers in the edgelists and nodelists and the fields in the configuration file are explained in the GitHub repository for SASCA-ReS. In addition, we provide sj_reccount, a table of real-world citation frequencies that is an input to the SASCA-ReS software. The first column (diff) of sj_reccount lists the difference between the publication year of a citing document and the publication year of a cited document. The second column (count) reports the frequency of such citations across the dataset of 77,879,427 observations, which is derived from the biomedical literature. Finally, we share data, composite_maverick_disruption.csv, from the mavericks (unconventional citing strategies) experiment reported in the Park et al. (2025) manuscript available at https://zenodo.org/records/17772113. The columns in the composite_maverick_disruption.csv file are: node_id -> of agents in the various simulations
|
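The description above mentions gzipped parquet files and a (diff, count) frequency table. The following is a minimal Python sketch of two steps a reader might take with them: decompressing a `.parquet.gz` file to disk, and converting (diff, count) pairs into relative frequencies. The helper names `gunzip` and `relative_frequencies` are hypothetical and are not part of SASCA-ReS, the on-disk format of sj_reccount is not stated here, and the toy numbers are made up for illustration.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dst=None):
    """Decompress a .gz file to disk and return the output path."""
    src = Path(src)
    # Default output name drops the trailing ".gz"
    # (e.g. abm14_edgelist.parquet.gz -> abm14_edgelist.parquet).
    dst = Path(dst) if dst is not None else src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # streams in chunks rather than loading the whole file
    return dst


def relative_frequencies(rows):
    """Turn (diff, count) pairs into a {diff: share of all citations} mapping."""
    rows = list(rows)
    total = sum(count for _, count in rows)
    if total == 0:
        raise ValueError("counts sum to zero")
    return {diff: count / total for diff, count in rows}


if __name__ == "__main__":
    # Toy values for illustration only; these are NOT taken from sj_reccount.
    toy = [(0, 10), (1, 30), (2, 60)]
    print(relative_frequencies(toy))
```

Once decompressed, a file such as abm14_edgelist.parquet can likely be read with `pandas.read_parquet`, assuming pandas and a parquet engine such as pyarrow are installed. The larger networks may not fit in memory on a typical workstation, and the column names should be taken from the SASCA-ReS GitHub repository rather than guessed.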
Keywords |
synthetic networks; agent based models; SASCA-ReS; citation networks |
License |
CC BY |
Corresponding Creator |
George Chacko |
| Version | DOI | Comment | Publication Date |
|---|---|---|---|
| 1 | 10.13012/B2IDB-9265079_V1 | | 2025-12-01 |