Displaying 1 - 25 of 726 in total
Illinois Data Bank Dataset Search Results

Dataset Search Results

published: 2025-02-08
 
The synthetic networks in this dataset were generated using the RECCS protocol developed by Anne et al. (2024). Briefly, the RECCS process is as follows. An input network and clustering (by any algorithm) is used to pass input parameters to a stochastic block model (SBM) generator. The output is then modified to improve fit to the input real world clusters after which outlier nodes are added using one of three different options. See Anne et al. (2024): in press Complex Networks and Applications XIII (preprint : arXiv:2408.13647). The networks in this dataset were generated using either version 1 or version 2 of the RECCS protocol followed by outlier strategy S1. The input networks to the process were (i) the Curated Exosome Network (CEN), Wedell et al. (2021), (ii) cit_hepph (https://snap.stanford.edu/), (iii) cit_patents (https://snap.stanford.edu/), and (iv) wiki_topcats (https://snap.stanford.edu/). Input Networks: The CEN can be downloaded from the Illinois Data Bank: https://databank.illinois.edu/datasets/IDB-0908742 -> cen_pipeline.tar.gz -> S1_cen_cleaned.tsv The synthetic file naming system should be interpreted as follows: a_b_c.tsv.gz where a - name of inspirational network, e.g., cit_hepph b - the resolution value used when clustering a with the Leiden algorithm optimizing the Constant Potts Model, e.g., 0.01 c- the RECCS option used to approximate edge count and connectivity in the real world network, e.g., v1 Thus, cit_hepph_0.01_v1.tsv indicates that this network was modeled on the cit_hepph network and RECCSv1 was used to match edge count and connectivity to a Leiden-CPM 0.01 clustering of cit_hepph. For SBM generation, we used the graph_tool software (P. Peixoto, Tiago 2014. The graph-tool python library. figshare. Dataset. https://doi.org/10.6084/m9.figshare.1164194.v14) Additionally, this dataset contains synthetic networks generated for a replication experiment (repl_exp.tar.gz). 
The experiment aims to evaluate the consistency of RECCS-generated networks by producing multiple replicates under controlled conditions. These networks were generated using different configurations of RECCS, varying across two versions (v1 and v2), and applying the Connectivity Modifier (CM++, Ramavarapu et al. (2024)) pre-processing. Please note that the CM pipeline used for this experiment filters small clusters both before and after the CM treatment. Input Network: CEN
Within repl_exp.tar.gz, the synthetic file naming system should be interpreted as follows: cen_<resolution>_<cm_status>_<reccs_version>_sample_<replicate_id>.tsv where:
cen – Indicates the network was modeled on the Curated Exosome Network (CEN).
resolution – The resolution parameter used in clustering the input network with Leiden-CPM (0.01).
cm_status – Either cm (CM-treated input clustering) or no_cm (input clustering without CM treatment).
reccs_version – The RECCS version used to generate the synthetic network (v1 or v2).
replicate_id – The specific replicate (ranging from 0 to 2 for each configuration).
For example:
cen_0.01_cm_v1_sample_0.tsv – A synthetic network based on CEN with Leiden-CPM clustering at resolution 0.01, CM-treated input, and generated using RECCSv1 (first replicate).
cen_0.01_no_cm_v2_sample_1.tsv – A synthetic network based on CEN with Leiden-CPM clustering at resolution 0.01, without CM treatment, and generated using RECCSv2 (second replicate).
The ground truth clustering input to RECCS is contained in repl_exp_groundtruths.tar.gz.
keywords: Community Detection; Synthetic Networks; Stochastic Block Model (SBM);
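The two file naming schemes described above can be decoded mechanically. The following is a minimal sketch, not part of the dataset: the names `ReccsName` and `parse_reccs_name` are hypothetical, and the patterns assume that file names follow the documented examples exactly (with an optional `.gz` suffix).

```python
import re
from typing import NamedTuple, Optional


class ReccsName(NamedTuple):
    """Fields encoded in a synthetic-network file name (hypothetical helper type)."""
    network: str                 # inspirational network, e.g. "cit_hepph" or "cen"
    resolution: float            # Leiden-CPM resolution value, e.g. 0.01
    cm_status: Optional[str]     # "cm" or "no_cm"; None for main-dataset files
    reccs_version: str           # "v1" or "v2"
    replicate_id: Optional[int]  # replicate number; None for main-dataset files


# Main files follow a_b_c.tsv(.gz), e.g. cit_hepph_0.01_v1.tsv
_MAIN = re.compile(
    r"^(?P<network>.+)_(?P<resolution>\d+(?:\.\d+)?)_(?P<version>v[12])\.tsv(?:\.gz)?$"
)
# Replication files, e.g. cen_0.01_no_cm_v2_sample_1.tsv
_REPL = re.compile(
    r"^(?P<network>.+?)_(?P<resolution>\d+(?:\.\d+)?)_(?P<cm>no_cm|cm)"
    r"_(?P<version>v[12])_sample_(?P<rep>\d+)\.tsv(?:\.gz)?$"
)


def parse_reccs_name(filename: str) -> ReccsName:
    """Split a file name into its documented components."""
    m = _REPL.match(filename)
    if m:
        return ReccsName(m["network"], float(m["resolution"]), m["cm"],
                         m["version"], int(m["rep"]))
    m = _MAIN.match(filename)
    if m:
        return ReccsName(m["network"], float(m["resolution"]), None,
                         m["version"], None)
    raise ValueError(f"unrecognized file name: {filename!r}")


print(parse_reccs_name("cit_hepph_0.01_v1.tsv"))
print(parse_reccs_name("cen_0.01_no_cm_v2_sample_1.tsv"))
```

The replication pattern is tried first because it is the more specific of the two; a name that matches neither raises an error rather than being guessed at.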
published: 2025-02-07
 
This dataset contains raw plasma glucose, insulin, c-peptide, GLP-1, and FGF21 data collected as part of a study of alcohol pharmacokinetics in women who underwent metabolic surgery.
keywords: Excel; Alcohol and metabolic surgery; glucose; insulin; c-peptide; glp-1; fgf21
published: 2025-02-07
 
These data represent the raw data from the paper “Influence of light availability and water depth on competition between Phalaris arundinacea and herbaceous vines” published in Wetlands by Annie H. Huang and Jeffrey W. Matthews. The data are archived in one file: Huang&Matthews_mesocosm_data_archive. This file includes raw data collected during a greenhouse experiment described in the paper.
published: 2025-02-07
 
Incoherent scatter radar datasets collected during the September 2016 campaign at Arecibo have been deposited in this databank. The lag products of the ISR data are stored as lag profile matrices with 5 minutes of integration time. The data is organized in a Python dictionary format, with each file containing 12 lag profile matrices representing one hour of observation. A sample Python script is provided to illustrate its usage.
published: 2025-02-06
 
Data from a study on the behavior of blue-winged and golden-winged warblers. We investigated vocalizations and how the species recognize each other. The dataset includes banding data, behavioral data from a playback study, and song data.
keywords: warblers; songs; species recognition
published: 2025-02-03
 
The data and code provided in this dataset can be used to generate plots that show the results of linear prediction algorithm and the amplified modes, supporting the key argument of the manuscript. It is divided into five subfolders, each corresponding to one combination of external condition (magnetic field B, temperature), scan parameter (temperature, magnetic field B), pump laser polarization (linear s, linear p, and circular), and sample orientation ( B parallel to c axis, B perpendicular to c axis): 1) B parallel to c axis, linear pump polarization in s, linear THz emission polarization in s, field dependence (B_parallel_c_linear_spump_sprobe_field). 2) B parallel to c axis, linear pump polarization in s, linear THz emission polarization in s, temperature dependence (B_parallel_c_linear_spump_sprobe_temperature). 3) B perpendicular to c axis, linear pump polarization in s, linear THz emission polarization in s, field dependence (B_perp_c_linear_spump_sprobe_field). 4) B perpendicular to c axis, linear pump polarization in s, linear THz emission polarization in s, temperature dependence (B_perp_c_linear_spump_sprobe_temperature). 5) B parallel to c axis, circular pump polarization (left circularly polarized LCP and right circularly polarized RCP), linear THz emission polarization in s, field dependence (B_parallel_c_LCPRCP_pump_sprobe_field). Each folder contains the raw data (.mat), the oscillator parameters obtained through linear prediction algorithm (.mat), and the plot-generating code (.m). The code plots the raw data, the fit to the processed data, and the amplified modes. Codes are written in MATLAB R2024a; the working directory of each code should be the corresponding subfolder that contains it.
keywords: magneto-chiral instability; THz emission; THz spectroscopy; nonequilibrium states; emergent phenomena; Weyl semiconductor; tellurium; ultrafast spectroscopy; photoexcitation
published: 2021-05-07
 
Prepared by Vetle Torvik 2021-05-07 The dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters).
• How was the dataset created? The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018). Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only. However, MapAffil 2018 covers some PubMed records that lack affiliations; these affiliations were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g., 5833220). Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records. All affiliation strings were processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in: Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p
• Look for Fig. 4 in the following article for coverage statistics over time: Palmblad, M., Torvik, V.I. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Trop Med Health 45, 33 (2017). https://doi.org/10.1186/s41182-017-0073-6 Expect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.
• The code and back-end data are periodically updated and made available for query by PMID at http://abel.ischool.illinois.edu/cgi-bin/mapaffil/search.py
• What is the format of the dataset? The dataset contains 52,931,957 rows (plus a header row). Each row (line) in the file has a unique PMID and author order, and contains the following eighteen columns, tab-delimited.
All columns are ASCII, except city, which contains Latin-1.
1. PMID: positive non-zero integer; int(10) unsigned
2. au_order: positive non-zero integer; smallint(4)
3. lastname: varchar(80)
4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed
5. initial_2: middle name initial
6. orcid: From 2019 ORCID Public Data File https://orcid.org/ and from PubMed XML
7. year: year of the publication
8. journal: name of the journal in which the publication is published
9. affiliation: author's affiliation
10. disciplines: extracted from departments, divisions, schools, laboratories, centers, etc. that occur on at least 100 unique affiliations across the dataset, some with standardization (e.g., 1770799), English translations (e.g., 2314876), or spelling corrections (e.g., 1291843)
11. grid: inferred using a high-recall technique focused on educational institutions (but, for experimental purposes, includes a few select hospitals, national institutes/centers, international companies, governmental agencies, and 200+ other IDs [RINGGOLD, Wikidata, ISNI, VIAF, http] for institutions not in GRID). Based on 2019 GRID version https://www.grid.ac/
12. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK
13. city: varchar(200); typically 'city, state, country' but could include further subdivisions; unresolved ambiguities are concatenated by '|'
14. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)
15. country
16. lat: at most 3 decimals (only available when city is not a country or state)
17. lon: at most 3 decimals (only available when city is not a country or state)
18. fips: varchar(5); for USA only; retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find
keywords: PubMed, MEDLINE, Digital Libraries, Bibliographic Databases; Author Affiliations; Geographic Indexing; Place Name Ambiguity; Geoparsing; Geocoding; Toponym Extraction; Toponym Resolution; institution name disambiguation
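As an illustration of the format described above, a tab-delimited Latin-1 file of this shape could be read as sketched below. This is only a sketch under assumptions: the function name `read_mapaffil` and the file name in the comment are hypothetical, the header row is assumed to use the column names listed above, and missing lat/lon values are assumed to be empty strings.

```python
import csv
import io
from typing import Dict, Iterator, Optional


def _to_float(value: str) -> Optional[float]:
    """Convert a lat/lon cell to float; empty cells (assumed missing) become None."""
    value = value.strip()
    return float(value) if value else None


def read_mapaffil(handle) -> Iterator[Dict[str, object]]:
    """Yield one dict per (PMID, au_order) row from an open text handle."""
    # QUOTE_NONE: treat quote characters inside affiliation strings as plain text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Unresolved place-name ambiguities are concatenated with '|'.
        row["city_candidates"] = row["city"].split("|") if row["city"] else []
        row["lat"] = _to_float(row["lat"])
        row["lon"] = _to_float(row["lon"])
        yield row


# With the real file, open it with the documented encoding, for example:
#   with open("mapaffil2018.tsv", encoding="latin-1", newline="") as fh:  # hypothetical name
#       for row in read_mapaffil(fh): ...
demo = io.StringIO(
    "PMID\tau_order\tcity\tlat\tlon\n"
    "1\t1\tUrbana, IL, USA\t40.110\t-88.207\n"
)
for record in read_mapaffil(demo):
    print(record["PMID"], record["city_candidates"], record["lat"], record["lon"])
```

Streaming rows one at a time keeps memory use flat, which matters for a file with tens of millions of rows.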
published: 2021-04-22
 
Author-ity 2018 dataset Prepared by Vetle Torvik Apr. 22, 2021 The dataset is based on a snapshot of PubMed taken in December 2018 (NLMs baseline 2018 plus updates throughout 2018). A total of 29.1 million Article records and 114.2 million author name instances. Each instance of an author name is uniquely represented by the PMID and the position on the paper (e.g., 10786286_3 is the third author name on PMID 10786286). Thus, each cluster is represented by a collection of author name instances. The instances were first grouped into "blocks" by last name and first name initial (including some close variants), and then each block was separately subjected to clustering. The resulting clusters are provided in two different formats, the first in a file with only IDs and PMIDs, and the second in a file with cluster summaries: #################### File 1: au2id2018.tsv #################### Each line corresponds to an author name instance (PMID and Author name position) with an Author ID. It has the following tab-delimited fields: 1. Author ID 2. PMID 3. Author name position ######################## File 2: authority2018.tsv ######################### Each line corresponds to a predicted author-individual represented by cluster of author name instances and a summary of all the corresponding papers and author name variants. Each cluster has a unique Author ID (the PMID of the earliest paper in the cluster and the author name position). The summary has the following tab-delimited fields: 1. Author ID (or cluster ID) e.g., 3797874_1 represents a cluster where 3797874_1 is the earliest author name instance. 2. cluster size (number of author name instances on papers) 3. name variants separated by '|' with counts in parenthesis. Each variant of the format lastname_firstname middleinitial, suffix 4. last name variants separated by '|' 5. first name variants separated by '|' 6. middle initial variants separated by '|' ('-' if none) 7. 
suffix variants separated by '|' ('-' if none) 8. email addresses separated by '|' ('-' if none) 9. ORCIDs separated by '|' ('-' if none). From 2019 ORCID Public Data File https://orcid.org/ and from PubMed XML 10. range of years (e.g., 1997-2009) 11. Top 20 most frequent affiliation words (after stoplisting and tokenizing; some phrases are also made) with counts in parenthesis; separated by '|'; ('-' if none) 12. Top 20 most frequent MeSH (after stoplisting) with counts in parenthesis; separated by '|'; ('-' if none) 13. Journal names with counts in parenthesis (separated by '|'), 14. Top 20 most frequent title words (after stoplisting and tokenizing) with counts in parenthesis; separated by '|'; ('-' if none) 15. Co-author names (lowercased lastname and first/middle initials) with counts in parenthesis; separated by '|'; ('-' if none) 16. Author name instances (PMID_auno separated by '|') 17. Grant IDs (after normalization; '-' if none given; separated by '|'), 18. Total number of times cited. (Citations are based on references harvested from open sources such as PMC). 19. h-index 20. Citation counts (e.g., for h-index): PMIDs by the author that have been cited (with total citation counts in parenthesis); separated by '|'
keywords: author name disambiguation; PubMed
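Several fields above share two conventions: author name instances are written as PMID_position, and summary fields list items with counts in parentheses, separated by '|', with '-' meaning none. A minimal sketch of decoding both follows. The function names are hypothetical, and the exact spacing around the parentheses in the real files is an assumption.

```python
import re
from typing import Dict, Tuple

# An item followed by a count in parentheses, e.g. "smith_john a(12)".
_COUNTED = re.compile(r"^(?P<label>.*?)\s*\((?P<count>\d+)\)$")


def parse_instance_id(instance_id: str) -> Tuple[int, int]:
    """Split an author name instance such as '10786286_3' into (PMID, position)."""
    pmid, position = instance_id.rsplit("_", 1)
    return int(pmid), int(position)


def parse_counted_field(field: str) -> Dict[str, int]:
    """Turn 'a(3)|b(1)' into {'a': 3, 'b': 1}; '-' or an empty field gives {}."""
    if field.strip() in ("", "-"):
        return {}
    counts: Dict[str, int] = {}
    for item in field.split("|"):
        match = _COUNTED.match(item.strip())
        if match is None:
            raise ValueError(f"item without a trailing count: {item!r}")
        counts[match["label"]] = int(match["count"])
    return counts


print(parse_instance_id("10786286_3"))
# The name variants below are made-up illustrations, not values from the dataset.
print(parse_counted_field("smith_john a(12)|smith_j(3)"))
```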
published: 2024-10-10
 
Diversity - PubMed dataset Contact: Apratim Mishra (Oct, 2024) This dataset presents article-level (pmid) and author-level (auid) diversity data for PubMed articles. The chosen selection includes articles retrieved from Authority 2018 [1], 907 024 papers, and 1 316 838 authors, and is an expanded dataset of V1. The sample of articles consists of the top 40 journals in the dataset, limited to 2-12 authors published between 1991 – 2014, which are article type "journal type" written in English. Files are 'gzip' compressed and separated by tab space, and V3 includes the correct author count for the included papers (pmids) and updated results with no NaNs. ################################################ File1: auids_plos_3.csv.gz (Important columns defined, 5 in total) • AUID: a unique ID for each author • Genni: gender prediction • Ethnea: ethnicity prediction ################################################# File2: pmids_plos_3.csv.gz (Important columns defined) • pmid: unique paper • auid: all unique auids (author-name unique identification) • year: Year of paper publication • no_authors: Author count • journal: Journal name • years: first year of publication for every author • Country-temporal: Country of affiliation for every author • h_index: Journal h-index • TimeNovelty: Paper Time novelty [2] • nih_funded: Binary variable indicating funding for any author • prior_cit_mean: Mean of all authors’ prior citation rate • Insti_impact: All unique institutions’ citation rate • mesh_vals: Top MeSH values for every author of that paper • relative_citation_ratio: RCR The ‘Readme’ includes a description for all columns. [1] Torvik, Vetle; Smalheiser, Neil (2021): Author-ity 2018 - PubMed author name disambiguated dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2273402_V1 [2] Mishra, Shubhanshu; Torvik, Vetle I. (2018): Conceptual novelty scores for PubMed articles. University of Illinois at Urbana-Champaign. 
https://doi.org/10.13012/B2IDB-5060298_V1
keywords: Diversity; PubMed; Citation
published: 2023-02-23
 
Coups d'État are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup d'État Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e. realized or successful coups, unrealized coup attempts, or thwarted conspiracies) the type of actor(s) who initiated the coup (i.e. military, rebels, etc.), as well as the fate of the deposed leader. This current version, Version 2.1.2, adds 6 additional coup events that occurred in 2022 and updates the coding of an attempted coup event in Kazakhstan in January 2022. Version 2.1.1 corrects a mistake in version 2.1.0, where the designation of “dissident coup” had been dropped in error for coup_id: 00201062021. Version 2.1.1 fixes this omission by marking the case as both a dissident coup and an auto-coup. Version 2.1.0 added 36 cases to the data set and removes two cases from the v2.0.0 data. This update also added actor coding for 46 coup events and adds executive outcomes to 18 events from version 2.0.0. A few other changes were made to correct inconsistencies in the coup ID variable and the date of the event. Changes from the previously released data (v2.0.0) also include: 1. Adding additional events and expanding the period covered to 1945-2022 2. Filling in missing actor information 3. Filling in missing information on the outcomes for the incumbent executive 4. 
Dropping events that were incorrectly coded as coup events <br> <b>Items in this Dataset</b> 1. <i>Cline Center Coup d'État Codebook v.2.1.2 Codebook.pdf</i> - This 16-page document provides a description of the Cline Center Coup d’État Project Dataset. The first section of this codebook provides a summary of the different versions of the data. The second section provides a succinct definition of a coup d’état used by the Coup d’État Project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The third section describes the methodology used to produce the data. <i>Revised February 2023</i> 2. <i>Coup Data v2.1.2.csv</i> - This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup d’État Project. It contains 29 variables and 981 observations. <i>Revised February 2023</i> 3. <i>Source Document v2.1.2.pdf</i> - This 315-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify that particular event. <i>Revised February 2023</i> 4. <i>README.md</i> - This file contains useful information for the user about the dataset. It is a text file written in markdown language. <i>Revised February 2023</i> <br> <b> Citation Guidelines</b> 1. To cite the codebook (or any other documentation associated with the Cline Center Coup d’État Project Dataset) please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Scott Althaus. 2023. “Cline Center Coup d’État Project Dataset Codebook”. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.2. February 23. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V6 2. 
To cite data from the Cline Center Coup d’État Project Dataset please use the following citation (filling in the correct date of access): Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Emilio Soto. 2023. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.2. February 23. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V6
suppressed by curator
 
published: 2024-09-28
 
Per the authors' request, the data files for this dataset are now suppressed. Please visit this new dataset for the complete and updated data files: Huang, Yijing; Fahad, Mahmood (2025): Data for Observation of a Magneto-chiral Instability in Photoexcited Tellurium. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-1409842_V1
====================
The data and code provided in this dataset can be used to generate key plots in the manuscript. It is divided into four subfolders (B parallel/perpendicular to the tellurium c axis and field/temperature dependence), each containing the raw data (saved in .mat format), the oscillator parameters obtained through linear prediction (saved in .mat format), and the plot-generating code (.m files). The code was written using MATLAB R2024a. To run the code, go to each folder and run the .m file in that folder, which generates two plots.
published: 2025-01-31
 
Title: Airyscan confocal superresolution images of extant Malvaceae pollen with a focus on Bombacoideae
Authors: Surangi W. Punyasena, Ingrid Romero, Michael A. Urban
Subject: Biological sciences
Keywords: Malvaceae; superresolution microscopy; Zeiss; Bombacacidites; Neotropics; CZI
Funder: NSF-DBI Advances in Bioinformatics (NSF-DBI-1262561)
Corresponding Creator: Surangi W. Punyasena
This dataset includes a total of 430 images of extant specimens of the Malvaceae, with a focus on species that are or have been included within the subfamily Bombacoideae. There are 27 genera included within 26 folders. Each folder is named by genus and contains all the images that correspond to that genus. Note that the genus _Matisia_ is included with _Quararibea_ as detailed in the metadata READ ME file. The specimens imaged are from the palynological collections of the Swedish Museum of Natural History and Smithsonian Tropical Research Institute, and herbarium specimens from the Smithsonian Herbarium National Museum. The optical superresolution microscopy images were taken using a Zeiss LSM 880 with Airyscan at 630X magnification (63x/NA 1.4 oil DIC). The images are in the original CZI file format. They can be opened using Zeiss proprietary software (Zen, Zen lite) or in ImageJ/FIJI. More information on how to open CZI files can be found here: https://www.zeiss.com/microscopy/en/products/software/zeiss-zen/czi-image-file-format.html
Image metadata and file organization are described in the CSV file "METADATA_Malvaceae_Bombacoideae_modern-species.csv". The column headings are:
Folder - The folder in which the image file is found
Subfamily - The current subfamily determination based on the literature. Note that _Pentaplaris_ and _Septotheca_ have not been assigned a subfamily.
Genus - Genus name
Species - Species name
Accepted name - Accepted species name, updated from the literature
Slide name - Species name as denoted on the herbarium slide
Collection - Source of the herbarium slide: Sweden National Museum of Natural History or the Smithsonian Tropical Research Institute
File name - File name using the species name denoted on the herbarium slide
Slide ID/Herbarium ID - Specimen collection number
Please cite this dataset as: Punyasena, Surangi W.; Romero, Ingrid; Urban, Michael A. (2025): Airyscan confocal superresolution images of extant Malvaceae pollen with a focus on Bombacoideae. University of Illinois Urbana-Champaign. https://doi.org/10.13012/B2IDB-2968712_V1
keywords: Malvaceae; superresolution microscopy; Zeiss; Bombacoideae; Neotropics; CZI
published: 2025-01-30
 
Coups d'État are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup d’État Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e., realized, unrealized, or conspiracy), the type of actor(s) who initiated the coup (i.e., military, rebels, etc.), as well as the fate of the deposed leader. Version 2.2.0 adds 94 additional coup events. 66 of these came from examining Powell and Thyne’s “discarded” events and 28 of these events were added to the data set in the normal annual review of potential new coup events. This version also updates the coding of events in Brazil in 1945 and the Congo in 1968. Version 2.1.3 adds 19 additional coup events to the data set, corrects the date of a coup in Tunisia, and reclassifies an attempted coup in Brazil in December 2022 as a conspiracy. Version 2.1.2 added 6 additional coup events that occurred in 2022 and updated the coding of an attempted coup event in Kazakhstan in January 2022. Version 2.1.1 corrected a mistake in version 2.1.0, where the designation of “dissident coup” had been dropped in error for coup_id: 00201062021. Version 2.1.1 fixed this omission by marking the case as both a dissident coup and an auto-coup. Version 2.1.0 added 36 cases to the data set and removed two cases from the v2.0.0 data. This update also added actor coding for 46 coup events and added executive outcomes to 18 events from version 2.0.0. 
A few other changes were made to correct inconsistencies in the coup ID variable and the date of the event. Version 2.0.0 improved several aspects of the previous version (v1.0.0) and incorporated additional source material to include:
• Reconciling missing event data
• Removing events with irreconcilable event dates
• Removing events with insufficient sourcing (each event needs at least two sources)
• Removing events that were inaccurately coded as coup events
• Removing variables that fell below the threshold of inter-coder reliability required by the project
• Removing the spreadsheet ‘CoupInventory.xls’ because of inadequate attribution and citations in the event summaries
• Extending the period covered from 1945-2005 to 1945-2019
• Adding events from Powell and Thyne’s Coup Data (Powell and Thyne, 2011)
Version 1.0.0 was released in 2013. This version consolidated coup data taken from the following sources:
• The Center for Systemic Peace (Marshall and Marshall, 2007)
• The World Handbook of Political and Social Indicators (Taylor and Jodice, 1983)
• Coup d’État: A Practical Handbook (Luttwak, 1979)
• The Cline Center’s Social, Political and Economic Event Database (SPEED) Project (Nardulli, Althaus and Hayes, 2015)
• Government Change in Authoritarian Regimes – 2010 Update (Svolik and Akcinaroglu, 2006)
<br> <b>Items in this Dataset</b> 1. <i>Cline Center Coup d'État Codebook v.2.2.0 Codebook.pdf</i> - This 17-page document describes the Cline Center Coup d’État Project dataset. The first section of this codebook provides a summary of the different versions of the data. The second section provides a succinct definition of a coup d’état used by the Coup d'État Project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The third section describes the methodology used to produce the data. <i>Revised January 2025</i> 2. 
<i>Coup Data v2.2.0.csv</i> - This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup d’État Project. It contains 29 variables and 1094 observations. <i>Revised January 2025</i> 3. <i>Source Document v2.2.0.pdf</i> - This 347-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify that particular event. <i>Revised January 2025</i> 4. <i>README.md</i> - This file contains useful information for the user about the dataset. It is a text file written in markdown language. <i>Revised January 2025</i> <br> <b> Citation Guidelines</b> 1. To cite the codebook (or any other documentation associated with the Cline Center Coup d’État Project Dataset) please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Scott Althaus. 2025. “Cline Center Coup d’État Project Dataset Codebook”. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.2.0. January 30. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V8 2. To cite data from the Cline Center Coup d’État Project Dataset please use the following citation (filling in the correct date of access): Peyton, Buddy, Joseph Bajjalieh, Michael Martin, Sam Alahi, Norah Fadell, and Maddie Jeralds. 2025. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.2.0. January 30. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V8
published: 2025-01-30
 
This is the research data for the manuscript "A Framework of Simulating Structural Sediment Perimeter Barriers using VFSMOD."
keywords: sediment control
published: 2025-01-29
 
These data record weekly aphid and monarch butterfly (Danaus plexippus) neonate counts on individual milkweed plants in multiple raised garden beds in Chicago during the summers of 2023 and 2024. Relationships between aphid infestation and monarch neonates can be investigated, along with weekly trends in monarch oviposition and aphid abundance. All gardens included in this study were on the University of Illinois Chicago campus and within 100 meters of each other. Data are provided on three milkweed species in 2023 and one milkweed species in 2024.
keywords: Aphis; Myzocallis; Danaus plexippus; urban gardens; Asclepias syriaca; milkweeds
published: 2020-05-13
 
Terrorism is among the most pressing challenges to democratic governance around the world. The Responsible Terrorism Coverage (or ResTeCo) project aims to address a fundamental dilemma facing 21st century societies: how to give citizens the information they need without giving terrorists the kind of attention they want. The ResTeCo hopes to inform best practices by using extreme-scale text analytic methods to extract information from more than 70 years of terrorism-related media coverage from around the world and across 5 languages. Our goal is to expand the available data on media responses to terrorism and enable the development of empirically-validated models for socially responsible, effective news organizations. This particular dataset contains information extracted from terrorism-related stories in the New York Times published between 1945 and 2018. It includes variables that measure the relative share of terrorism-related topics, the valence and intensity of emotional language, as well as the people, places, and organizations mentioned. This dataset contains 3 files: 1. <i>"ResTeCo Project NYT Dataset Variable Descriptions.pdf"</i> <ul> <li>A detailed codebook containing a summary of the Responsible Terrorism Coverage (ResTeCo) Project New York Times (NYT) Dataset and descriptions of all variables. </li> </ul> 2. <i>"resteco-nyt.csv"</i> <ul><li>This file contains the data extracted from terrorism-related media coverage in the New York Times between 1945 and 2018. It includes variables that measure the relative share of topics, sentiment, and emotion present in this coverage. There are also variables that contain metadata and list the people, places, and organizations mentioned in these articles. There are 53 variables and 438,373 observations. The variable "id" uniquely identifies each observation. Each observation represents a single news article. </li> <li> <b>Please note</b> that care should be taken when using "resteco-nyt.csv". 
The file may not be suitable for use in a spreadsheet program such as Excel because some of the values are very large. Excel cannot handle some of these large values, which may cause the data to appear corrupted within the software. Users are encouraged to work with a statistical package or programming language such as Stata, R, or Python to ensure that the structure and quality of the data are preserved.</li> </ul> 3. <i>"README.md"</i> <ul><li>This file contains useful information for the user about the dataset. It is a text file written in Markdown.</li> </ul> <b>Citation Guidelines</b> 1) To cite this codebook please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project New York Times (NYT) Dataset Variable Descriptions. Responsible Terrorism Coverage (ResTeCo) Project New York Times Dataset. Cline Center for Advanced Social Research. May 13. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-4638196_V1 2) To cite the data please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project New York Times Dataset. Cline Center for Advanced Social Research. May 13. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-4638196_V1
keywords: Terrorism, Text Analytics, News Coverage, Topic Modeling, Sentiment Analysis
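The caution about opening "resteco-nyt.csv" in a spreadsheet program can be illustrated with a minimal Python sketch. It reads every value as text, so large values are not rounded or reformatted on load, and it checks that "id" is unique, as the description states. Only the "id" column name comes from the description; the function name, the sample rows, and the "example_value" column are hypothetical.

```python
import csv
import io


def read_articles(handle):
    """Index rows by the "id" column, keeping every value as text."""
    articles = {}
    for row in csv.DictReader(handle):
        article_id = row["id"]
        if article_id in articles:
            # The description says "id" uniquely identifies each observation.
            raise ValueError(f"duplicate id: {article_id}")
        articles[article_id] = row
    return articles


# Hypothetical two-row sample standing in for the real file.
sample = io.StringIO(
    "id,example_value\n"
    "12345678901234567890,0.25\n"
    "12345678901234567891,0.75\n"
)
articles = read_articles(sample)
```

For the real file, the same function could be called on `open("resteco-nyt.csv", newline="", encoding="utf-8")`; the encoding is an assumption and should be checked against the README.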
published: 2025-01-29
 
Hype - PubMed dataset
Prepared by Apratim Mishra
This dataset captures ‘Hype’ within biomedical abstracts sourced from PubMed. The selection chosen is ‘journal articles’ written in English, published between 1975 and 2019, totaling ~5.2 million. The classification relies on the presence of specific candidate ‘hype words’ and their abstract location. Therefore, each article (PMID) might have multiple instances in the dataset due to the presence of multiple hype words in different abstract sentences.
The candidate hype words are 35 in count: 'major', 'novel', 'central', 'critical', 'essential', 'strongly', 'unique', 'promising', 'markedly', 'excellent', 'crucial', 'robust', 'importantly', 'prominent', 'dramatically', 'favorable', 'vital', 'surprisingly', 'remarkably', 'remarkable', 'definitive', 'pivotal', 'innovative', 'supportive', 'encouraging', 'unprecedented', 'enormous', 'exceptional', 'outstanding', 'noteworthy', 'creative', 'assuring', 'reassuring', 'spectacular', and 'hopeful'.
This is version 2 of the dataset. Changes include: Added “Year” variable. Removed “Abstract length” variable. Modified variable information due to updated probabilistic model of hype. Number of hype words - 35 (updated from 36 based on revised findings).
File 1: hype_dataset_final.tsv
Primary dataset. It has the following columns:
1. PMID: represents unique article ID in PubMed
2. Year: Year of publication
3. Hype_word: Candidate hype word, such as ‘novel.’
4. Sentence: Sentence in abstract containing the hype word.
5. Hype_percentile: Abstract relative position of hype word.
6. Hype_value: Propensity of hype based on the hype word, the sentence, and the abstract location.
7. Introduction: The ‘I’ component of the hype word based on IMRaD
8. Methods: The ‘M’ component of the hype word based on IMRaD
9. Results: The ‘R’ component of the hype word based on IMRaD
10. Discussion: The ‘D’ component of the hype word based on IMRaD
File 2: hype_removed_phrases_final.tsv
Secondary dataset with same columns as File 1. Hype in the primary dataset is based on excluding certain phrases that are rarely hype. The phrases that were removed are included in File 2 and modeled separately.
Removed phrases:
1. Major: histocompatibility, component, protein, metabolite, complex, surgery
2. Novel: assay, mutation, antagonist, inhibitor, algorithm, technique, series, method, hybrid
3. Central: catheters, system, design, composite, catheter, pressure, thickness, compartment
4. Critical: compartment, micelle, temperature, incident, solution, ischemia, concentration, thinking, nurses, skills, analysis, review, appraisal, evaluation, values
5. Essential: medium, features, properties, opportunities, oil
6. Unique: model, amino
7. Robust: regression
8. Vital: capacity, signs, organs, status, structures, staining, rates, cells, information
9. Outstanding: questions, issues, question, questions, challenge, problems, problem, remains
10. Remarkable: properties
11. Definite: radiotherapy, surgery
keywords: Hype; PubMed; Abstracts; Biomedicine
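Because one article (PMID) can appear in several rows of the hype dataset, a common first step is to group rows by PMID. The sketch below assumes that the TSV header names match the column names listed in the description ("PMID", "Hype_word") and that fields are tab-separated with default quoting. The function name and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def hype_words_by_pmid(handle):
    """Collect the hype words recorded for each PMID, in file order."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        grouped[row["PMID"]].append(row["Hype_word"])
    return dict(grouped)


# Hypothetical sample with a subset of the documented columns.
sample = io.StringIO(
    "PMID\tYear\tHype_word\n"
    "111\t1999\tnovel\n"
    "111\t1999\trobust\n"
    "222\t2005\tmajor\n"
)
grouped = hype_words_by_pmid(sample)
```

If sentences in the real file contain unbalanced double quotes, passing `quoting=csv.QUOTE_NONE` to `csv.DictReader` may be needed; that depends on how the file was written and is not stated in the description.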
published: 2025-01-26
 
Data and code supporting the paper titled "Leveraging electric vehicles as a resiliency solution for residential backup power during outages" by Shanshan Liu, Alex Vlachokostas, and Eleftheria Kontou. The data and the code enable spatiotemporal analytics and assessment of electric vehicle charging demand, remaining driving range, residential energy use, and vehicle-to-home (V2H) energy system resilience metrics.
keywords: Electric vehicles; Power outages; Vehicle-to-home energy system; Residential loads; Bidirectional energy exchange
published: 2025-01-27
 
The zip file contains the benchmark data used for the TIPP3 simulation study. See the README file for more information.
keywords: TIPP3; abundance profile; reference database; taxonomic identification; simulation
published: 2025-01-27
 
This is the core data for RELIX, a dataset of vascular plant species presence for 353 prairie remnants in the Midwestern United States and associated dataset of prairie remnant metadata. The primary data file contains a list of the vascular plant species observed in the prairie remnants, as well as a metadata table with more information about the prairie remnant in question and the species list itself. The data was compiled from a variety of written sources, private and published, chronicling observations made between the mid-twentieth century and 2021. It also contains a supplementary data table of vascular plant species observed in at least 8 of the prairie remnants in RELIX, as well as a list of acknowledgements for the associated manuscript.
keywords: prairie peninsula; prairie relict; prairie soil; species inventories; tallgrass prairie
published: 2024-04-10
 
This dataset provides estimates of total Irrigation Water Use (IWU) by crop, county, water source, and year for the Continental United States. Total irrigation from Surface Water Withdrawals (SWW), total Groundwater Withdrawals (GWW), and nonrenewable Groundwater Depletion (GWD) are provided for 20 crops and crop groups from 2008 to 2020 at the county spatial resolution. In total, there are nearly 2.5 million data points in this dataset (3,142 counties; 13 years; 3 water sources; and 20 crops). This dataset supports the paper by Ruess et al. (2024) "Total irrigation by crop in the Continental United States from 2008 to 2020", Scientific Data, doi: 10.1038/s41597-024-03244-w When using, please cite as: Ruess, P.J., Konar, M., Wanders, N., and Bierkens, M.F.P. (2024) Total irrigation by crop in the Continental United States from 2008 to 2020, Scientific Data, doi: 10.1038/s41597-024-03244-w
keywords: water use; irrigation; surface water; groundwater; groundwater depletion; counties; crops; time series
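The "nearly 2.5 million data points" figure in the irrigation dataset description follows directly from the four dimensions it lists. A short check, using only the numbers given in the description:

```python
counties = 3142
years = len(range(2008, 2021))  # 2008 through 2020 inclusive
water_sources = 3               # SWW, GWW, GWD
crops = 20

total = counties * years * water_sources * crops
print(f"{total:,}")  # 2,450,760
```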
published: 2024-10-31
 
School buses transport 20 million students annually and are currently undergoing electrification in the US. With Vehicle-to-Building (V2B) technology, electric school buses (ESBs) can supply energy to school buildings during power outages, ensuring continued operation and safety. This study proposes assessing the resilience of secondary schools during outages by leveraging ESB fleets as backup power across various US climate regions. The findings indicate that the current fleet of ESBs in representative cities across different climate regions in the US is insufficient to meet the power demands of an entire school or even its HVAC system. However, we estimated the number of ESBs required to support the school's power needs, and we showed that the use of V2B technology significantly reduces carbon emissions compared to backup diesel generators. While adjusting HVAC setpoints and installing solar panels have limited impacts on enhancing school resilience, gathering students in classrooms during outages significantly improved resilience in our case study in Houston, Texas. Given the ongoing electrification of school buses, it is essential for schools to complement ESBs with stationary batteries and other backup power sources, such as solar and/or diesel generators, to effectively address prolonged outages. Determining the deployment of direct current fast and Level 2 chargers can reduce infrastructure costs while maintaining the resilience benefits of ESBs. This dataset includes the simulation process and results of this study.
keywords: Electric school bus; Power outages; Vehicle-to-Building technology; Carbon emission reduction; Backup power source
published: 2025-01-23
 
These are the responses to an open, convenience-sample survey of Illinois residents about their interactions with wild deer. The survey was available on REDCap between December 19, 2022 and December 19, 2023, and was publicized through listservs, Facebook groups, and media reporting. The file "COVID Deer Survey _ REDCap.pdf" contains the codebook for the survey, including the questions; all factor variables have ".factor" added to their name in the dataset. The file "DeerSurveyData.csv" contains the dataset. The file "Score_calculation_for_sharing.R" is the code to create the cleaned dataset used for analysis from the raw survey responses. Throughout, NA is used to represent null/not available/not applicable; this most likely reflects either a failure to answer the question or, in some cases, a question that was not presented because it was not relevant given answers to previous questions.
keywords: deer; survey
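The NA convention and the ".factor" naming described for "DeerSurveyData.csv" can be handled at load time. This is a minimal sketch that assumes ".factor" is a suffix on the column name and that missing values appear as the literal text NA. The function name, the sample rows, and the column names "record_id" and "seen_deer" are hypothetical.

```python
import csv
import io


def load_survey(handle):
    """Return (rows, factor_columns), mapping the literal text "NA" to None."""
    reader = csv.DictReader(handle)
    rows = [
        {name: (None if value == "NA" else value) for name, value in row.items()}
        for row in reader
    ]
    # Assumption: factor variables carry ".factor" as a name suffix.
    factor_columns = [
        name for name in (reader.fieldnames or []) if name.endswith(".factor")
    ]
    return rows, factor_columns


# Hypothetical sample; the real column names are given in the codebook.
sample = io.StringIO(
    "record_id,seen_deer,seen_deer.factor\n"
    "1,1,Yes\n"
    "2,NA,NA\n"
)
rows, factor_columns = load_survey(sample)
```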
published: 2021-05-17
 
Please cite as: Wuebbles, D., J. Angel, K. Petersen, and A.M. Lemke, (Eds.), 2021: An Assessment of the Impacts of Climate Change in Illinois. The Nature Conservancy, Illinois, USA. https://doi.org/10.13012/B2IDB-1260194_V1 Climate change is a major environmental challenge that is likely to affect many aspects of life in Illinois, ranging from human and environmental health to the economy. Illinois is already experiencing impacts from the changing climate and, as climate change progresses and temperatures continue to rise, these impacts are expected to increase over time. This assessment takes an in-depth look at how the climate is changing now in Illinois, and how it is projected to change in the future, to provide greater clarity on how climate change could affect urban and rural communities in the state. Beyond providing an overview of anticipated climate changes, the report explores predicted effects on hydrology, agriculture, human health, and native ecosystems.
keywords: Climate change; Illinois; Public health; Agriculture; Environment; Water; Hydrology; Ecosystems