Illinois Data Bank Dataset Search Results

published: 2024-11-14
 
These data represent the raw data from the paper “The invasion of Japanese hop (Humulus japonicus) in a restored floodplain forest” published in Invasive Plant Science and Management by Annie H. Huang and Jeffrey W. Matthews.
keywords: invasive plants; restored wetlands
published: 2024-11-14
 
These data are social media posts on Facebook and Twitter, as identified by SCOPES and healthfeedback.org as misinformation. We independently pulled social media data using Brandwatch’s (previously Crimson Hexagon) historical Twitter database and CrowdTangle, a public insights tool owned and operated by Facebook. Each of these databases stores only publicly tagged posts, and both have been used as Twitter and Facebook data sources in previous academic research studies (see, for example, Yun, Pamuksuz, and Duff 2019; Jernigan and Rushman 2014). The period we searched was January 1, 2020, to March 31, 2021. The original misinformation links were screenshots of posts or memes, links to native Facebook, Twitter, or Reddit posts, and links to articles/websites containing misinformation. These links were passed through CrowdTangle to verify that they were not labeled. This process gave us a dataset of posts of unlabeled misinformation links. We found 12,184 instances of HF’s COVID-19 misinformation links being shared on Twitter versus 6,388 instances of the same links being shared on Facebook.
keywords: Covid-19; Facebook; Twitter; Social Media; Misinformation; Labelling
published: 2024-11-13
 
These datasets are for the four-dimensional scanning transmission electron microscopy (4D-STEM) and electron energy loss spectroscopy (EELS) experiments for cathode nanoparticles at different states. The raw 4D-STEM experiment datasets were collected by TEM image & analysis software (FEI) and were saved as SER files. The raw SER files can be opened and viewed in MATLAB using our analysis software package imToolBox, available at https://github.com/flysteven/imToolBox. The raw EELS datasets were collected by DigitalMicrograph software and were saved as DM4 files. They can be opened and viewed in DigitalMicrograph software or using our analysis codes available at https://github.com/chenlabUIUC/OrientedPhaseDomain. All the datasets are from the work "Nanoscale Stacking Fault Engineering and Mapping in Spinel Oxides for Reversible Multivalent Ion Insertion" (2024).
The 4D-STEM experiment data include four example datasets for cathode nanoparticles collected at pristine and discharged states. Each dataset contains a stack of diffraction patterns collected at different probe positions scanned across the cathode nanoparticle.
1. Pristine untreated nanoparticle: "Pristine U-NP.ser"
2. Pristine 200ºC heated nanoparticle: "Pristine H200-NP.ser"
3. Untreated nanoparticle after first discharge in Zn-ion batteries: "Discharged U-NP.ser"
4. 200ºC heated nanoparticle after first discharge in Zn-ion batteries: "Discharged H200-NP.ser"
The EELS experiment data include six example datasets for cathode nanoparticles collected at different states (in "EELS datasets.zip"), as described below. Each EELS dataset contains the zero-loss and core-loss EELS spectra collected at different probe positions scanned across the cathode nanoparticle.
1. Pristine untreated nanoparticle: "Pristine U-NP EELS.zip"
2. Pristine 200ºC heated nanoparticle: "Prisitne H200-NP EELS.zip"
3. Untreated nanoparticle after first discharge in Zn-ion batteries: "Discharged U-NP EELS.zip"
4. Untreated nanoparticle after first charge in Zn-ion batteries: "Charged U-NP EELS.zip"
5. 200ºC heated nanoparticle after first discharge in Zn-ion batteries: "Discharged H200-NP EELS.zip"
6. 200ºC heated nanoparticle after first charge in Zn-ion batteries: "Charged H200-NP EELS.zip"
The details of the software package and codes that can be used to analyze the 4D-STEM and EELS datasets are available at: https://github.com/chenlabUIUC/OrientedPhaseDomain. Once our paper is formally published, we will update the relationship of these datasets with our paper.
keywords: 4D-STEM; EELS; defects; strain; cathode; nanoparticle; energy storage
published: 2024-11-12
 
This is the data set for the article entitled "Pollinator seed mixes are phenologically dissimilar to prairie remnants," a manuscript pending publication in Restoration Ecology. This represents the core phenology data of prairie remnant and pollinator seed mixes that were used for the main analyses. Note that additional data associated with the manuscript are intended to be published as a supplement in the journal.
keywords: native plants; ecological restoration; tallgrass prairie; native plant materials
published: 2021-04-22
 
Author-ity 2018 dataset
Prepared by Vetle Torvik, Apr. 22, 2021
The dataset is based on a snapshot of PubMed taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018). It covers a total of 29.1 million Article records and 114.2 million author name instances. Each instance of an author name is uniquely represented by the PMID and the position on the paper (e.g., 10786286_3 is the third author name on PMID 10786286). Thus, each cluster is represented by a collection of author name instances. The instances were first grouped into "blocks" by last name and first name initial (including some close variants), and then each block was separately subjected to clustering. The resulting clusters are provided in two different formats, the first in a file with only IDs and PMIDs, and the second in a file with cluster summaries:
#################### File 1: au2id2018.tsv ####################
Each line corresponds to an author name instance (PMID and Author name position) with an Author ID. It has the following tab-delimited fields:
1. Author ID
2. PMID
3. Author name position
######################## File 2: authority2018.tsv #########################
Each line corresponds to a predicted author-individual represented by a cluster of author name instances and a summary of all the corresponding papers and author name variants. Each cluster has a unique Author ID (the PMID of the earliest paper in the cluster and the author name position). The summary has the following tab-delimited fields:
1. Author ID (or cluster ID), e.g., 3797874_1 represents a cluster where 3797874_1 is the earliest author name instance.
2. cluster size (number of author name instances on papers)
3. name variants separated by '|' with counts in parentheses. Each variant has the format lastname_firstname middleinitial, suffix
4. last name variants separated by '|'
5. first name variants separated by '|'
6. middle initial variants separated by '|' ('-' if none)
7. suffix variants separated by '|' ('-' if none)
8. email addresses separated by '|' ('-' if none)
9. ORCIDs separated by '|' ('-' if none). From 2019 ORCID Public Data File https://orcid.org/ and from PubMed XML
10. range of years (e.g., 1997-2009)
11. Top 20 most frequent affiliation words (after stoplisting and tokenizing; some phrases are also made) with counts in parentheses; separated by '|'; ('-' if none)
12. Top 20 most frequent MeSH (after stoplisting) with counts in parentheses; separated by '|'; ('-' if none)
13. Journal names with counts in parentheses (separated by '|')
14. Top 20 most frequent title words (after stoplisting and tokenizing) with counts in parentheses; separated by '|'; ('-' if none)
15. Co-author names (lowercased lastname and first/middle initials) with counts in parentheses; separated by '|'; ('-' if none)
16. Author name instances (PMID_auno separated by '|')
17. Grant IDs (after normalization; '-' if none given; separated by '|')
18. Total number of times cited. (Citations are based on references harvested from open sources such as PMC.)
19. h-index
20. Citation counts (e.g., for h-index): PMIDs by the author that have been cited (with total citation counts in parentheses); separated by '|'
keywords: author name disambiguation; PubMed
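The field conventions described for the Author-ity 2018 files can be illustrated with a small sketch. It assumes that counted fields look like `item(count)|item(count)` with `-` standing for an empty field, as the field list suggests; the function names `parse_instance_id`, `parse_counted_field`, and `h_index` are hypothetical helpers and are not part of the dataset or any published tool.

```python
import re
from typing import Dict, List, Tuple

# Matches "some text(123)" and captures the text and the trailing count.
_COUNTED = re.compile(r"^(.*)\((\d+)\)$")


def parse_instance_id(instance_id: str) -> Tuple[int, int]:
    """Split an author name instance such as '10786286_3' into (PMID, position)."""
    pmid, position = instance_id.split("_")
    return int(pmid), int(position)


def parse_counted_field(field: str) -> Dict[str, int]:
    """Parse 'item(count)|item(count)' into a dict; '-' is treated as empty.

    Assumption: every item ends with its count in parentheses.
    Items that do not match that pattern are skipped.
    """
    if field == "-":
        return {}
    result: Dict[str, int] = {}
    for part in field.split("|"):
        match = _COUNTED.match(part.strip())
        if match:
            result[match.group(1).strip()] = int(match.group(2))
    return result


def h_index(citation_counts: List[int]) -> int:
    """Largest h such that at least h papers have h or more citations each."""
    h = 0
    for rank, count in enumerate(sorted(citation_counts, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h
```

For example, the per-paper citation counts parsed from field 20 could be passed to `h_index` and compared against the value stored in field 19, which is one way to sanity-check a parsed row.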
published: 2024-06-04
 
This dataset contains files and relevant metadata for real-world and synthetic LFR networks used in the manuscript "Well-Connectedness and Community Detection" (2024) by Park et al., presently under review at PLOS Complex Systems. The manuscript is an extended version of Park, M. et al. (2024). Identifying Well-Connected Communities in Real-World and Synthetic Networks. In Complex Networks & Their Applications XII. COMPLEX NETWORKS 2023. Studies in Computational Intelligence, vol 1142. Springer, Cham. https://doi.org/10.1007/978-3-031-53499-7_1.
The Overview of Real-World Networks image provides high-level information about the seven real-world networks. TSVs of the seven real-world networks are provided as [network-name]_cleaned to indicate that duplicated edges and self-loops were removed, where column 1 is source and column 2 is target. LFR datasets are contained within the zipped file.
# LFR datasets for the Connectivity Modifier (CM) paper
### File organization
Each directory `[network-name]_[resolution-value]_lfr` includes the following files:
* `network.dat`: LFR network edge-list
* `community.dat`: LFR ground-truth communities
* `time_seed.dat`: time seed used in the LFR software
* `statistics.dat`: statistics generated by the LFR software
* `cmd.stat`: command used to run the LFR software as well as time and memory usage information
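The cleaning step mentioned for the real-world networks (removing duplicate edges and self-loops from a two-column source/target TSV) can be sketched as follows. This is an illustration only, not the authors' pipeline: it assumes the networks are undirected, so that `a -> b` and `b -> a` count as the same edge, and the helper names `read_edge_tsv` and `clean_edges` are hypothetical.

```python
import csv
import io
from typing import Iterable, List, Tuple


def read_edge_tsv(text: str) -> List[Tuple[str, str]]:
    """Read a tab-separated edge list: column 1 is source, column 2 is target."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def clean_edges(edges: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Drop self-loops and duplicate edges, keeping the first occurrence.

    Assumption: edges are undirected, so (a, b) and (b, a) are duplicates.
    """
    seen = set()
    cleaned = []
    for source, target in edges:
        if source == target:
            continue  # self-loop
        key = frozenset((source, target))
        if key in seen:
            continue  # duplicate edge
        seen.add(key)
        cleaned.append((source, target))
    return cleaned
```

If the networks were treated as directed instead, the `frozenset` key would be replaced by the ordered pair `(source, target)`.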
published: 2024-11-07
 
This dataset consists of the 286 publications retrieved from Web of Science and Scopus on July 6, 2023 as citations for Willoughby et al., 2014: Patrick H. Willoughby, Matthew J. Jansma, and Thomas R. Hoye (2014). A guide to small-molecule structure assignment through computation of (¹H and ¹³C) NMR chemical shifts. Nature Protocols, 9(3), Article 3. https://doi.org/10.1038/nprot.2014.042
We added the DOIs of the citing publications into a Zotero collection. Then we exported all 286 DOIs in two formats: a .csv file (data export) and an .rtf file (bibliography).
Willoughby2014_286citing_publications.csv is a Zotero data export of the citing publications.
Willoughby2014_286citing_publications.rtf is a bibliography of the citing publications, using a variation of the American Psychological Association style (7th edition) with full names instead of initials.
To create Willoughby2014_citation_contexts.csv, HZ manually extracted the paragraphs that contain a citation marker of Willoughby et al., 2014. We refer to these paragraphs as the citation contexts of Willoughby et al., 2014. Manual extraction started with 286 citing publications but excluded 2 publications that are not in English, those with DOIs 10.13220/j.cnki.jipr.2015.06.004 and 10.19540/j.cnki.cjcmm.20200604.201.
The silver standard aimed to triage the citing publications of Willoughby et al., 2014 that are at risk of propagating unreliability due to a code glitch in a computational chemistry protocol introduced in Willoughby et al., 2014. The silver standard was created stepwise: First, one chemistry expert (YF) manually annotated the corpus of 284 citing publications in English, using their full text and citation contexts. She manually categorized publications as either at risk of propagating unreliability or not at risk of propagating unreliability, with a rationale justifying each category. Then we selected a representative sample of citation contexts to be double annotated. To do this, MJS turned the full dataset of citation contexts (Willoughby2014_citation_contexts.csv) into word embeddings, clustered them by similarity using BERTopic's HDBSCAN, and selected representative citation contexts based on the centroids of the clusters. Next, the second chemistry expert (EV) annotated the 77 publications associated with the citation contexts, considering the full text as well as the citation contexts.
double_annotated_subset_77_before_reconciliation.csv provides EV and YF's annotation before reconciliation.
To create the silver standard, YF, EV, and JS discussed differences and reconciled most of them. YF and EV had principled reasons for disagreeing on 9 publications; to handle these, YF updated the annotations to create the silver standard we use for evaluation in the remainder of our JCDL 2024 paper (silver_standard.csv).
Inter_Annotator_Agreement.xlsx indicates publications where the two annotators made opposite decisions and calculates the inter-annotator agreement before and after reconciliation.
double_annotated_subset_77_before_reconciliation.csv provides EV and YF's annotation after reconciliation, including applying the reconciliation policy.
keywords: unreliable cited sources; knowledge maintenance; citations; scientific digital libraries; scholarly publications; reproducibility; unreliability propagation; citation contexts
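As an illustration of how agreement between two annotators on a binary "at risk" / "not at risk" decision might be quantified, the sketch below computes Cohen's kappa. The dataset description does not say which agreement statistic Inter_Annotator_Agreement.xlsx uses, so Cohen's kappa is an assumption here, and the function name `cohen_kappa` and the example labels are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

Running such a function once on the pre-reconciliation labels and once on the reconciled labels would give a before/after comparison of the kind the spreadsheet reports.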
published: 2023-02-23
 
Coups d'État are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup d'État Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e., realized or successful coups, unrealized coup attempts, or thwarted conspiracies), the type of actor(s) who initiated the coup (i.e., military, rebels, etc.), as well as the fate of the deposed leader.
This current version, Version 2.1.2, adds 6 additional coup events that occurred in 2022 and updates the coding of an attempted coup event in Kazakhstan in January 2022. Version 2.1.1 corrects a mistake in version 2.1.0, where the designation of “dissident coup” had been dropped in error for coup_id: 00201062021. Version 2.1.1 fixes this omission by marking the case as both a dissident coup and an auto-coup. Version 2.1.0 added 36 cases to the data set and removed two cases from the v2.0.0 data. This update also added actor coding for 46 coup events and added executive outcomes to 18 events from version 2.0.0. A few other changes were made to correct inconsistencies in the coup ID variable and the date of the event.
Changes from the previously released data (v2.0.0) also include:
1. Adding additional events and expanding the period covered to 1945-2022
2. Filling in missing actor information
3. Filling in missing information on the outcomes for the incumbent executive
4. Dropping events that were incorrectly coded as coup events
Items in this Dataset
1. Cline Center Coup d'État Codebook v.2.1.2 Codebook.pdf - This 16-page document provides a description of the Cline Center Coup d’État Project Dataset. The first section of this codebook provides a summary of the different versions of the data. The second section provides a succinct definition of a coup d’état used by the Coup d’État Project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The third section describes the methodology used to produce the data. Revised February 2023
2. Coup Data v2.1.2.csv - This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup d’État Project. It contains 29 variables and 981 observations. Revised February 2023
3. Source Document v2.1.2.pdf - This 315-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify that particular event. Revised February 2023
4. README.md - This file contains useful information for the user about the dataset. It is a text file written in markdown language. Revised February 2023
Citation Guidelines
1. To cite the codebook (or any other documentation associated with the Cline Center Coup d’État Project Dataset) please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Scott Althaus. 2023. “Cline Center Coup d’État Project Dataset Codebook”. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.2. February 23. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V6
2. To cite data from the Cline Center Coup d’État Project Dataset please use the following citation (filling in the correct date of access): Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Emilio Soto. 2023. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.2. February 23. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V6
published: 2024-11-07
 
This dataset is part of a genome announcement. The main folder PROKKA_results contains nine Prokka v.1.14.6 annotation files from nine Clostridium scindens genome sequences. Each file provides 12 output files, including predicted protein sequences (.faa), nucleotide sequences of the predicted coding regions (.ffn), nucleotide sequence of the genome (.fna and .fsa), annotated genome in GenBank format (.gbk), a record of the steps performed during the annotation process (.log), error messages or warnings (.err), annotations in Sequin format (.sqn), and summaries of the annotations in tabular (.tbl), tab-separated values (.tsv), and plain text (.txt) formats.
keywords: Clostridium scindens; genome annotation; PROKKA
published: 2021-05-17
 
Please cite as: Wuebbles, D., J. Angel, K. Petersen, and A.M. Lemke, (Eds.), 2021: An Assessment of the Impacts of Climate Change in Illinois. The Nature Conservancy, Illinois, USA. https://doi.org/10.13012/B2IDB-1260194_V1 Climate change is a major environmental challenge that is likely to affect many aspects of life in Illinois, ranging from human and environmental health to the economy. Illinois is already experiencing impacts from the changing climate and, as climate change progresses and temperatures continue to rise, these impacts are expected to increase over time. This assessment takes an in-depth look at how the climate is changing now in Illinois, and how it is projected to change in the future, to provide greater clarity on how climate change could affect urban and rural communities in the state. Beyond providing an overview of anticipated climate changes, the report explores predicted effects on hydrology, agriculture, human health, and native ecosystems.
keywords: Climate change; Illinois; Public health; Agriculture; Environment; Water; Hydrology; Ecosystems
published: 2024-10-28
 
This dataset contains MALDI imaging and fluorescence imaging data of 5xFAD mice and control animals.
1+2) Animal_1_5xFAD_s1 and s2 : A MATLAB file of 50 micron spatial resolution imaging of a whole brain slice from a 5xFAD animal.
3) Slide28_Animal1_stitch_channels__Thioflavin S : A PNG file of the corresponding Thioflavin S-stained fluorescence image obtained post-MSI from the same section.
4) Slide28_Animal1_stitch_merged : A PNG file of the corresponding merged image, including brightfield, Thioflavin S (GFP channel), and Hoechst staining (DAPI channel), used for image registration.
5) mz_bins_use_neg.mat : A MATLAB array of the m/z channels that all MSI images (whole brain slice, 50 micron spatial resolution) were binned to in order to enable comparison.
6) Animal3_S18_HR.mat : A MATLAB array of high-spatial-resolution (5 micron) imaging of a 5xFAD mouse hippocampus and cortex. Due to the large dataset, 22 m/z channels are included.
7) Animal5_S18_HR.mat : A MATLAB array of high-spatial-resolution (5 micron) imaging of a wildtype mouse hippocampus and cortex.
8) mz_features_22.mat : A MATLAB array of the 22 m/z channels included in the high spatial resolution imaging data.
keywords: amyloid beta; 5xfad; lipids; maldi
published: 2024-07-30
 
This file contains the white-tailed deer (Odocoileus virginianus) land cover utility score (deer LCU score) datasets for every TRS (township, range, and section), township, and county in Illinois, USA. The file is an Excel spreadsheet with a metadata sheet, separate sheets for the deer LCU scores for each spatial level, and a sheet with the data required to replicate how the deer LCU score approach was validated. The deer LCU score is a unitless value, with larger scores corresponding to a spatial unit with more and/or better deer habitat.
keywords: habitat; white-tailed deer; deer; Odocoileus virginianus; land cover; land classification; landscape; habitat suitability index; ecology; environment
published: 2017-10-11
 
The International Registry of Reproductive Pathology Database is part of pioneering work done by Dr. Kenneth McEntee to comprehensively document thousands of disease case studies. His large and comprehensive collection of case reports and physical samples was complemented by development of the International Registry of Reproductive Pathology Database in the 1980s. The original FoxPro Database files and a migrated Access version were completed by the College of Veterinary Medicine in 2016. Access CSV files were completed by the University of Illinois Library in 2017.
keywords: Animal Pathology; Databases; Veterinary Medicine
published: 2024-11-01
 
This dataset includes data on soil nitrous oxide fluxes, soil properties, and climate presented in the manuscript, "A conceptual model explaining spatial variation in soil nitrous oxide emissions in agricultural fields," published in Communications Earth & Environment. Please refer to that publication for details about methodologies used to generate these data and for the experimental design.
keywords: soil nitrous oxide emissions; gross nitrous oxide production; gross nitrous oxide consumption; N2O; denitrification; maize; cannon model
published: 2024-10-31
 
School buses transport 20 million students annually and are currently undergoing electrification in the US. With Vehicle-to-Building (V2B) technology, electric school buses (ESBs) can supply energy to school buildings during power outages, ensuring continued operation and safety. This study proposes assessing the resilience of secondary schools during outages by leveraging ESB fleets as backup power across various US climate regions. The findings indicate that the current fleet of ESBs in representative cities across different climate regions in the US is insufficient to meet the power demands of an entire school or even its HVAC system. However, we estimated the number of ESBs required to support the school's power needs, and we showed that the use of V2B technology significantly reduces carbon emissions compared to backup diesel generators. While adjusting HVAC setpoints and installing solar panels have limited impacts on enhancing school resilience, gathering students in classrooms during outages significantly improved resilience in our case study in Houston, Texas. Given the ongoing electrification of school buses, it is essential for schools to complement ESBs with stationary batteries and other backup power sources, such as solar and/or diesel generators, to effectively address prolonged outages. Determining the deployment of direct current fast and Level 2 chargers can reduce infrastructure costs while maintaining the resilience benefits of ESBs. This dataset includes the simulation process and results of this study.
keywords: Electric school bus; Power outages; Vehicle-to-Building technology; Carbon emission reduction; Backup power source
published: 2024-10-25
 
This is a reference package to be used with the TIPP3 software for abundance profiling of metagenomic reads sampled from a microbial community.
TIPP3 software: https://github.com/c5shen/TIPP3
Usage:
1. Unzip the file to a local directory (this creates a folder named "tipp3-refpkg").
2. Use with the TIPP3 software: `tipp3.py -r [path/to/tipp3-refpkg] [other parameters]`
keywords: TIPP3; abundance profile; reference database; taxonomic identification
published: 2021-08-28
 
Metabolite identifications and profiles of liver samples from 22-day-old male and female pigs from gilts that were exposed to porcine reproductive and respiratory syndrome virus (P) or not (C) and that were weaned at 21 days of age (W) or not (N). Profiles were obtained by the University of Illinois Carver Metabolomics Center. The spectrum for each sample was acquired using a gas chromatography mass spectrometry system consisting of an Agilent 7890 gas chromatograph, an Agilent 5975 MSD, and an HP 7683B auto sampler.
keywords: gas chromatography; mass spectrometry; maternal immune activation; weaning; liver
published: 2022-06-20
 
This is a sentence-level parallel corpus in support of research on OCR quality. The source data come from: (1) Project Gutenberg for human-proofread "clean" sentences; and (2) HathiTrust Digital Library for the paired sentences with OCR errors. In total, this corpus contains 167,079 sentence pairs from 189 sampled books in four domains (i.e., agriculture, fiction, social science, world war history) published from 1793 to 1984. There are 36,337 sentences that have two OCR views paired with each clean version. In addition to sentence texts, this corpus also provides the location (i.e., sentence and chapter index) of each sentence in the Gutenberg volume it belongs to.
keywords: sentence-level parallel corpus; optical character recognition; OCR errors; Project Gutenberg; HathiTrust Digital Library; digital libraries; digital humanities
published: 2021-05-07
 
Prepared by Vetle Torvik, 2021-05-07
The dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters).
• How was the dataset created? The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018). Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only. However, MapAffil 2018 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g., 5833220). Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and HTML), so they may differ (sometimes a lot; see PMID 27487542) from PubMed records. All affiliation strings were processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in: Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p
• Look for Fig. 4 in the following article for coverage statistics over time: Palmblad, M., Torvik, V.I. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Trop Med Health 45, 33 (2017). https://doi.org/10.1186/s41182-017-0073-6 Expect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.
• The code and back-end data are periodically updated and made available for query by PMID at http://abel.ischool.illinois.edu/cgi-bin/mapaffil/search.py
• What is the format of the dataset? The dataset contains 52,931,957 rows (plus a header row). Each row (line) in the file has a unique PMID and author order, and contains the following eighteen columns, tab-delimited. All columns are ASCII, except city, which contains Latin-1.
1. PMID: positive non-zero integer; int(10) unsigned
2. au_order: positive non-zero integer; smallint(4)
3. lastname: varchar(80)
4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed
5. initial_2: middle name initial
6. orcid: From 2019 ORCID Public Data File https://orcid.org/ and from PubMed XML
7. year: year of the publication
8. journal: name of the journal in which the publication appeared
9. affiliation: author's affiliation
10. disciplines: extracted from departments, divisions, schools, laboratories, centers, etc. that occur on at least 100 unique affiliations across the dataset, some with standardization (e.g., 1770799), English translations (e.g., 2314876), or spelling corrections (e.g., 1291843)
11. grid: inferred using a high-recall technique focused on educational institutions (but, for experimental purposes, includes a few select hospitals, national institutes/centers, international companies, governmental agencies, and 200+ other IDs [RINGGOLD, Wikidata, ISNI, VIAF, http] for institutions not in GRID). Based on 2019 GRID version https://www.grid.ac/
12. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK
13. city: varchar(200); typically 'city, state, country' but could include further subdivisions; unresolved ambiguities are concatenated by '|'
14. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)
15. country
16. lat: at most 3 decimals (only available when city is not a country or state)
17. lon: at most 3 decimals (only available when city is not a country or state)
18. fips: varchar(5); for USA only; retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find
keywords: PubMed; MEDLINE; Digital Libraries; Bibliographic Databases; Author Affiliations; Geographic Indexing; Place Name Ambiguity; Geoparsing; Geocoding; Toponym Extraction; Toponym Resolution; institution name disambiguation
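The MapAffil file layout described above (tab-delimited, Latin-1 encoded, with a header row and '|'-concatenated ambiguous cities) could be read along the following lines. This is a minimal sketch: it assumes the header row uses the column names listed in the description (for example `PMID`, `au_order`, `city`), which the source does not state explicitly, and `read_mapaffil` and `city_candidates` are hypothetical helper names.

```python
import csv
import io
from typing import Dict, Iterator, List


def read_mapaffil(raw: bytes) -> Iterator[Dict[str, str]]:
    """Yield one dict per row from tab-delimited, Latin-1 encoded bytes.

    Assumption: the first line is a header whose names match the
    documented column names.
    """
    text = raw.decode("latin-1")
    reader = csv.DictReader(
        io.StringIO(text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # affiliation strings may contain quote characters
    )
    for row in reader:
        yield row


def city_candidates(city_field: str) -> List[str]:
    """Split a city value into its candidates; ambiguities are joined by '|'."""
    return [part.strip() for part in city_field.split("|") if part.strip()]
```

For the full file of roughly 53 million rows, passing an open file handle (`open(path, encoding="latin-1", newline="")`) to `csv.DictReader` and iterating row by row would avoid loading everything into memory at once.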
published: 2024-08-02
 
The Morrow Plots at the University of Illinois at Urbana-Champaign are the longest-running continuous experimental plots in the Americas. In continuous operation since 1876, the plots were established to explore the impact of crop rotation and soil treatment on corn crop yields. In 2018, The Morrow Plots Data Curation Working Group began to identify, collect and curate the various data records created over the history of the experiment. The resulting data table published here includes planting, treatment and yield data for the Morrow Plots since 1888. Please see the included codebook for a detailed explanation of the data sources and their content. This dataset will be updated as new yield data becomes available. *NOTE: While digitized and accessed through IDEALS, the physical copy of the field notebook, Morrow Plots Notebook, 1876-1913, 1967 (https://archon.library.illinois.edu/archives/index.php?p=collections/controlcard&id=11846), is also held at the University of Illinois Archives.
keywords: Corn; Crop Science; Experimental Fields; Crop Yields; Agriculture; Illinois; Morrow Plots
published: 2024-10-18
 
Exhaustive species inventory of a suburban wetland complex in northeast Ohio (Cuyahoga County).
keywords: floristic survey; wetland complex; comprehensive species list
published: 2024-10-16
 
School testing data were provided by Shield Illinois (ShieldIL), which conducted weekly in-school testing on behalf of the Illinois Department of Public Health (IDPH) for all participating schools in the state excluding Chicago Public Schools. The populations and proportions of students and employees in the studied school districts are reported by the Elementary/Secondary Information System (ElSi) database.
keywords: COVID-19; school testing
published: 2024-10-12
 
Simulation data used to generate plots in the associated paper ("Strain rate controls alignment in growing bacterial monolayers").
published: 2024-10-11
 
This is the core data for "Influence of ecological characteristics and phylogeny on native plant species’ commercial availability," a manuscript pending publication in Ecological Applications. The data cover ecological characteristics, phenology, and phylogeny of plant species native to the Midwestern United States and how those factors relate to commercial availability.
keywords: biodiversity; native plant nursery; plant trade; plant vendors; restoration