published: 2021-07-10
This dataset contains the images of the B73xMS71 RIL population used in QTL linkage mapping for maize epidermal traits in 2016 and 2017. 2016RIL_all_mns.rar and 2017RIL_all_mns.rar: contain raw images produced by a Nanofocus μsurf Explorer Optical Topometer (Oberhausen, Germany) at 20X magnification with 0.6 numerical aperture. Files were processed in Nanofocus μsurf analysis extended software (Oberhausen, Germany). 2016RIL_all_TIF.rar and 2017RIL_all_TIF.rar: contain images processed from the Topology layer in each nms file to strengthen the edges of cell outlines, and used in downstream cell detection. 2016RIL_all_detection_result.rar and 2017RIL_all_detection_result.rar: contain images with epidermal cells predicted using the Mask R-CNN model. training data.rar: contains images used for Mask R-CNN model training and validation.
keywords: stomata; Mask R-CNN; cell segmentation; water use efficiency
published: 2021-06-24
This dataset contains EEG and temperature data acquired from inside the bore of an MRI scanner during scanning with two different types of fMRI sequences: single-band and multi-band. The EEG data were acquired from the heads of adult humans undergoing scanning, and can be used to assess differences in EEG data quality due to sequence type. The temperature data were acquired from a watermelon phantom and can be used to assess heating differences due to sequence type.
keywords: Simultaneous EEG-fMRI, Multi-band fMRI, Safety, Heating
published: 2021-06-24
This dataset consists of the secondary ion mass spectrometry (SIMS) depth profiling data that were collected with a Cameca NanoSIMS 50 instrument from a 10 micron by 10 micron region on a Madin-Darby canine kidney (MDCK) cell that had been metabolically labeled so that most of its sphingolipids and cholesterol contained the rare nitrogen-15 and oxygen-18 isotopes, respectively.
keywords: secondary ion mass spectrometry; NanoSIMS; depth profiling; MDCK cell; sphingolipids; cholesterol
published: 2021-06-16
Thank you for using these datasets. These RNAsim aligned fragmentary sequences were generated from the query sequences selected by Balaban et al. (2019) in their variable-size datasets (https://doi.org/10.5061/dryad.78nf7dq). They were created for use for phylogenetic placement with the multiple sequence alignments and backbone trees provided by Balaban et al. (2019). The file structures included here also correspond with the data Balaban et al. (2020) provided. This includes: Directories for five varying backbone tree sizes, shown as 5000, 10000, 50000, 100000, and 200000. These directory names are also used by Balaban et al. (2019), and indicate the size of the backbone tree included in their data. Subdirectories for each replicate from the backbone tree size labelled 0 through 4. For the smaller four backbone tree sizes there are five replicates, and for the largest there is one replicate. Each replicate contains 200 text files with one aligned query sequence fragment in fasta format.
keywords: Fragmentary Sequences; RNAsim
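The directory layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the root path is whatever directory the archive was unpacked into, and the single replicate of the largest backbone is assumed to be labelled 0 (the description does not say which label it carries).

```python
from pathlib import PurePosixPath

# Backbone tree sizes named in the dataset description.
BACKBONE_SIZES = [5000, 10000, 50000, 100000, 200000]


def replicate_dirs(root="."):
    """List the replicate directories implied by the description.

    Assumption: the one replicate of the largest backbone is labelled 0.
    """
    dirs = []
    for size in BACKBONE_SIZES:
        # Five replicates for the four smaller sizes, one for the largest.
        n_reps = 1 if size == BACKBONE_SIZES[-1] else 5
        for rep in range(n_reps):
            dirs.append(PurePosixPath(root) / str(size) / str(rep))
    return dirs


def parse_single_fasta(text):
    """Parse FASTA text holding exactly one record; return (name, sequence)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    if any(ln.startswith(">") for ln in lines[1:]):
        raise ValueError("expected exactly one sequence per file")
    # Join wrapped sequence lines; alignment gap characters are kept as-is.
    return lines[0][1:], "".join(lines[1:])
```

Each directory returned by `replicate_dirs` would then be expected to hold 200 text files, each of which can be passed through `parse_single_fasta` after reading.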
published: 2021-06-17
Model output dataset (6-hourly) from the Weather Research and Forecasting (WRF) model simulations over South America with the added capability of water vapor tracers to track the moisture that originates over the Amazon and the La Plata river basins. The simulations were performed for the period 2003-2013 at 20-km horizontal resolution fully coupled with the Noah-MP land surface model. Only a limited number of the original output variables, sufficient for reproducing the analyses in papers that cite this dataset, are included here. The attached wrfout_southamerica_readme.txt contains detailed information about the file format and variables. For the complete model dataset, contact francina@illinois.edu.
keywords: WRF; Amazon; La Plata; South America; Numerical tracers
published: 2021-06-14
This repository contains the weights for two StyleGAN2 networks trained on two composite T1 and T2 weighted open-source brain MR image datasets, and one StyleGAN2 network trained on the Flickr Face HQ image dataset. Example images sampled from the respective StyleGANs are also included. The datasets themselves are not included in this repository. The weights are stored as `.pkl` files. The code and instructions to load and use the weights can be found at https://github.com/comp-imaging-sci/pic-recon . Additional details and citations can be found in the file "README.md".
keywords: StyleGAN2; Generative adversarial network (GAN); MRI; Medical imaging
published: 2021-06-14
Chronic contact exposure to realistic soil concentrations (0, 7.5, 15, and 100 ppb) of the neonicotinoid pesticide imidacloprid had species- and sex-specific effects on adult bee movement characteristics, but not on adult female bee brain development. This dataset contains two data files. The first contains information about adult bee movement characteristics for female Osmia lignaria and female and male Megachile rotundata over a 10-minute trial (total distance traveled and average movement speed). The second contains information about female Osmia lignaria and Megachile rotundata adult brain morphology. Detected effects included: female Osmia lignaria adults moved faster as they aged in the 0 and 7.5 ppb, but not in the 15 or 100 ppb, groups; young male Megachile rotundata adults moved more quickly (7.5 and 100 ppb) and farther (100 ppb) when treated with imidacloprid compared to the control group (0 ppb); and, while there was no impact of imidacloprid on adult female neuropil:Kenyon cell volume (N:K), N:K decreased with Osmia lignaria adult age and increased with Megachile rotundata adult age.
keywords: neonicotinoid; imidacloprid; bee; movement
published: 2021-05-26
Steady-state and dynamic gas exchange data for maize (B73), sugarcane (CP88-1762) and sorghum (Tx430)
keywords: C4 plants; gas exchange
published: 2021-05-21
Data sets from "Inferring Species Trees from Gene-Family with Duplication and Loss using Multi-Copy Gene-Family Tree Decomposition." It contains trees and sequences simulated with gene duplication and loss under a variety of different conditions. Note:
- trees.tar.gz contains the simulated gene-family trees used in our experiments (both true trees from SimPhy as well as trees estimated from alignments).
- sequences.tar.gz contains simulated sequence data used for estimating the gene-family trees as well as the concatenation analysis.
- biological.tar.gz contains the gene trees used as inputs for the experiments we ran on empirical data sets as well as species trees outputted by the methods we tested on those data sets.
- stats.txt lists statistics (such as AD, MGTE, and average size) for our simulated model conditions.
keywords: gene duplication and loss; species-tree inference; simulated data;
published: 2021-05-14
Please cite as: Menglin Liu and Benjamin M. Gramig. "Survey of Cover Crop, Conservation Tillage and Nutrient Management Practice Usage in Illinois and 2020 Fall Covers for Spring Savings Crop Insurance Discount Program Participation." Report to the Illinois Department of Agriculture and Fall Covers for Spring Savings working group. Center for the Economics of Sustainability and Department of Agricultural and Consumer Economics, University of Illinois at Urbana-Champaign. 2021. https://doi.org/10.13012/B2IDB-5222984_V1
keywords: cover crops; Illinois; 2020; conservation tillage; nutrient management practices; farmer survey; NLRS
published: 2021-05-14
This is the complete dataset for the "Anomalous density fluctuations in a strange metal" Proceedings of the National Academy of Sciences publication (https://doi.org/10.1073/pnas.1721495115). This is an integration of the Zenodo dataset which includes raw M-EELS data.

METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data: Data were collected with an M-EELS instrument according to the data acquisition protocol described in the original PNAS publication and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026).
2. Methods for processing the data: Raw data were collected with a channeltron-based M-EELS apparatus described in the reference PNAS publication and analyzed according to the procedure outlined both in the PNAS paper and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026). The raw M-EELS spectra at each momentum have been subject to minor data processing involving: (a) averaging of different acquisitions at the same conditions, (b) energy binning, (c) division by an effective Coulomb matrix element (which yields a structure factor S(q,\omega)), (d) antisymmetrization (which yields the imaginary chi). All these procedures are described in the PNAS paper.
3. Instrument- or software-specific information needed to interpret the data: These data are simple .txt or .dat files which can be read with any standard data analysis software, notably Python notebooks, MatLab, Origin, IgorPro, and others. We do not include scripts in order to provide maximum flexibility.
4. Relationship between files, if important: Raw data, structure factors, and imaginary chi are divided into different folders.

DATA-SPECIFIC INFORMATION

There are 8 folders within Data_public_deposition_v1.zip. Each folder contains the data needed to create the corresponding figure in the publication.

1. Fig1: This folder contains 21 DAT files needed to plot the theory data in panels C and D, following this naming convention: [chiA]or[chiB]or[Pi]_q_number.dat
chiA is the imaginary RPA charge susceptibility with a Coulomb interaction of electronically weakly coupled layers
chiB is the imaginary RPA charge susceptibility with the usual 4\pi e^2/q^2 Coulomb interaction
Pi is the imaginary Lindhard polarizability
q is momentum in reciprocal lattice units
Number is the numerical momentum value in reciprocal lattice units

2. Fig2: Files needed to plot Fig. 2 of the PNAS paper. Contains 3 folders as listed below. The files in this folder are named following this convention: Bi2212_295K_(1,-1)_50eV_161107_q_number_2.16_avg.dat
295K is the sample temperature
(1,-1) is the momentum direction in reciprocal lattice units
50 eV is the incident e beam energy
161107 is the start date of the experiment in yymmdd format
Q is the momentum
Number is the momentum in reciprocal lattice units
2.16 is the energy range covered by the data in eV
Avg identifies averaged data
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: structure factors derived from the M-EELS spectra

3. Fig3: Files needed to plot Fig. 3 of the PNAS paper. The OP/OD prefix identifies optimally doped or overdoped sample data, respectively.
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: structure factors derived from the M-EELS spectra

4. Fig4: Files needed to plot Fig. 4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted according to the fit procedure described in the manuscript and at all momenta.
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: structure factors derived from the M-EELS spectra

5. FigS1: Files needed to plot Fig. S1 of the PNAS paper. There are 5 files in this folder. DAT files are M-EELS data following the prior naming convention, while the two .txt files are digitized data from N. Nücker, U. Eckern, J. Fink, and P. Müller, Long-Wavelength Collective Excitations of Charge Carriers in High-Tc Superconductors, Phys. Rev. B 44, 7155(R) (1991), and K. H. G. Schulte, The interplay of Spectroscopy and Correlated Materials, Ph.D. thesis, University of Groningen (2002).

6. FigS2: Files needed to plot Fig. S2 of the PNAS paper.
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: structure factors derived from the M-EELS spectra

7. FigS3: Files needed to plot Fig. S3 of the PNAS paper. There are 2 files in this folder:
20K_phi_0_q_0.dat: an M-EELS raw intensity at zero momentum transfer on Bi2212 at 20 K
295K_phi_0_q_0.dat: an M-EELS raw intensity at zero momentum transfer on Bi2212 at 295 K

8. FigS4: Files needed to plot Fig. S4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted according to the fit procedure described in the manuscript and at all momenta.
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor
Raw_avg_data: raw averaged M-EELS spectra
Sqw: structure factors derived from the M-EELS spectra
keywords: Momentum resolved electron energy loss spectroscopy (M-EELS); cuprates; plasmons; strange metal
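The Fig2 file-naming convention above lends itself to a small parser. The sketch below is not part of the deposit (which deliberately ships no scripts); the function name, the dictionary keys, and the example filename are hypothetical, and it assumes that the `number` placeholder is replaced by a plain decimal value in real filenames.

```python
import re
from datetime import datetime

# Pattern for names like Bi2212_295K_(1,-1)_50eV_161107_q_<number>_2.16_avg.dat
# Assumption: <number> and the energy range are plain decimals.
FILENAME_RE = re.compile(
    r"^(?P<sample>[^_]+)_(?P<temperature_K>\d+)K_"
    r"\((?P<h>-?\d+),(?P<k>-?\d+)\)_"
    r"(?P<beam_eV>\d+)eV_(?P<date>\d{6})_"
    r"q_(?P<q>-?\d+(?:\.\d+)?)_"
    r"(?P<range_eV>\d+(?:\.\d+)?)_avg\.dat$"
)


def parse_meels_filename(name):
    """Split an averaged M-EELS filename into its documented components."""
    m = FILENAME_RE.match(name)
    if m is None:
        raise ValueError(f"filename does not follow the convention: {name!r}")
    return {
        "sample": m["sample"],
        "temperature_K": int(m["temperature_K"]),
        # Momentum direction in reciprocal lattice units.
        "direction": (int(m["h"]), int(m["k"])),
        "beam_eV": float(m["beam_eV"]),
        # Start date of the experiment, stored as yymmdd in the name.
        "start_date": datetime.strptime(m["date"], "%y%m%d").date(),
        "q_rlu": float(m["q"]),
        "energy_range_eV": float(m["range_eV"]),
    }
```

For example, `parse_meels_filename("Bi2212_295K_(1,-1)_50eV_161107_q_0.05_2.16_avg.dat")` (a made-up name that follows the pattern) yields a 295 K sample temperature and a start date of 7 November 2016.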
published: 2021-05-17
Please cite as: Wuebbles, D., J. Angel, K. Petersen, and A.M. Lemke, (Eds.), 2021: An Assessment of the Impacts of Climate Change in Illinois. The Nature Conservancy, Illinois, USA. https://doi.org/10.13012/B2IDB-1260194_V1 Climate change is a major environmental challenge that is likely to affect many aspects of life in Illinois, ranging from human and environmental health to the economy. Illinois is already experiencing impacts from the changing climate and, as climate change progresses and temperatures continue to rise, these impacts are expected to increase over time. This assessment takes an in-depth look at how the climate is changing now in Illinois, and how it is projected to change in the future, to provide greater clarity on how climate change could affect urban and rural communities in the state. Beyond providing an overview of anticipated climate changes, the report explores predicted effects on hydrology, agriculture, human health, and native ecosystems.
keywords: Climate change; Illinois; Public health; Agriculture; Environment; Water; Hydrology; Ecosystems
published: 2021-05-12
These are the data sets associated with our publication "Field borders provide winter refuge for beneficial predators and parasitoids: a case study on organic farms." For this project, we compared the communities of overwintering arthropod natural enemies in organic cultivated fields and wildflower-strip field borders at five different sites in central Illinois. Abstract: Semi-natural field borders are frequently used in midwestern U.S. sustainable agriculture. These habitats are meant to help diversify otherwise monocultural landscapes and provision them with ecosystem services, including biological control. Predatory and parasitic arthropods (i.e., potential natural enemies) often flourish in these habitats and may move into crops to help control pests. However, detailed information on the capacity of semi-natural field borders for providing overwintering refuge for these arthropods is poorly understood. In this study, we used soil emergence tents to characterize potential natural enemy communities (i.e., predacious beetles, wasps, spiders, and other arthropods) overwintering in cultivated organic crop fields and adjacent field borders. We found a greater abundance, species richness, and unique community composition of predatory and parasitic arthropods in field borders compared to arable crop fields, which were generally poorly suited as overwintering habitat. Furthermore, potential natural enemies tended to be positively associated with forb cover and negatively associated with grass cover, suggesting that grassy field borders with less forb cover are less well-suited as winter refugia. These results demonstrate that semi-natural habitats like field borders may act as a source for many natural enemies on a year-to-year basis and are important for conserving arthropod diversity in agricultural landscapes.
keywords: Natural enemy; wildflower strips; conservation biological control; semi-natural habitat; field border; organic farming
published: 2021-05-07
Prepared by Vetle Torvik 2021-05-07

The dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters).

• How was the dataset created? The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018). Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only. However, MapAffil 2018 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220). Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records. All affiliation strings were processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in: Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p
• Look for Fig. 4 in the following article for coverage statistics over time: Palmblad, M., Torvik, V.I. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Trop Med Health 45, 33 (2017). https://doi.org/10.1186/s41182-017-0073-6 Expect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.
• The code and back-end data are periodically updated and made available for query by PMID at http://abel.ischool.illinois.edu/cgi-bin/mapaffil/search.py
• What is the format of the dataset? The dataset contains 52,931,957 rows (plus a header row). Each row (line) in the file has a unique PMID and author order, and contains the following eighteen columns, tab-delimited. All columns are ASCII, except city which contains Latin-1.

1. PMID: positive non-zero integer; int(10) unsigned
2. au_order: positive non-zero integer; smallint(4)
3. lastname: varchar(80)
4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed
5. initial_2: middle name initial
6. orcid: From 2019 ORCID Public Data File https://orcid.org/ and from PubMed XML
7. year: year of the publication
8. journal: name of the journal in which the publication appeared
9. affiliation: author's affiliation
10. disciplines: extracted from departments, divisions, schools, laboratories, centers, etc. that occur on at least 100 unique affiliations across the dataset, some with standardization (e.g., 1770799), English translations (e.g., 2314876), or spelling corrections (e.g., 1291843)
11. grid: inferred using a high-recall technique focused on educational institutions (but, for experimental purposes, includes a few select hospitals, national institutes/centers, international companies, governmental agencies, and 200+ other IDs [RINGGOLD, Wikidata, ISNI, VIAF, http] for institutions not in GRID). Based on 2019 GRID version https://www.grid.ac/
12. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK
13. city: varchar(200); typically 'city, state, country' but could include further subdivisions; unresolved ambiguities are concatenated by '|'
14. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)
15. country
16. lat: at most 3 decimals (only available when city is not a country or state)
17. lon: at most 3 decimals (only available when city is not a country or state)
18. fips: varchar(5); for USA only retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find
keywords: PubMed, MEDLINE, Digital Libraries, Bibliographic Databases; Author Affiliations; Geographic Indexing; Place Name Ambiguity; Geoparsing; Geocoding; Toponym Extraction; Toponym Resolution; institution name disambiguation
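A file with this layout can be streamed using only the Python standard library. The sketch below is an illustration, not the author's tooling: `read_mapaffil` and the `city_candidates` key are hypothetical names, and it assumes that the header row uses the column names listed above and that unavailable values appear as empty strings.

```python
import csv


def read_mapaffil(path):
    """Yield one dict per (PMID, au_order) row of the tab-delimited file.

    Assumptions: header names match the documented column names, and
    missing lat/lon/city values are empty strings.
    """
    # Latin-1 matters only for the city column; all other columns are ASCII.
    with open(path, encoding="latin-1", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            row["PMID"] = int(row["PMID"])
            row["au_order"] = int(row["au_order"])
            # Unresolved place-name ambiguities are concatenated by '|'.
            row["city_candidates"] = row["city"].split("|") if row["city"] else []
            # lat/lon are absent when the place is only a country or state.
            row["lat"] = float(row["lat"]) if row["lat"] else None
            row["lon"] = float(row["lon"]) if row["lon"] else None
            yield row
```

Because the function is a generator, the roughly 53 million rows are never held in memory at once; `QUOTE_NONE` keeps quotation marks inside affiliation strings from being interpreted as CSV quoting.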
published: 2021-05-10
UAV-based high-resolution multispectral time-series orthophotos utilized to understand the relation between growth dynamics, imagery temporal resolution, and end-of-season biomass productivity of biomass sorghum as a bioenergy crop. The sensor utilized is a RedEdge Micasense flown at 40 meters above ground level at the Energy Farm, UIUC, in 2019.
keywords: Unmanned aerial vehicles; High throughput phenotyping; Machine learning; Bioenergy crops
published: 2021-05-07
The dataset is based on a snapshot of PubMed taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018), and for ORCIDs, primarily, the 2019 ORCID Public Data File https://orcid.org/. Matching an ORCID to an individual author name on a PMID is a non-trivial process. Anyone can create an ORCID and claim to have contributed to any published work. Many records claim too many articles and most claim too few. Even though ORCID records are (most?) often populated by author name searches in popular bibliographic databases, there is no confirmation that the person's name is listed on the article.

This dataset is the product of mapping ORCIDs to individual author names on PMIDs, even when the ORCID name does not match any author name on the PMID, and when there are multiple (good) candidate author names. The algorithm avoids assigning the ORCID to an article when there are no good candidates and when there are multiple equally good matches. For some ORCIDs that clearly claim too much, it triggers a very strict matching procedure (for ORCIDs that claim too much but the majority appear correct, e.g., 0000-0002-2788-5457), and sometimes deletes ORCIDs altogether when all (or nearly all) of its claimed PMIDs appear incorrect. When an individual clearly has multiple ORCIDs it deletes the least complete of them (e.g., 0000-0002-1651-2428 vs 0000-0001-6258-4628).

It should be noted that the ORCIDs that claim too much are not necessarily due to nefarious or trolling intentions, even though a few appear so. Certainly many are due to laziness, such as claiming everything with a particular last name. Some cases appear to be due to test engineers (e.g., 0000-0001-7243-8157; 0000-0002-1595-6203), or librarians assisting faculty (e.g., ; 0000-0003-3289-5681), or group/laboratory IDs (0000-0003-4234-1746), or having contributed to an article in capacities other than authorship such as an Investigator, an Editor, or part of a Collective (e.g., 0000-0003-2125-4256 as part of the FlyBase Consortium on PMID 22127867), or as a "Reply To" in which case the identity of the article and authors might be conflated. The NLM has, in the past, limited the total number of authors indexed too. The dataset certainly has errors but I have taken great care to fix some glaring ones (individuals who claim too much), while still capturing authors who have published under multiple names and not explicitly listed them in their ORCID profile. The final dataset provides a "matchscore" that could be used for further clean-up.

Four files:

person.tsv: 7,194,692 rows, including header. 1. orcid 2. lastname 3. firstname 4. creditname 5. othernames 6. otherids 7. emails
employment.tsv: 2,884,981 rows, including header. 1. orcid 2. putcode 3. role 4. start-date 5. end-date 6. id 7. source 8. dept 9. name 10. city 11. region 12. country 13. affiliation
education.tsv: 3,202,253 rows, including header. 1. orcid 2. putcode 3. role 4. start-date 5. end-date 6. id 7. source 8. dept 9. name 10. city 11. region 12. country 13. affiliation
pubmed2orcid.tsv: 13,133,065 rows, including header. 1. PMID 2. au_order (author name position on the article) 3. orcid 4. matchscore (see below) 5. source: orcid (2019 ORCID Public Data File https://orcid.org/), pubmed (NLM's distributed XML files), or patci (an earlier version of ORCID with citations processed through the Patci tool)

12,037,375 from orcid; 1,065,892 from PubMed XML; 29,797 from Patci

matchscore:
000: lastname, firstname and middle init match (e.g., Eric T MacKenzie vs
00: lastname, firstname match (e.g., Keith Ward)
0: lastname, firstname reversed match (e.g., Conde Santiago vs Santiago Conde)
1: lastname, first and middle init match (e.g., L. F. Panchenko)
11: lastname and partial firstname match (e.g., Mike Boland vs Michael Boland or Mel Ziman vs Melanie Ziman)
12: lastname and first init match
15: 3 part lastname and firstname match (David Grahame Hardie vs D Grahame Hardie)
2: lastname match and multipart firstname initial match (Maria Dolores Suarez Ortega vs M. D. Suarez)
22: partial lastname match and firstname match (e.g., Erika Friedmann vs Erika Friedman)
23: e.g., Antonio Garcia Garcia vs A G Garcia
25: Allan Downie vs J A Downie
26: Oliver Racz vs Oliver Bacz
27: Rita Ostrovskaya vs R U Ostrovskaia
29: Andrew Staehelin vs L A Staehlin
3: M Tronko vs N D Tron'ko
4: Sharon Dent (Also known as Sharon Y.R. Dent; Sharon Y Roth; Sharon Yoder) vs Sharon Yoder
45: Okulov Aleksei vs A B Okulov
48: Maria Del Rosario Garcia De Vicuna Pinedo vs R Garcia-Vicuna
49: Anatoliy Ivashchenko vs A Ivashenko
5 = lastname match only (weak match but sometimes captures alternative first name for better subsequent matches); e.g., Bill Hieb vs W F Hieb
6 = first name match only (weak match but sometimes captures alternative first name for better subsequent matches); e.g., Maria Borawska vs Maria Koscielak
7 = last or first name match on "other names"; e.g., Hromokovska Tetiana (Also known as Gromokovskaia, T. S., Громоковська Тетяна) vs T Gromokovskaia
77: Siva Subramanian vs Kolinjavadi N. Sivasubramanian
88 = no name in orcid but match caught by uniqueness of name across paper (at least 90% and 2 more than next most common name)

prefix:
C = ambiguity reduced (possibly eliminated) using city match (e.g., H Yang on PMID 24972200)
I = ambiguity eliminated by excluding investigators (i.e., one author and one or more investigators with that name)
T = ambiguity eliminated using PubMed pos (T for tie-breaker)
W = ambiguity resolved by authority2018
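For the further clean-up that the matchscore is meant to support, a small helper can separate the prefix from the numeric code. This is a hedged sketch: the description does not state how prefix and code are joined in the file, so the format assumed here (zero or more of the letters C, I, T, W directly followed by digits, as in a hypothetical value "C12") should be checked against the data. All names below are illustrative.

```python
import re

# Assumption: prefix letters, if any, directly precede the numeric code.
MATCHSCORE_RE = re.compile(r"^(?P<prefix>[CITW]*)(?P<code>\d+)$")

PREFIX_MEANING = {
    "C": "ambiguity reduced using city match",
    "I": "ambiguity eliminated by excluding investigators",
    "T": "ambiguity eliminated using PubMed position (tie-breaker)",
    "W": "ambiguity resolved by authority2018",
}

# Codes 5 and 6 are described as weak matches.
WEAK_CODES = {"5", "6"}


def parse_matchscore(value):
    """Return (prefix, code); the code stays a string so 000, 00, 0 differ."""
    m = MATCHSCORE_RE.match(value.strip())
    if m is None:
        raise ValueError(f"unrecognized matchscore: {value!r}")
    return m["prefix"], m["code"]


def is_weak(value):
    """True when the numeric code is one of the documented weak matches."""
    return parse_matchscore(value)[1] in WEAK_CODES
```

Keeping the code as a string is the important design choice: converting to an integer would merge the distinct scores 000, 00, and 0.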
published: 2021-05-10
This dataset contains the emulated global multi-model urban daily temperature projections under RCP 8.5 scenario. The dataset is derived from the study "Large model structural uncertainty in global projections of urban heat waves" (XXXX). Details about this dataset and the local urban climate emulator are described in the article. This dataset documents the global urban daily temperatures of 17 CMIP5 Earth system models for 2006-2015 and 2061-2070. This dataset may be useful for multiple communities regarding urban climate change, heat waves, impacts, vulnerability, risks, and adaptation applications.
keywords: Urban heat waves; CMIP; urban warming; heat stress; urban climate change
published: 2021-04-30
This repository includes scripts and datasets for the paper, "Accurate Large-scale Phylogeny-Aware Alignment using BAli-Phy" submitted to Bioinformatics.
keywords: BAli-Phy;Bayesian co-estimation;multiple sequence alignment
published: 2021-04-29
Global assessments of climate extremes typically do not account for the unique characteristics of individual crops. A consistent definition of the exposure of specific crops to extreme weather would enable agriculturally-relevant hazard quantification. We introduce the Agriculturally-Relevant Exposure to Shocks (ARES) model, a novel database of both the temperature and moisture extremes facing individual crops by explicitly accounting for crop characteristics. Specifically, we estimate crop-specific temperature and moisture shocks during the growing season for a 0.25-degree spatial grid and daily time scale from 1961-2014 globally for 17 crops. The resulting database presented here provides annual crop- and event-specific exposure rates. Both gridded and country-level exposure rates are provided for each of the 17 crops. Our results provide new insights into the changes in the magnitude as well as spatial and temporal distribution of extreme events that impact crops over the past half-century. For additional information, please see the related paper by Jackson et al. (2021) in Environmental Research Letters.
keywords: Crop-specific; weather extremes; temperature; moisture; global; gridded; time series
published: 2021-04-28
An Atlas.ti dataset and accompanying documentation of a thematic analysis of problems and opportunities associated with retracted research and its continued citation.
keywords: Retraction; Citation; Problems and Opportunities
published: 2021-04-22
Author-ity 2018 dataset
Prepared by Vetle Torvik Apr. 22, 2021

The dataset is based on a snapshot of PubMed taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018). A total of 29.1 million Article records and 114.2 million author name instances. Each instance of an author name is uniquely represented by the PMID and the position on the paper (e.g., 10786286_3 is the third author name on PMID 10786286). Thus, each cluster is represented by a collection of author name instances. The instances were first grouped into "blocks" by last name and first name initial (including some close variants), and then each block was separately subjected to clustering. The resulting clusters are provided in two different formats, the first in a file with only IDs and PMIDs, and the second in a file with cluster summaries:

File 1: au2id2018.tsv
Each line corresponds to an author name instance (PMID and Author name position) with an Author ID. It has the following tab-delimited fields:
1. Author ID
2. PMID
3. Author name position

File 2: authority2018.tsv
Each line corresponds to a predicted author-individual represented by cluster of author name instances and a summary of all the corresponding papers and author name variants. Each cluster has a unique Author ID (the PMID of the earliest paper in the cluster and the author name position). The summary has the following tab-delimited fields:
1. Author ID (or cluster ID) e.g., 3797874_1 represents a cluster where 3797874_1 is the earliest author name instance.
2. cluster size (number of author name instances on papers)
3. name variants separated by '|' with counts in parenthesis. Each variant is of the format lastname_firstname middleinitial, suffix
4. last name variants separated by '|'
5. first name variants separated by '|'
6. middle initial variants separated by '|' ('-' if none)
7. suffix variants separated by '|' ('-' if none)
8. email addresses separated by '|' ('-' if none)
9. ORCIDs separated by '|' ('-' if none). From 2019 ORCID Public Data File https://orcid.org/ and from PubMed XML
10. range of years (e.g., 1997-2009)
11. Top 20 most frequent affiliation words (after stoplisting and tokenizing; some phrases are also made) with counts in parenthesis; separated by '|'; ('-' if none)
12. Top 20 most frequent MeSH (after stoplisting) with counts in parenthesis; separated by '|'; ('-' if none)
13. Journal names with counts in parenthesis (separated by '|')
14. Top 20 most frequent title words (after stoplisting and tokenizing) with counts in parenthesis; separated by '|'; ('-' if none)
15. Co-author names (lowercased lastname and first/middle initials) with counts in parenthesis; separated by '|'; ('-' if none)
16. Author name instances (PMID_auno separated by '|')
17. Grant IDs (after normalization; '-' if none given; separated by '|')
18. Total number of times cited. (Citations are based on references harvested from open sources such as PMC.)
19. h-index
20. Citation counts (e.g., for h-index): PMIDs by the author that have been cited (with total citation counts in parenthesis); separated by '|'
keywords: author name disambiguation; PubMed
published: 2021-04-22
All code is provided as Matlab .m scripts or functions (version R2019b). It accompanies the article “Temperate and chronic virus competition leads to low lysogen frequency”, published in the Journal of Theoretical Biology (2021). The code simulates and plots the solutions of an ordinary differential equation model and generates bifurcation diagrams.
published: 2021-04-19
Dataset compiled by Yushu Xia and Michelle Wander for the Soil Health Institute. Data were recovered from peer-reviewed literature reporting results for three soil quality indicators (SQIs) (β-glucosidase (BG), fluorescein diacetate (FDA) hydrolysis, and permanganate oxidizable carbon (POXC)) in terms of their relative response to management where soils under grassland cover, no-tillage, cover crops, residue return and organic amendments were compared to conventionally managed controls. Peer-reviewed articles published between January 1990 and May 2018 were searched using the Thomson Reuters Web of Science database (Thomson Reuters, Philadelphia, Pennsylvania) and Google Scholar to identify studies reporting results for: “β-glucosidase”, “permanganate oxidizable carbon”, “active carbon”, “readily oxidizable carbon”, and “fluorescein diacetate hydrolysis”, together with one or more of the following: “management practice”, “tillage”, “cover crop”, “residue”, “organic fertilizer”, or “manure”. Records were tabulated to compare SQI abundance in soil maintained under a control and a soil-aggrading practice, with the intent to contribute to SQI databases that will support development of interpretive frameworks and/or algorithms, including pedo-transfer functions relating indicator abundance to management practices and site-specific factors. Meta-data include the following key descriptor variables and covariates useful for development of scoring functions: 1) identifying factors for the study site (location, year of initiation of study and year in which data was reported), 2) soil textural class, pH, and SOC, 3) depth and timing of soil sampling, 4) analytical methods for SQI quantification, 5) units used in published works (i.e., equivalent mass, concentration), 6) SQI abundances, and 7) statistical significance of difference comparisons. *Note: Blank values in tables are considered unreported data.
keywords: Soil health promoting practices; Soil quality indicators; β-glucosidase; fluorescein diacetate hydrolysis; Permanganate oxidizable carbon; Greenhouse gas emissions; Scoring curves; Soil Management Assessment Framework
published: 2021-04-18
This dataset contains all the code, notebooks, and datasets used in the study conducted for the research publication titled "Multi-scale CyberGIS Analytics for Detecting Spatiotemporal Patterns of COVID-19 Data". Specifically, this package includes the artifacts used to conduct spatial-temporal analysis with space-time kernel density estimation (STKDE) using COVID-19 data, which should help readers reproduce some of the analysis and learn about the methods used in the associated book chapter.

## What’s inside
A quick explanation of the components of the zip file:
* Multi-scale CyberGIS Analytics for Detecting Spatiotemporal Patterns of COVID-19.ipynb is a Jupyter notebook for this project. It contains code for preprocessing, space-time kernel density estimation, postprocessing, and visualization.
* data is a folder containing all data needed for the notebook.
* data/county.txt: US county information and FIPS codes from the Natural Resources Conservation Service.
* data/us-counties.txt: County-level COVID-19 data collected from the New York Times COVID-19 GitHub repository on August 9th, 2020.
* data/covid_death.txt: COVID-19 death information derived from the preprocessing step, which prepares the input data for STKDE. Each record has the following format: (fips, spatial_x, spatial_y, date, number of deaths).
* data/stkdefinal.txt: result obtained by conducting STKDE.
* wolfram_mathmatica is a folder for 3D visualization code.
* wolfram_mathmatica/Visualization.nb: code for visualization of the STKDE result via Wolfram Mathematica.
* img is a folder for figures.
* img/above.png: 3D visualization result, view from above.
* img/side.png: 3D visualization result, side view.
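For readers unfamiliar with STKDE, the following is a minimal sketch of the general technique, not the notebook's actual implementation. It assumes Epanechnikov kernels in space and time, hypothetical bandwidths `hs` (spatial) and `ht` (temporal), dates already converted to numeric day indices, and each record weighted by its number of deaths; all of these choices are assumptions made for illustration, and the sample events are invented.

```python
import math


def epanechnikov_2d(u, v):
    """Bivariate Epanechnikov kernel on the unit disc (integrates to 1)."""
    r2 = u * u + v * v
    return (2.0 / math.pi) * (1.0 - r2) if r2 < 1.0 else 0.0


def epanechnikov_1d(w):
    """Univariate Epanechnikov kernel on [-1, 1] (integrates to 1)."""
    return 0.75 * (1.0 - w * w) if abs(w) < 1.0 else 0.0


def stkde(x, y, t, events, hs, ht):
    """Weighted space-time density at (x, y, t).

    events: iterable of (xi, yi, ti, weight), e.g. weight = number of deaths.
    hs, ht: spatial and temporal bandwidths (hypothetical parameters).
    """
    events = list(events)
    total_weight = sum(e[3] for e in events)
    if total_weight == 0:
        return 0.0
    acc = 0.0
    for xi, yi, ti, weight in events:
        ks = epanechnikov_2d((x - xi) / hs, (y - yi) / hs)
        kt = epanechnikov_1d((t - ti) / ht)
        acc += weight * ks * kt
    # Normalize so the density integrates to 1 over space and time.
    return acc / (total_weight * hs * hs * ht)


# Invented events: (spatial_x, spatial_y, day index, number of deaths).
events = [(0.0, 0.0, 0.0, 2), (10.0, 0.0, 0.0, 1)]
density_near = stkde(0.0, 0.0, 0.0, events, hs=1.0, ht=1.0)
density_far = stkde(100.0, 100.0, 0.0, events, hs=1.0, ht=1.0)
```

Evaluating `stkde` over a regular grid of (x, y, t) points would yield a space-time density volume of the kind that 3D visualization tools can render.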
keywords: CyberGIS; COVID-19; Space-time kernel density estimation; Spatiotemporal patterns
published: 2021-04-15
To generate the bibliographic and survey data to support a data reuse study conducted by several Library faculty and accepted for publication in the Journal of Academic Librarianship, the project team utilized a series of web-based scripts that employed several different endpoints of the Scopus API. The related dataset, "Data for: An Examination of Data Reuse Practices within Highly Cited Articles of Faculty at a Research University", contains the survey design and results.

1) getScopus_API_process_dmp_IDB.asp: used the Search API to query the Scopus database for papers by UIUC authors published in 2015, limited to one of 9 pre-defined Scopus subject areas, and retrieved metadata results sorted from highest to lowest by the number of times the retrieved articles were cited. The URL for the basic searches took the following form:
https://api.elsevier.com/content/search/scopus?query=(AFFIL%28(urbana%20OR%20champaign) AND univ*%29) OR (AF-ID(60000745) OR AF-ID(60005290))&apikey=xxxxxx&start=" & nstart & "&count=25&date=2015&view=COMPLETE&sort=citedby-count&subj=PHYS
Here, the variable nstart was incremented by 25 in each iteration, and 25 records were retrieved in each pass. The subject area was changed (e.g., from PHYS to COMP for computer science) in each of the 9 runs. This script does not use the Scopus API cursor but downloads 25 records at a time for up to 28 times, or 675 maximum bibliographic records. The project team felt that looking at the 675 most-cited articles from UIUC faculty in each of the 9 subject areas was sufficient to gather a robust, representative sample of articles from 2015. These downloaded records were stored in a temporary table that was renamed for each of the 9 subject areas.

2) get_citing_from_surveys_IDB.asp: takes a Scopus article ID (eid) from the 49 surveys returned by UIUC authors and retrieves short citing article references, 200 at a time, into a temporary composite table. These citing records contain only one author, no author affiliations, and no author email addresses. This script uses the Scopus API cursor=* feature and is able to download all the citing references of an article, 200 records at a time.

3) put_in_all_authors_affil_IDB.asp: adds important data to the short citing records. The script adds all co-authors and their affiliations, the corresponding author, and author email addresses.

4) process_for_final_IDB.asp: creates a relational database table with author, title, and source journal information for each of the citing articles, which can be copied as an Excel file for processing by the Qualtrics survey software. This was initially 4,626 citing articles over the 49 UIUC-authored articles, but was reduced to 2,041 entries after checking for available email addresses and eliminating duplicates.
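The start-offset paging described for getScopus_API_process_dmp_IDB.asp can be sketched in Python as follows. This is an illustration of the URL pattern only, not a port of the ASP scripts: the function name `build_page_urls`, its parameters, and the choice of 0 as the first offset are assumptions, and no request is sent. The query string and parameter names are taken from the URL shown above, with the API key left as the same placeholder.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "https://api.elsevier.com/content/search/scopus"

# Decoded form of the affiliation query from the dataset description.
QUERY = (
    "(AFFIL((urbana OR champaign) AND univ*)) "
    "OR (AF-ID(60000745) OR AF-ID(60005290))"
)


def build_page_urls(subject, pages, api_key="xxxxxx", page_size=25, year=2015):
    """Return one search URL per page, advancing `start` by `page_size`.

    `subject` is a Scopus subject-area code such as "PHYS" or "COMP".
    Starting at offset 0 is an assumption of this sketch.
    """
    urls = []
    for page in range(pages):
        params = {
            "query": QUERY,
            "apikey": api_key,  # placeholder, as in the description
            "start": page * page_size,
            "count": page_size,
            "date": year,
            "view": "COMPLETE",
            "sort": "citedby-count",
            "subj": subject,
        }
        urls.append(f"{BASE_URL}?{urlencode(params)}")
    return urls


# Three pages for the physics subject area, for illustration.
urls = build_page_urls("PHYS", pages=3)
```

Each of the 9 subject-area runs would call such a helper with a different `subject` code and the desired number of pages.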
keywords: Scopus API; Citing Records; Most Cited Articles