published: 2020-12-31
This dataset contains the amino acid and nucleotide alignments corresponding to the phylogenetic analyses of South et al. 2020 in Systematic Entomology. This dataset also includes the gene trees that were used as input for coalescent analysis in ASTRAL.
keywords: Plecoptera; stoneflies; phylogeny; insects
published: 2020-10-27
The data file contains detailed information of the Cochrane reviews that were used in a project associated with the manuscript (working title) "Evaluation of an automated probabilistic RCT Tagger applied to published Cochrane reviews".
keywords: Cochrane reviews; systematic reviews; randomized control trial; RCT; automation
published: 2020-10-28
We examined the role of stream flow on environmental DNA (eDNA) concentrations and detectability of an invasive clam (Corbicula fluminea), while also accounting for other abiotic and biotic variables. These data include the eDNA concentrations, quadrat estimates of clam density, and abiotic variables.
keywords: Corbicula; detection probability; eDNA; invasive species; lotic; occupancy modeling
published: 2020-10-27
The data file contains a list of included studies with their detailed metadata, taken from Cochrane reviews which were used in a project associated with the manuscript "Evaluation of an automated probabilistic RCT Tagger applied to published Cochrane reviews".
keywords: Cochrane reviews; automation; randomized controlled trial; RCT; systematic review
planned publication date: 2023-07-17
This repository contains the training dataset associated with the 2023 Grand Challenge on Deep Generative Modeling for Learning Medical Image Statistics (DGM-Image Challenge), hosted by the American Association of Physicists in Medicine. This dataset contains more than 100,000 8-bit images of size 512x512. These images emulate coronal slices from anthropomorphic breast phantoms adapted from the VICTRE toolchain [1], with assigned X-ray attenuation coefficients relevant for breast computed tomography. Please follow the instructions given on the following page in order to register for the challenge: <a href="https://www.aapm.org/GrandChallenge/DGM-Image/">https://www.aapm.org/GrandChallenge/DGM-Image/</a>. [1] Badano, Aldo, et al. <a href="https://doi.org/10.1001/jamanetworkopen.2018.5474">"Evaluation of digital breast tomosynthesis as replacement of full-field digital mammography using an in-silico imaging trial." </a>JAMA network open 1.7 (2018): e185474-e185474
keywords: Deep generative models; breast computed tomography
published: 2023-01-12
These processing and Pearson correlational scripts were developed to support the study that examined the correlational relationships between local journal authorship, local and external citation counts, full-text downloads, link-resolver clicks, and four global journal impact factor indices within an all-disciplines journal collection of 12,200 titles and six subject subsets at the University of Illinois at Urbana-Champaign (UIUC) Library. This study shows strong correlations in the all-disciplines set and most subject subsets. Special processing scripts and web site dashboards were created, including Pearson correlational analysis scripts for reading values from relational databases and displaying tabular results. The raw data used in this analysis, in the form of relational database tables with multiple columns, is available at <a href="https://doi.org/10.13012/B2IDB-6810203_V1">https://doi.org/10.13012/B2IDB-6810203_V1</a>.
keywords: Pearson Correlation Analysis Scripts; Journal Publication; Citation and Usage Data; University of Illinois at Urbana-Champaign Scholarly Communication
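For readers unfamiliar with the statistic named above, the following is a minimal sketch of a Pearson correlation computation. It is not the deposited scripts, which read from relational databases; the function name `pearson_r` and the sample values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-journal values (NOT taken from the dataset)
downloads = [120, 340, 560, 80, 910]
local_citations = [14, 30, 61, 9, 88]
r = pearson_r(downloads, local_citations)
print(f"r = {r:.3f}")
```

In practice the two sequences would be paired per-journal metrics, such as downloads and local citations, pulled from the database tables.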
published: 2023-01-12
This dataset was developed as part of a study that examined the correlational relationships between local journal authorship, local and external citation counts, full-text downloads, link-resolver clicks, and four global journal impact factor indices within an all-disciplines journal collection of 12,200 titles and six subject subsets at the University of Illinois at Urbana-Champaign (UIUC) Library. While earlier investigations of the relationships between usage (downloads) and citation metrics have been inconclusive, this study shows strong correlations in the all-disciplines set and most subject subsets. The normalized Eigenfactor was the only global impact factor index that correlated highly with local journal metrics. Some of the identified disciplinary variances among the six subject subsets may be explained by the journal publication aspirations of UIUC researchers. The correlations between authorship and local citations in the six specific subject subsets closely match national department or program rankings. All the raw data used in this analysis are provided as relational database tables with multiple columns, which can be opened using MS Access. Descriptions of the variables can be viewed through "Design View" (right-click the selected table and choose "Design View"). The 2 PDF files provide an overview of the tables included in each MDB file. In addition, the processing scripts and Pearson correlation code are available at <a href="https://doi.org/10.13012/B2IDB-0931140_V1">https://doi.org/10.13012/B2IDB-0931140_V1</a>.
keywords: Usage and local citation relationships; publication; citation and usage metrics; publication; citation and usage correlation analysis; Pearson correlation analysis
published: 2023-01-10
Agriculture is the largest user of water in the United States. Yet, we do not understand the spatially resolved sources of irrigation water use by crop. The goal of this study is to estimate crop-specific irrigation water use from surface water withdrawals, total groundwater withdrawals, and nonrenewable groundwater depletion for the Continental United States. Water use by source is provided for 20 crops and crop groups from 2008 to 2020 at the county spatial resolution. These results present the first national-scale assessment of irrigation by crop, county, water source, and year. In total, there are nearly 2.5 million data points in this dataset (3,142 counties; 13 years; 3 water sources; and 20 crops). This dataset supports the paper by Ruess et al. (2023) in Water Resources Research, https://doi.org/10.1029/2022WR032804. When using, please cite as: Ruess, P.J., Konar, M., Wanders, N., & Bierkens, M. (2023). Irrigation by crop in the Continental United States from 2008 to 2020, Water Resources Research, 59, e2022WR032804. https://doi.org/10.1029/2022WR032804
keywords: Water use; irrigation; surface water; groundwater; groundwater depletion; counties; crops; time series
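The "nearly 2.5 million data points" figure in the description above can be checked by multiplying the stated dimensions. This is a small arithmetic sketch; the dictionary keys are illustrative labels, not field names from the dataset.

```python
from math import prod

# Dimensions exactly as stated in the dataset description
dimensions = {"counties": 3142, "years": 13, "water_sources": 3, "crops": 20}

# 2008 through 2020 inclusive
years = list(range(2008, 2021))

total_points = prod(dimensions.values())
print(total_points)
```

The product assumes a complete grid, with one value for every county, year, water source, and crop combination.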
published: 2023-01-05
This is the data used in the paper "Forecasting West Nile Virus with Graph Neural Networks: Harnessing Spatial Dependence in Irregularly Sampled Geospatial Data". Code from the GitHub repository https://github.com/adtonks/mosquito_GNN can be used with the data here to reproduce the paper's results.
keywords: west nile virus; machine learning; gnn; mosquito; trap; graph neural network; illinois; geospatial
published: 2022-12-22
The relationship between physical activity and mental health, especially depression, is one of the most studied topics in the field of exercise science and kinesiology. Although there is strong consensus that regular physical activity improves mental health and reduces depressive symptoms, some debate the mechanisms involved in this relationship as well as the limitations and definitions used in such studies. Meta-analyses and systematic reviews continue to examine the strength of the association between physical activity and depressive symptoms for the purpose of improving exercise prescription as treatment or combined treatment for depression. This dataset covers 27 review articles (either systematic review, meta-analysis, or both) and 365 primary study articles addressing the relationship between physical activity and depressive symptoms. Primary study articles are manually extracted from the review articles. We used a custom-made workflow (Fu, Yuanxi. (2022). Scopus author info tool (1.0.1) [Python]. <a href="https://github.com/infoqualitylab/Scopus_author_info_collection">https://github.com/infoqualitylab/Scopus_author_info_collection</a>) that uses the Scopus API and manual work to extract and disambiguate authorship information for the 392 reports. The author information file (author_list.csv) is the product of this workflow and can be used to compute the co-author network of the 392 articles. This dataset can be used to construct the inclusion network and the co-author network of the 27 review articles and 365 primary study articles. A primary study article is "included" in a review article if it is considered in the review article's evidence synthesis. Each included primary study article is cited in the review article, but not all references cited in a review article are included in the evidence synthesis or primary study articles.
The inclusion network is a bipartite network with two types of nodes: one type represents review articles, and the other represents primary study articles. In an inclusion network, if a review article includes a primary study article, there is a directed edge from the review article node to the primary study article node. The attribute file (article_list.csv) includes attributes of the 392 articles, and the edge list file (inclusion_net_edges.csv) contains the edge list of the inclusion network. Collectively, this dataset reflects the evidence production and use patterns within the exercise science and kinesiology scientific community, investigating the relationship between physical activity and depressive symptoms.
FILE FORMATS
1. article_list.csv - Unicode CSV
2. author_list.csv - Unicode CSV
3. Chinese_author_name_reference.csv - Unicode CSV
4. inclusion_net_edges.csv - Unicode CSV
5. review_article_details.csv - Unicode CSV
6. supplementary_reference_list.pdf - PDF
7. README.txt - text file
UPDATES IN THIS VERSION COMPARED TO V1 (Clarke, Caitlin; Lischwe Mueller, Natalie; Joshi, Manasi Ballal; Fu, Yuanxi; Schneider, Jodi (2022): The Inclusion Network of 27 Review Articles Published between 2013-2018 Investigating the Relationship Between Physical Activity and Depressive Symptoms. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4614455_V1)
In V1, we did not upload the file "article_list.csv." We uploaded the missing file in this version, and everything else remains the same.
keywords: systematic reviews; meta-analyses; evidence synthesis; network visualization; tertiary studies; physical activity; depressive symptoms; exercise; review articles
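The directed, bipartite inclusion network described above could be loaded from an edge list along the following lines. This is a sketch only: the column names `review_id` and `primary_id` and the sample rows are hypothetical, and the actual header of inclusion_net_edges.csv should be checked in the README before use.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for inclusion_net_edges.csv; real column names may differ.
SAMPLE_EDGES = """review_id,primary_id
R1,P1
R1,P2
R2,P2
R2,P3
"""


def load_inclusion_network(handle, source_col="review_id", target_col="primary_id"):
    """Return {review: set of included primary studies} from a directed edge list."""
    includes = defaultdict(set)
    for row in csv.DictReader(handle):
        includes[row[source_col]].add(row[target_col])
    return dict(includes)


def included_by(includes):
    """Invert the network: {primary study: set of reviews that include it}."""
    inverse = defaultdict(set)
    for review, primaries in includes.items():
        for primary in primaries:
            inverse[primary].add(review)
    return dict(inverse)


network = load_inclusion_network(io.StringIO(SAMPLE_EDGES))
reverse = included_by(network)
print(network)
print(reverse)
```

With the real file, the `io.StringIO` handle would be replaced by `open("inclusion_net_edges.csv", newline="", encoding="utf-8")`. The inverted mapping shows which primary studies are shared across reviews.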
published: 2022-12-21
This dataset is associated with a larger manuscript published in 2022 in the Illinois Natural History Survey Bulletin that summarized the Fishes of Champaign County project from 2012-2015. With data spanning over 120 years, the Fishes of Champaign County is a comprehensive, long-term investigation into the changing fish communities of east-central Illinois. Surveys first occurred in Champaign County in the late 1880s (40 sites), with subsequent surveys in 1928–1929 (125 sites), 1959–1960 (143 sites), and 1987–1988 (141 sites). Between 2012 and 2015, we resampled 122 sites across Champaign County. The combined data from these five surveys have produced a unique perspective into not only the fish communities of the region, but also insight into in-stream habitat changes during the past 120 years. The dataset is in Microsoft Access format, with five data tables, one for each time period surveyed. Field names are self-explanatory, with some variation in data types collected during different surveys as follows: Forbes & Richardson (1880s) collected presence/absence only; Thompson & Hunt (1928-1929) collected abundance only; Larimore & Smith (1959-1960) collected length and weight for some samples, but only presence/absence at others. In some cases, fish of the same species were weighed in bulk, with the fields “LOW” and “HIGH” indicating the lower and upper limits of total length in the batch, and weight indicating the gross weight of all fish in the batch. Larimore and Bayley (1987-1988) collected length and weight for all surveys, and Sherwood and Stein (2012-2015) collected length and weight for all surveys except for cases where extremely abundant single species were subsampled. Lengths are reported in millimeters, and weight in grams. Two lookup tables provide information about species codes used in the data tables and sample site location and notes.
keywords: fishes of Champaign County; streams; anthropogenic disturbances; long-term dataset
published: 2022-12-28
The effect of pesticide contamination on arthropod biomass and diversity in simulated prairie restorations depended on arthropod feeding guild (e.g., predator, herbivore, or pollinator). The pesticides used in this study were the neonicotinoid insecticide clothianidin and the phthalimide fungicide captan. This dataset includes two data files. The first contains information about the study sites ("plots") and pesticide treatments. The second contains information about arthropod biomass and morphospecies richness separated by feeding guild for each month-plot combination. R code in an R Markdown file for the analysis and data presentation in the associated publication is also provided. Detected effects included: predator biomass was 66% lower in plots treated with clothianidin, and this effect persisted across the growing season; the impact on herbivore biomass appeared to be inconsistent, with biomass being 51% lower with clothianidin in June but no detected difference in July or August; herbivore morphospecies richness was 12% lower in plots treated with both clothianidin and captan; pollinators appeared to be unaffected by clothianidin; and pollinator biomass increased by 71% when captan was applied to a plot.
keywords: Arthropod decline; pesticide; clothianidin; captan; habitat restoration; trophic effects; insects
published: 2022-12-31
Trajectory data for Nature Nanotechnology manuscript "DNA double helix, a tiny electromotor" that demonstrates how an electric field applied along the helical axis of a DNA or RNA molecule will generate an electroosmotic flow that causes the duplex to spin about that axis, much like a turbine.
keywords: All-atom MD simulation; DNA; nanotechnology; motors and rotors
published: 2022-12-11
The data are original electron micrographs from the lab of the late Dr. Burt Endo of the USDA. These data were digitized from photographic prints and glass plate negatives at 600 DPI as 16-bit TIFF files. This fourth version added 6 new ZIP files from the Endo data collection. "Endo folder database.xlsx" is updated to reflect the addition. Information in "Readme_FileNameFormatting.docx" remains the same as in V3.
keywords: Heterodera glycines; Meloidogyne incognita; Burt Endo; nematode
published: 2022-12-07
The Morrow Plots at the University of Illinois at Urbana-Champaign are the longest-running continuous experimental plots in the Americas. In continuous operation since 1876, the plots were established to explore the impact of crop rotation and soil treatment on corn crop yields. In 2018, The Morrow Plots Data Curation Working Group began to identify, collect and curate the various data records created over the history of the experiment. The resulting data table published here includes planting, treatment and yield data for the Morrow Plots since 1888. Please see the included codebook for a detailed explanation of the data sources and their content. This dataset will be updated as new yield data becomes available. *NOTE: While digitized and accessed through IDEALS, the physical copy of the field notebook: <a href="https://archon.library.illinois.edu/archives/index.php?p=collections/controlcard&id=11846">Morrow Plots Notebook, 1876-1913, 1967</a> is also held at the University of Illinois Archives.
keywords: Corn; Crop Science; Experimental Fields; Crop Yields; Agriculture; Illinois; Morrow Plots
published: 2022-12-05
These are similarity matrices of countries based on different modalities of web use: Alexa website traffic, trending videos on YouTube, and Twitter trends. Each matrix aggregates one month of data.
keywords: Global Internet Use
planned publication date: 2023-06-01
Results of RT-LAMP reactions for influenza A virus diagnostic development.
keywords: swine influenza; LAMP; gBlock
published: 2022-11-28
The compiled datasets include county-level variables used for simulating miscanthus and switchgrass production in 2287 counties across the rainfed US including 5-year (2012-2016) averaged growing season degree days (GDD), 5-year (2012-2016) averaged growing season cumulative precipitation, National Commodity Crop Productivity Index (NCCPI) values, regional dummies (only for miscanthus), the regional-level random effect of the yield response function, N price, land cash rent, the first year fixed cost (only for switchgrass), and separate datasets for simulating an alternative model assuming a constant N rate. The GAMS codes are used to run the simulation to obtain the main results including the age-varying profit-maximizing N rate, biomass yields, and annual profits for miscanthus and switchgrass production across counties in the rainfed US. The STATA codes are used to merge and analyze simulation results and create summary statistics tables and key figures.
keywords: Age; Miscanthus; Net present value; Nitrogen; Optimal lifespan; Profit maximization; Switchgrass; Yield; Center for Advanced Bioenergy and Bioproducts Innovation
published: 2022-11-28
Detection data of carnivores and their prey species from camera traps in Fort Hood, Texas and Santa Cruz, California, USA. Non-carnivore and non-prey species (humans, domestic species, avian species, etc.) were excluded from this dataset. All detections of each species at a camera within 30 minutes have been combined into 1 detection (only the first detection within those 30 minutes was kept) to avoid pseudoreplication.
Variable Description:
Site = study area where data were collected
MonitoringPeriod = year in which data were collected (data were collected at each location over multiple monitoring periods)
CameraName = unique name for each camera location
Date = calendar date of detection
Time = time of detection (Fort Hood = Central Time USA; Santa Cruz = Pacific Time USA)
Species = common name of species detected
keywords: carnivore; community ecology; competition; interspecific interactions; keystone species; mesopredator; predation; trophic cascade
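The 30-minute thinning rule described above could be implemented along these lines. This is a hedged sketch, not the authors' procedure. It assumes that each window starts at the last kept detection for a given camera and species, and that a detection exactly 30 minutes later starts a new window. The function name `thin_detections` and the sample records are hypothetical.

```python
from datetime import datetime, timedelta


def thin_detections(records, window=timedelta(minutes=30)):
    """Keep only the first detection per (camera, species) within each window.

    Assumption: a window is anchored at the last *kept* detection, so a record
    is kept when it falls at least `window` after the previous kept record.
    """
    kept = []
    last_kept = {}
    # Process detections in chronological order
    for camera, species, when in sorted(records, key=lambda r: r[2]):
        key = (camera, species)
        previous = last_kept.get(key)
        if previous is None or when - previous >= window:
            kept.append((camera, species, when))
            last_kept[key] = when
    return kept


# Hypothetical detections (camera, species, timestamp); not from the dataset
records = [
    ("CamA", "coyote", datetime(2020, 1, 1, 6, 0)),
    ("CamA", "coyote", datetime(2020, 1, 1, 6, 10)),  # within 30 min: dropped
    ("CamA", "coyote", datetime(2020, 1, 1, 6, 45)),  # new window: kept
    ("CamA", "bobcat", datetime(2020, 1, 1, 6, 5)),   # different species: kept
    ("CamB", "coyote", datetime(2020, 1, 1, 6, 20)),  # different camera: kept
]
thinned = thin_detections(records)
for row in thinned:
    print(row)
```

A rolling alternative, where each new detection resets the 30-minute clock, would merge longer runs of activity. The description does not say which variant was used.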
published: 2022-11-11
This dataset is for characterizing chemical short-range-ordering in CrCoNi medium entropy alloys. It has three sub-folders: 1. code, 2. sample WQ, 3. sample HT. The software needed to run the files is Gatan Microscopy Suite® (GMS). Please follow the instructions on this page to install the DM3 GMS: <a href="https://www.gatan.com/installation-instructions#Step1">https://www.gatan.com/installation-instructions#Step1</a> 1. Code folder contains three DM scripts to be installed in Gatan DigitalMicrograph software to analyze scanning electron nanobeam diffraction (SEND) dataset: Cepstrum.s: needs [EF-SEND_sampleWQ_cropped_aligned.dm3] in Sample WQ and the average image from [EF-SEND_sampleWQ_cropped_aligned.dm3]. Same for Sample HT folder. log_BraggRemoval.s: same as above. Patterson.s: needs refined diffuse patterns in Sample HT folder. 2. Sample WQ and 3. Sample HT folders both contain the SEND data (.ser) and the binned SEND data (.dm3) as well as our calculated strain maps as the strain measurement reference. The Sample WQ folder additionally has atomic resolution STEM images; the Sample HT folder additionally has three refined diffuse patterns as references for diffraction data processing. * Only the .ser file is needed to perform the strain measurement using imToolBox as listed in the manuscript. The .emi file contains the metadata of the microscope, which can be opened together with the .ser file using FEI TIA software.
keywords: Medium entropy alloy; CrCoNi; chemical short-range-ordering; CSRO; TEM
published: 2022-11-09
This dataset includes the blue water intensity by sector (41 industries and service sectors) for provinces in China, economic and virtual water network flow for China in 2017, and the corresponding network properties for these two networks.
keywords: Economic network; Virtual water; Supply chains; Network analysis; Multilayer; MRIO
published: 2022-11-07
The dataset contains the data and code for Single-cell and Subcellular Analysis of freshly isolated cultured, uncultured P1 cells and uncultured Old cells. The .csv file named 'MagLab20220721' contains the sample and intensity information with the columns referring to the m/z values and the rows being the samples. The 'MagLabNameINdex.csv' file contains all the index information. The file named '20220721_MagLab.spydata' contains the data from both previous files as loaded in Spyder. The .mat file contains the aligned data for the three groups.
keywords: Single-cell; Subcellular; Mass Spectrometry; MALDI; Lipidomics; FTICR; 21 T
published: 2020-12-07
This page contains the data for the publication "Regulation of growth and cell fate during tissue regeneration by the two SWI/SNF chromatin-remodeling complexes of Drosophila" published in Genetics, 2020.