Illinois Data Bank Dataset Search Results

published: 2022-02-11
 
The Culex_Trivellone_etal.fas fasta file contains the original final sequence alignment used in the haplotype analyses of Trivellone et al. (Frontiers in Public Health, under review). The 492 sequences (from specimens of the Culex pipiens complex collected in different habitat types using BG-Sentinel traps) were aligned using PASTA v1.8.5 under default settings. The final dataset contains 686 positions of the cytochrome c oxidase subunit I (COI) mitochondrial gene. The data analyses are further described in the cited original paper.
keywords: Culex; Culicidae; COI; mosquito surveillance; species assemblages
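A quick way to sanity-check an alignment file like this one is to confirm the sequence count and the shared alignment length (492 and 686 according to the description above). The following is a minimal sketch using only the Python standard library; the function names and the toy two-sequence input are illustrative assumptions, not part of the dataset.

```python
from io import StringIO

def read_fasta(handle):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

def alignment_shape(records):
    """Return (number of sequences, alignment length); raise if lengths differ."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        raise ValueError(f"sequences are not aligned: lengths {sorted(lengths)}")
    return len(records), (lengths.pop() if lengths else 0)

# Toy alignment (made-up data, not taken from the dataset).
demo = StringIO(">seq1\nACGT-A\n>seq2\nAC-TTA\n")
print(alignment_shape(read_fasta(demo)))
```

For the real file, one would pass `open("Culex_Trivellone_etal.fas")` instead of the toy `StringIO` object and compare the result with `(492, 686)`.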
published: 2022-06-20
 
This is a sentence-level parallel corpus in support of research on OCR quality. The source data comes from: (1) Project Gutenberg for human-proofread "clean" sentences; and, (2) HathiTrust Digital Library for the paired sentences with OCR errors. In total, this corpus contains 167,079 sentence pairs from 189 sampled books in four domains (i.e., agriculture, fiction, social science, world war history) published from 1793 to 1984. There are 36,337 sentences that have two OCR views paired with each clean version. In addition to sentence texts, this corpus also provides the location (i.e., sentence and chapter index) of each sentence in its belonging Gutenberg volume.
keywords: sentence-level parallel corpus; optical character recognition; OCR errors; Project Gutenberg; HathiTrust Digital Library; digital libraries; digital humanities
published: 2021-06-24
 
This dataset contains EEG and temperature data acquired from inside the bore of an MRI scanner during scanning with two different types of fMRI sequences: single-band and multi-band. The EEG data were acquired from the heads of adult humans undergoing scanning, and can be used to assess differences in EEG data quality due to sequence type. The temperature data were acquired from a watermelon phantom and can be used to assess heating differences due to sequence type.
keywords: Simultaneous EEG-fMRI, Multi-band fMRI, Safety, Heating
published: 2021-05-10
 
UAV-based high-resolution multispectral time-series orthophotos used to understand the relationship between growth dynamics, imagery temporal resolution, and end-of-season biomass productivity of biomass sorghum as a bioenergy crop. The sensor was a MicaSense RedEdge flown at 40 meters above ground level at the Energy Farm, UIUC, in 2019.
keywords: Unmanned aerial vehicles; High throughput phenotyping; Machine learning; Bioenergy crops
published: 2021-04-06
 
These datasets contain modeling files and GIS data associated with a risk assessment study for the Cambrian-Ordovician sandstone aquifer system in Illinois from predevelopment (1863) to the year 2070. Modeling work was completed using the Illinois Groundwater Flow Model, a regional MODFLOW model developed for water supply planning in Illinois, as a base model. The model is run using the graphical user interface Groundwater Vistas 7.0. The development and technical details of the base Illinois Groundwater Flow Model, including hydraulic property zonation, boundary conditions, hydrostratigraphy, solver settings, and discretization, are described in Abrams et al. (2018). Modifications to this base model (the version presented here) are described in Mannix et al. (2018), Hadley et al. (2020) and Abrams and Cullen (2020). Modifications include removal of particular multi-aquifer wells to improve calibration, changing Sandwich Fault Zone properties to achieve calibration at production wells within and near the fault zone, and the incorporation of demand scenarios based on a participatory modeling project with the Southwest Water Planning Group. The zipped folder of model files contains MODFLOW input (package) files, Groundwater Vistas files, and a head file for the entire model run. The zipped folder of GIS data contains rasters of: simulated drawdown in the St. Peter sandstone from predevelopment to 2018, simulated drawdown in the Ironton-Galesville sandstone from predevelopment to 2018, simulated head difference between the St. Peter and Ironton-Galesville sandstone units in 2018, simulated head above the top of the St. Peter sandstone for the years 2029, 2050, and 2070, and simulated head above the top of the Ironton-Galesville sandstone for the years 2029, 2050, and 2070. Raster outputs were derived directly from the simulated heads in the Illinois Groundwater Flow Model. 
Rasters are clipped to the eight-county northeastern Illinois region (Cook, DuPage, Grundy, Kane, Kendall, Lake, McHenry, and Will counties). Well names, historic and current head targets, and spatial offsets for the Illinois Groundwater Flow Model are available upon request via a data license agreement. Please contact the authors to arrange this if needed.
keywords: groundwater; aquifer; sandstone aquifer; risk assessment; depletion; Illinois; MODFLOW; modeling
published: 2021-06-14
 
Chronic contact exposure to realistic soil concentrations (0, 7.5, 15, and 100 ppb) of the neonicotinoid pesticide imidacloprid had species- and sex-specific effects on adult bee movement characteristics, but not on adult female bee brain development. This dataset contains two data files. The first contains information about adult bee movement characteristics for female Osmia lignaria and female and male Megachile rotundata over a 10-minute trial (total distance traveled and average movement speed). The second contains information about female Osmia lignaria and Megachile rotundata adult brain morphology. Detected effects included: female Osmia lignaria adults moved faster as they aged in the 0 and 7.5 ppb groups, but not in the 15 or 100 ppb groups; young male Megachile rotundata adults moved more quickly (7.5 and 100 ppb) and farther (100 ppb) when treated with imidacloprid compared to the control group (0 ppb); and, while there was no impact of imidacloprid on adult female neuropil:Kenyon cell volume (N:K), N:K decreased with Osmia lignaria adult age and increased with Megachile rotundata adult age.
keywords: neonicotinoid; imidacloprid; bee; movement
published: 2021-06-14
 
This repository contains the weights for two StyleGAN2 networks trained on two composite T1 and T2 weighted open-source brain MR image datasets, and one StyleGAN2 network trained on the Flickr Face HQ image dataset. Example images sampled from the respective StyleGANs are also included. The datasets themselves are not included in this repository. The weights are stored as `.pkl` files. The code and instructions to load and use the weights can be found at https://github.com/comp-imaging-sci/pic-recon . Additional details and citations can be found in the file "README.md".
keywords: StyleGAN2; Generative adversarial network (GAN); MRI; Medical imaging
published: 2021-06-16
 
Thank you for using these datasets. These RNAsim aligned fragmentary sequences were generated from the query sequences selected by Balaban et al. (2019) in their variable-size datasets (https://doi.org/10.5061/dryad.78nf7dq). They were created for use for phylogenetic placement with the multiple sequence alignments and backbone trees provided by Balaban et al. (2019). The file structures included here also correspond with the data Balaban et al. (2020) provided. This includes: Directories for five varying backbone tree sizes, shown as 5000, 10000, 50000, 100000, and 200000. These directory names are also used by Balaban et al. (2019), and indicate the size of the backbone tree included in their data. Subdirectories for each replicate from the backbone tree size labelled 0 through 4. For the smaller four backbone tree sizes there are five replicates, and for the largest there is one replicate. Each replicate contains 200 text files with one aligned query sequence fragment in fasta format.
keywords: Fragmentary Sequences; RNAsim
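The directory layout described above can be enumerated programmatically, which is handy for checking that a download is complete. This is a minimal sketch based only on the description; it assumes that the single replicate for the largest backbone size is labelled 0, which the description does not state explicitly.

```python
from pathlib import PurePosixPath

# Backbone tree sizes, used as top-level directory names in the description.
BACKBONE_SIZES = [5000, 10000, 50000, 100000, 200000]
FILES_PER_REPLICATE = 200

def replicate_dirs():
    """List the expected replicate directories as relative paths."""
    dirs = []
    for size in BACKBONE_SIZES:
        # Five replicates (0-4) for the four smaller sizes, one for the largest.
        # Assumption: the single replicate of the largest size is labelled 0.
        n_reps = 1 if size == BACKBONE_SIZES[-1] else 5
        for rep in range(n_reps):
            dirs.append(PurePosixPath(str(size)) / str(rep))
    return dirs

dirs = replicate_dirs()
print(len(dirs), "replicate directories")
print(len(dirs) * FILES_PER_REPLICATE, "query fragment files expected")
```

Comparing this list against the extracted archive (for example with `pathlib.Path.iterdir`) would flag any missing replicate directory.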
published: 2021-07-20
 
This dataset contains data from the extreme-disagreement analysis described in the paper “Aaron M. Cohen, Jodi Schneider, Yuanxi Fu, Marian S. McDonagh, Prerna Das, Arthur W. Holt, Neil R. Smalheiser, 2021, Fifty Ways to Tag your Pubtypes: Multi-Tagger, a Set of Probabilistic Publication Type and Study Design Taggers to Support Biomedical Indexing and Evidence-Based Medicine.” In this analysis, our team experts carried out an independent formal review and consensus process for extreme disagreements between MEDLINE indexing and model predictive scores. “Extreme disagreements” included two situations: (1) an abstract was MEDLINE indexed as a publication type but received low scores for this publication type, and (2) an abstract received high scores for a publication type but lacked the corresponding MEDLINE index term. “High predictive score” is defined as the top 100 high-scoring, and “low predictive score” is defined as the bottom 100 low-scoring. Three publication types were analyzed: CASE_CONTROL_STUDY, COHORT_STUDY, and CROSS_SECTIONAL_STUDY. Results were recorded in three Excel workbooks, named after the publication types: case_control_study.xlsx, cohort_study.xlsx, and cross_sectional_study.xlsx. The analysis shows that, when the tagger gave a high predictive score (>0.9) on articles that lacked a corresponding MEDLINE indexing term, independent review suggested that the model assignment was correct in almost all cases (CROSS_SECTIONAL_STUDY (99%), CASE_CONTROL_STUDY (94.9%), and COHORT_STUDY (92.2%)). Conversely, when articles received MEDLINE indexing but model predictive scores were very low (<0.1), independent review suggested that the model assignment was correct in the majority of cases: CASE_CONTROL_STUDY (85.4%), COHORT_STUDY (76.3%), and CROSS_SECTIONAL_STUDY (53.6%). Based on the extreme disagreement analysis, we identified a number of false-positives (FPs) and false-negatives (FNs). For case control study, there were 5 FPs and 14 FNs. 
For cohort study, there were 7 FPs and 22 FNs. For cross-sectional study, there were 1 FP and 45 FNs. We reviewed and grouped them based on patterns noticed, providing clues for further improving the models. This dataset reports the instances of FPs and FNs along with their categorizations.
keywords: biomedical informatics; machine learning; evidence based medicine; text mining
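The two "extreme disagreement" situations defined above can be illustrated with a small selection routine. This is only a sketch under stated assumptions: the record layout `(article_id, is_indexed, score)` is hypothetical, and the ranking is assumed to be applied separately within the indexed and the non-indexed articles, which the description does not spell out.

```python
def extreme_disagreements(records, k=100):
    """Select extreme disagreements for one publication type.

    records: iterable of (article_id, is_indexed, score) tuples, where
    is_indexed says whether MEDLINE assigned the publication type and
    score is the model's predictive score (hypothetical layout).
    Returns (indexed_but_low, unindexed_but_high), each at most k long.
    """
    records = list(records)
    indexed = [r for r in records if r[1]]
    unindexed = [r for r in records if not r[1]]
    # Situation 1: indexed by MEDLINE, yet among the k lowest scores.
    indexed_but_low = sorted(indexed, key=lambda r: r[2])[:k]
    # Situation 2: not indexed by MEDLINE, yet among the k highest scores.
    unindexed_but_high = sorted(unindexed, key=lambda r: r[2], reverse=True)[:k]
    return indexed_but_low, unindexed_but_high

# Toy records (made-up IDs and scores, not from the dataset).
toy = [
    ("a1", True, 0.05), ("a2", True, 0.97), ("a3", True, 0.40),
    ("b1", False, 0.99), ("b2", False, 0.02), ("b3", False, 0.91),
]
low, high = extreme_disagreements(toy, k=2)
print([r[0] for r in low])
print([r[0] for r in high])
```

With the toy records and `k=2`, the first list holds the two lowest-scoring indexed articles and the second the two highest-scoring non-indexed ones.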
published: 2021-08-20
 
In 2020, early-season extreme precipitation events that caused ponding occurred in central Illinois following the planting of Sorghum bicolor (L.) Moench and Zea mays L. Following the first rainfall event, 50 m transects were established to assess the waterlogging effects on seedling emergence and crop yields. Soil moisture, emergence, stem and tiller count, LAI, and yield were measured at various points in the season along these transects.
keywords: Sorghum; Maize; Emergence; Yield; LAI
published: 2021-02-24
 
This dataset contains model output from the Community Earth System Model, Version 2 (CESM2; Danabasoglu et al. 2020). These data were used for analysis in "Impacts of Large-Scale Soil Moisture Anomalies in Southeastern South America," published in the Journal of Hydrometeorology (DOI: 10.1175/JHM-D-20-0116.1). See this publication for details of the model simulations that created these data. Four NetCDF (.nc) files are included in this dataset. Two files correspond to the control simulation (FHIST_SP_control) and two files correspond to a simulation with a dry soil moisture anomaly imposed in southeastern South America (FHIST_SP_dry; see the publication mentioned above for details on the spatial extent of the imposed anomaly). For each simulation, one file corresponds to output from the atmospheric model (file names with "cam") of CESM2 and the other to the land model (file names with "clm2"). These files are raw CESM output concatenated into a single file for each simulation. All files include data from 1979-01-02 to 2003-12-31 at a daily resolution. The spatial resolution of all files is about 1 degree longitude x 1 degree latitude. Variables included in these files are listed or linked below.
Variables in atmosphere model output: vertical velocity (omega); convective precipitation; large-scale precipitation; surface pressure; specific humidity; temperature (atmospheric profile); reference temperature (temperature at reference height, 2 meters in this case); zonal wind; meridional wind; geopotential height.
Variables in land model output: see https://www.cesm.ucar.edu/models/cesm1.2/clm/models/lnd/clm/doc/UsersGuide/history_fields_table_40.xhtml . Note that not all of the variables listed at the above link are included in the land model output files in this dataset.
This material is based upon work supported by the National Science Foundation under Grant No. 1454089. We acknowledge high-performance computing support from Cheyenne (doi:10.5065/D6RX99HX) provided by NCAR's Computational and Information Systems Laboratory, sponsored by the National Science Foundation. The CESM project is supported primarily by the National Science Foundation. We thank all the scientists, software engineers, and administrators who contributed to the development of CESM2.
References: Danabasoglu, G., and Coauthors, 2020: The Community Earth System Model Version 2 (CESM2). Journal of Advances in Modeling Earth Systems, 12, e2019MS001916, https://doi.org/10.1029/2019MS001916.
keywords: Climate modeling; atmospheric science; hydrometeorology; hydroclimatology; soil moisture; land-atmosphere interactions
published: 2021-02-25
 
Total nitrogen leaching rates were calculated over the Mississippi Atchafalaya River Basin (MARB) using an integrated economic-biophysical modeling approach. Land allocation for corn production and total nitrogen application rates were calculated for crop reporting districts using the Biofuel and Environmental Policy Analysis Model (BEPAM) for 5 RFS2 policy scenarios. These were used as input in the Integrated BIosphere Simulator-Agricultural Version (Agro-IBIS) and the Terrestrial Hydrologic Model with Biogeochemistry (THMB) to calculate the nitrogen loss. Land allocation and total nitrogen application were simulated for the period 2016-2030 for 303 crop reporting districts (https://www.nass.usda.gov/Data_and_Statistics/County_Data_Files/Frequently_Asked_Questions/county_list.txt). The final 2030 values are reported here. Both are stored in csv files. Units are million ha for land allocation and million kg for nitrogen application. The nitrogen leaching rates were modeled with a spatial resolution of 5' x 5' using the North American Datum of 1983 projection and stored in NetCDF files. The 30-year average is calculated over the last 30 years of the 45 years being simulated. Leaching rates are reported in kg-N/ha.
keywords: nitrogen leaching, bioethanol, bioenergy crops
published: 2021-03-14
 
This dataset contains all the code, notebooks, and datasets used in the study conducted to measure the spatial accessibility of COVID-19 healthcare resources with a particular focus on Illinois, USA. Specifically, the dataset measures spatial access for people to hospitals and ICU beds in Illinois. Spatial accessibility is measured with the enhanced two-step floating catchment area (E2FCA) method (Luo & Qi, 2009) and reflects the interaction between demand (i.e., the number of potential patients) and supply (i.e., the number of beds or physicians). The result is a map of spatial accessibility to hospital beds. It identifies which regions need more healthcare resources, such as ICU beds and ventilators. This notebook serves as a guideline of which areas need more beds in the fight against COVID-19.

## What's Inside
A quick explanation of the components of the zip file:
* `COVID-19Acc.ipynb` is a notebook for calculating spatial accessibility and `COVID-19Acc.html` is an export of the notebook as HTML.
* `Data` contains all of the data necessary for calculations:
    * `Chicago_Network.graphml`/`Illinois_Network.graphml` are GraphML files of the OSMNX street networks for Chicago and Illinois respectively.
    * `GridFile/` has hexagonal gridfiles for Chicago and Illinois
    * `HospitalData/` has shapefiles for the hospitals in Chicago and Illinois
    * `IL_zip_covid19/COVIDZip.json` is a JSON file which contains COVID cases by zip code from IDPH
    * `PopData/` contains population data for Chicago and Illinois by census tract and zip code.
    * `Result/` is where we write out the results of the spatial accessibility measures
    * `SVI/` contains data about the Social Vulnerability Index (SVI)
* `img/` contains some images and HTML maps of the hospitals (the notebook generates the maps)
* `README.md` is the document you're currently reading!
* `requirements.txt` is a list of Python packages necessary to use the notebook (besides Jupyter/IPython). You can install the packages with `python3 -m pip install -r requirements.txt`
keywords: COVID-19; spatial accessibility; CyberGISX
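The two steps of the E2FCA method mentioned above can be sketched in a few lines of plain Python. This is an illustrative toy, not the notebook's implementation: the travel-time zones, their weights, and the input dictionaries are all hypothetical, and the real analysis derives travel times from the street networks in `Data`.

```python
def zone_weight(minutes, zones):
    """Return the weight of the first travel-time zone that contains `minutes`."""
    for limit, weight in zones:
        if minutes <= limit:
            return weight
    return 0.0  # outside the catchment

def e2fca(supply, demand, travel_time, zones):
    """Enhanced two-step floating catchment area accessibility scores.

    supply: {site: capacity}, e.g. beds per hospital
    demand: {location: population}
    travel_time: {location: {site: minutes}}
    zones: list of (upper_limit_minutes, weight), sorted by limit
    """
    # Step 1: weighted supply-to-demand ratio for each supply site.
    ratios = {}
    for site, capacity in supply.items():
        weighted_pop = sum(
            pop * zone_weight(travel_time[loc][site], zones)
            for loc, pop in demand.items()
        )
        ratios[site] = capacity / weighted_pop if weighted_pop > 0 else 0.0
    # Step 2: sum the weighted ratios reachable from each demand location.
    return {
        loc: sum(ratios[s] * zone_weight(travel_time[loc][s], zones) for s in supply)
        for loc in demand
    }

# Hypothetical zones: full weight within 10 min, decaying out to 30 min.
zones = [(10, 1.0), (20, 0.5), (30, 0.25)]
supply = {"H": 10}
demand = {"a": 100, "b": 100}
travel_time = {"a": {"H": 5}, "b": {"H": 15}}
print(e2fca(supply, demand, travel_time, zones))
```

In this toy case location `a` sits in the nearest zone and `b` in the second, so `a` receives twice the accessibility score of `b` for the same hospital.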
published: 2021-05-14
 
This document contains the Supplemental Materials for Chapter 4: Climate Change Impacts on Agriculture from the report "An Assessment of the Impacts of Climate Change in Illinois" published in 2021.
keywords: Illinois; climate change; agriculture; impacts; adaptation; crop yield; ISAM; econometrics; days suitable for fieldwork
published: 2021-03-10
 
The PhytoplasmasRef_Trivellone_etal.fas fasta file contains the original final sequence alignment used in the phylogenetic analyses of Trivellone et al. (Ecology and Evolution, in review). The 27 sequences (21 phytoplasma reference strains and 6 phytoplasma strains from the present study) were aligned using the Muscle algorithm as implemented in MEGA 7.0 with default settings. The final dataset contains 952 positions of the F2n/R2 fragment of the 16S rRNA gene. The data analyses are further described in the cited original paper.
keywords: Hemiptera; Cicadellidae; Mollicutes; Phytoplasma; biorepository
published: 2021-05-14
 
Please cite as: Jim Miller, Sergiusz Czesny, Qihong Dai, James Ellis, Louis Iverson, Jeff Matthews, Charles Roswell, Cory Suski, John Taft, and Mike Ward. 2021. “Climate Change Impacts on Ecosystems: Scientific and Common Species Names”.
keywords: Scientific names; Common names; Illinois species
published: 2021-05-14
 
Supplemental Forest Data for Chapter 6: Climate Change Impacts on Ecosystems in "An Assessment of the Impacts of Climate Change in Illinois"
published: 2021-04-19
 
Dataset compiled by Yushu Xia and Michelle Wander for the Soil Health Institute. Data were recovered from peer-reviewed literature reporting results for three soil quality indicators (SQIs) (β-glucosidase (BG), fluorescein diacetate (FDA) hydrolysis, and permanganate oxidizable carbon (POXC)) in terms of their relative response to management where soils under grassland cover, no-tillage, cover crops, residue return and organic amendments were compared to conventionally managed controls. Peer-reviewed articles published between January 1990 and May 2018 were searched using the Thomson Reuters Web of Science database (Thomson Reuters, Philadelphia, Pennsylvania) and Google Scholar to identify studies reporting results for: “β-glucosidase”, “permanganate oxidizable carbon”, “active carbon”, “readily oxidizable carbon”, and “fluorescein diacetate hydrolysis”, together with one or more of the following: “management practice”, “tillage”, “cover crop”, “residue”, “organic fertilizer”, or “manure”. Records were tabulated to compare SQI abundance in soil maintained under a control and soil aggrading practice with the intent to contribute to SQI databases that will support development of interpretive frameworks and/or algorithms including pedo-transfer functions relating indicator abundance to management practices and site-specific factors. Meta-data include the following key descriptor variables and covariates useful for development of scoring functions: 1) identifying factors for the study site (location, year of initiation of study and year in which data were reported), 2) soil textural class, pH, and SOC, 3) depth and timing of soil sampling, 4) analytical methods for SQI quantification, 5) units used in published works (e.g., equivalent mass, concentration), 6) SQI abundances, and 7) statistical significance of difference comparisons. *Note: Blank values in tables are considered unreported data.
keywords: Soil health promoting practices; Soil quality indicators; β-glucosidase; fluorescein diacetate hydrolysis; Permanganate oxidizable carbon; Greenhouse gas emissions; Scoring curves; Soil Management Assessment Framework
published: 2021-05-09
 
Raw data and its analysis collected from a trial designed to test the impact of providing a Bacillus-based direct-fed microbial (DFM) on the syndrome resulting from orally infecting pigs with either Salmonella enterica serotype Choleraesuis (S. Choleraesuis) alone, or in combination with an intranasal challenge, three days later, with porcine reproductive and respiratory syndrome virus (PRRSV).
keywords: excel file
published: 2022-04-19
 
This data repository includes the features and the trained backbone parameters used in the ICLR 2022 paper "On the Importance of Firth Bias Reduction in Few-Shot Classification". The code accompanying this data is open-source and available at https://github.com/ehsansaleh/firth_bias_reduction . The code and the data have three modules: 1. The "code_firth" module (10 files) relates to the basic ResNet backbones and logistic classifiers (e.g., Figures 2 and 3 in the main paper). 2. The "code_s2m2rf" module (2 files) relates to the S2M2R feature backbones and cosine classifiers (e.g., Figure 4 in the main paper). 3. The "code_dcf" module (3 files) relates to the few-shot Distribution Calibration (DC) method (e.g., Table 1 in the main paper). The relevant files for each module have the module name as a prefix in their name. 1. For instance, the "code_dcf_features.tar" file should be placed in the "features" directory of the "code_dcf" module. 2. As another example, "code_firth_features_cifarfs_novel.tar" should be placed in the "features" directory of the "code_firth" module, and it includes the features extracted from the novel split of mini-ImageNet dataset. Each tarball should be extracted in its relevant directory, and the md5 checksums of the extracted files are also provided in the open-source code repository for verification. Please note that the actual datasets of images are not included here (since we do not own those datasets). However, helper scripts for automatically downloading the original datasets are also provided in every module and sub-directory of the GitHub code repository.
keywords: Computer Vision; Few-Shot Classification; Few-Shot Learning; Firth Bias Reduction
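Verifying an extracted file against a published md5 checksum, as the description above recommends, can be done with the Python standard library. The sketch below is generic: the helper names are hypothetical, and the expected checksums themselves would come from the code repository, not from this listing.

```python
import hashlib

def md5sum(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected_hex):
    """Return True if the file's md5 digest matches the expected value."""
    return md5sum(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat even for large feature archives; `verify(path, expected)` then gives a simple pass/fail per file.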
published: 2023-03-27
 
This dataset contains the full data used in the paper titled "Enabling High Precision Gradient Index Control in Subsurface Multiphoton Lithography," available at https://doi.org/10.1021/acsphotonics.2c01950 . The data used for Table 1 can be found in the dataset for the related Figure 8. Some supplemental figures' data can be found in the main figures data: Figure S2's data is contained in Figure 6. Figure S4 and Table S1 data is derived from Figure 6. Figure S9 is derived from Figure 7. Figure S10 is contained in Figure 7. Figure S12 is derived from Figure 6 and the Python code prism-fringe-analysis. Figures without a data file named after them do not have any data affiliated with them and are purely graphical representations.
published: 2020-07-16
 
Dataset to be used for the SocialMediaIE tutorial
keywords: social media; deep learning; natural language processing
published: 2020-08-01
 
This data set includes information used to determine patterns of mixing at three small confluences in East Central Illinois based on differences in the temperature or turbidity of the two confluent flows.
keywords: mixing; confluences; flow structure
published: 2020-10-15
 
This dataset consists of various input data that are used in the GAMS model. All the data are in the format of .inc which can be read within GAMS or Notepad. Main data sources include: acreage data (acre), crop budget data ($/acre), crop yield data (e.g. bushel/acre), Soil carbon sequestration data (KgCO2/ha/yr). Model details can be found in the "Assessing the Additional Carbon Savings with Biofuel" and GAMS model package. ## File Description (1) GAMS Model.zip: This includes all the input files and scripts for running the model (2) Table*.csv: These files include the data from the tables in the manuscript (3) Figure2_3_4.csv: This contains the data used to create the figures in the manuscript (4) BaselineResults.csv: This includes a summary of the model results. (5) SensitivityResults_*.csv: Model results from the various sensitivity analyses performed (6) LUC_emission.csv: land use change emissions by crop reporting district for changes of pasturelands to annual crops.
keywords: Biogenic carbon intensity; Corn ethanol; Economic model; Dynamic optimization; Anticipated baseline approach; Life cycle carbon intensity
published: 2020-10-14
 
Data on permanent plots at Fortuna and the Panama Canal Watershed, Republic of Panama, containing counts and percent of trees with multiple stems >10 cm diameter, with and without palms. Accompanying environmental data include elevation, precipitation, soil type and soil chemical variables (pH, total N, NO3, NO4, resin P, Mehlich Ca, K and Mg).
keywords: multiple stems; resprouting; Panama Canal Watershed; Fortuna Forest Reserve