published: 2020-08-01
 
The Empoascini_morph_data.nex text file contains the original data used in the phylogenetic analyses of Xu et al. (Systematic Entomology, in review). The file follows the standard NEXUS format used by many phylogenetic analysis software packages, so any program that recognizes NEXUS can parse it automatically. The first nine lines of the file indicate the file type (Nexus), that 110 taxa and a total of 99 characters were analyzed, the format of the data, and the symbols used in the dataset to indicate different character states. For species that have more than one state for a particular character, the states are enclosed in square brackets. Question marks represent missing data. The PDF file, Appendix1.pdf, describes the morphological characters and character states that were scored in the dataset. The data analyses are described in the cited original paper.
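As a rough illustration of the layout described above, the following Python sketch reads the taxon and character counts from a NEXUS header and splits one matrix row into per-character entries. The header text and the sample row are assumptions written for this example (a typical NEXUS DATA block), not lines copied from Empoascini_morph_data.nex, and the function names are hypothetical.

```python
import re

def read_dimensions(nexus_text):
    """Return (ntax, nchar) from a DIMENSIONS statement.

    Assumes the common 'DIMENSIONS NTAX=... NCHAR=...;' ordering.
    """
    match = re.search(
        r"dimensions\s+ntax\s*=\s*(\d+)\s+nchar\s*=\s*(\d+)",
        nexus_text,
        re.IGNORECASE,
    )
    if match is None:
        raise ValueError("no DIMENSIONS statement found")
    return int(match.group(1)), int(match.group(2))

def split_states(row):
    """Split a matrix row into one entry per character.

    A bracketed group such as [12] is kept together as a single
    polymorphic entry; '?' is kept as the missing-data symbol.
    """
    return re.findall(r"\[[^\]]*\]|\S", row)

# Assumed example header, modeled on the description above.
SAMPLE_HEADER = """#NEXUS
BEGIN DATA;
DIMENSIONS NTAX=110 NCHAR=99;
FORMAT DATATYPE=STANDARD MISSING=? SYMBOLS="0123456";
"""

ntax, nchar = read_dimensions(SAMPLE_HEADER)
cells = split_states("01[12]?0")  # hypothetical five-character row
```

In this sketch a polymorphic species contributes one bracketed entry rather than several single-state entries, so the number of entries per row should still equal the character count.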
keywords: Hemiptera; Cicadellidae; morphology; biogeography; evolution
published: 2020-12-02
 
The dataset includes survey results on farmers’ perceptions of marginal land availability and the likelihood of a land pixel being marginal, based on a machine learning model trained on the survey. Two spreadsheet files contain the farmer and farm characteristics (marginal_land_survey_data_shared.xlsx) and the existing land use of marginal lands (land_use_info_sharing.xlsx). <b>Note:</b> blank cells in these two spreadsheets indicate missing values in the survey responses. The GeoTiff file includes two bands: one gives the marginal land likelihood in the Midwestern states (0-1), and the other gives the dominant reason for land marginality (0-5; 0 for farm size, 1 for growing season precipitation, 2 for root zone soil water capacity, 3 for average slope, 4 for growing season mean temperature, and 5 for growing season diurnal range of temperature). To read the data, please use GIS software such as ArcGIS or QGIS.
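For readers working with the raster values outside a GIS, the coding of the two bands can be sketched in Python as below. This is a minimal sketch: it assumes the pixel values have already been read into plain numbers by some raster library, and the function name is hypothetical. Only the 0-1 range and the 0-5 code table come from the description above.

```python
# Code table for the "dominant reason" band, as listed in the description.
MARGINALITY_REASONS = {
    0: "farm size",
    1: "growing season precipitation",
    2: "root zone soil water capacity",
    3: "average slope",
    4: "growing season mean temperature",
    5: "growing season diurnal range of temperature",
}

def describe_pixel(likelihood, reason_code):
    """Turn one pixel's two band values into a readable label."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood is expected to lie in the range 0-1")
    if reason_code not in MARGINALITY_REASONS:
        raise ValueError("reason code is expected to be an integer 0-5")
    reason = MARGINALITY_REASONS[reason_code]
    return f"likelihood {likelihood:.2f}, dominant reason: {reason}"

label = describe_pixel(0.8, 3)
```

Which band index holds which variable is not stated in the description, so that should be checked in the file metadata before applying the table.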
keywords: marginal land; survey
published: 2021-01-04
 
This dataset contains the emulated global multi-model urban climate projections under RCP 8.5 and RCP 4.5 used in the article "Global multi-model projections of local urban climates" (https://www.nature.com/articles/s41558-020-00958-8). Details about this dataset and the local urban climate emulator are described in the article. This dataset documents the monthly mean projections of urban temperatures and urban relative humidity of 26 CMIP5 Earth system models (ESMs) from 2006 to 2100 across the globe. This dataset may be useful for multiple communities regarding urban climate change, impacts, vulnerability, risks, and adaptation applications.
keywords: Urban climate; multi-model climate projections; CMIP; urban warming; heat stress
published: 2020-12-29
 
Three datasets: species_abundance_data, species_traits, and environmental_data. The three datasets were collected in the Fortuna Forest Reserve (8°45′ N, 82°15′ W) and Palo Seco Protected Forest (8°45′ N, 82°13′ W) located in western Panama. The two reserves support humid to super-humid rainforests, according to Holdridge (1947). The species_abundance_data and species_traits datasets were collected across 15 subplots of 25 m2 in 12 one-hectare permanent plots distributed across the two reserves. The subplots were spaced 20 m apart along three 5 m wide transects, each 30 m apart. Please read Prada et al. (2017) for details on the environmental characteristics of the study area. Prada CM, Morris A, Andersen KM, et al (2017) Soils and rainfall drive landscape-scale changes in the diversity and functional composition of tree communities in a premontane tropical forest. J Veg Sci 28:859–870. https://doi.org/10.1111/jvs.12540
keywords: functional traits; plants; ferns; environmental data; Fortuna; species data; community ecology
published: 2020-12-16
 
Terrorism is among the most pressing challenges to democratic governance around the world. The Responsible Terrorism Coverage (or ResTeCo) project aims to address a fundamental dilemma facing 21st century societies: how to give citizens the information they need without giving terrorists the kind of attention they want. The ResTeCo hopes to inform best practices by using extreme-scale text analytic methods to extract information from more than 70 years of terrorism-related media coverage from around the world and across 5 languages. Our goal is to expand the available data on media responses to terrorism and enable the development of empirically-validated models for socially responsible, effective news organizations. This particular dataset contains information extracted from terrorism-related stories in the Summary of World Broadcasts published between 1979 and 2019. It includes variables that measure the relative share of terrorism-related topics, the valence and intensity of emotional language, as well as the people, places, and organizations mentioned. This dataset contains 3 files: 1. "ResTeCo Project SWB Dataset Variable Descriptions.pdf" A detailed codebook containing a summary of the Responsible Terrorism Coverage (ResTeCo) Project BBC Summary of World Broadcasts (SWB) Dataset and descriptions of all variables. 2. "resteco-swb.csv" This file contains the data extracted from terrorism-related media coverage in the BBC Summary of World Broadcasts (SWB) between 1979 and 2019. It includes variables that measure the relative share of topics, sentiment, and emotion present in this coverage. There are also variables that contain metadata and list the people, places, and organizations mentioned in these articles. There are 53 variables and 438,373 observations. The variable "id" uniquely identifies each observation. Each observation represents a single news article. Please note that care should be taken when using "resteco-swb.csv". 
The file may not be suitable for use in a spreadsheet program like Excel because some of the values are quite large. Excel cannot handle some of these large values, which may cause the data to appear corrupted within the software. Users are encouraged to work with the data in a statistical package or language such as Stata, R, or Python so that the structure and quality of the data are preserved. 3. "README.md" This file contains useful information for the user about the dataset. It is a text file written in Markdown. Citation Guidelines: 1) To cite this codebook please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project BBC Summary of World Broadcasts (SWB) Dataset Variable Descriptions. Responsible Terrorism Coverage (ResTeCo) Project BBC Summary of World Broadcasts (SWB) Dataset. Cline Center for Advanced Social Research. December 16. University of Illinois Urbana-Champaign. doi: https://doi.org/10.13012/B2IDB-2128492_V1 2) To cite the data please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project Summary of World Broadcasts (SWB) Dataset. Cline Center for Advanced Social Research. December 16. University of Illinois Urbana-Champaign. doi: https://doi.org/10.13012/B2IDB-2128492_V1
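One way to follow the advice about avoiding spreadsheet software is to stream the CSV with Python's standard library and keep every value as text, so nothing is rounded or truncated on load. The sketch below is only an illustration: the sample rows and the "text" column are made up for the example, and only the "id" column name comes from the description above.

```python
import csv
import io

def iter_rows(handle):
    """Yield each row as a dict of strings, leaving values untouched."""
    # Raise the per-field size cap so very long text fields are accepted.
    csv.field_size_limit(10**9)
    yield from csv.DictReader(handle)

def count_unique_ids(handle):
    """Return (number of rows, number of distinct 'id' values)."""
    ids = set()
    n_rows = 0
    for row in iter_rows(handle):
        n_rows += 1
        ids.add(row["id"])  # kept as a string, never converted to float
    return n_rows, len(ids)

# Hypothetical two-row sample standing in for the real file.
sample = io.StringIO(
    'id,text\n1000000000000000001,"a"\n1000000000000000002,"b"\n'
)
rows, unique = count_unique_ids(sample)
```

With the real file, the same functions could be called on `open("resteco-swb.csv", newline="", encoding="utf-8")`, where the encoding is an assumption; if "id" uniquely identifies each observation, the two returned counts should match.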
keywords: Terrorism, Text Analytics, News Coverage, Topic Modeling, Sentiment Analysis
published: 2020-12-16
 
Terrorism is among the most pressing challenges to democratic governance around the world. The Responsible Terrorism Coverage (or ResTeCo) project aims to address a fundamental dilemma facing 21st century societies: how to give citizens the information they need without giving terrorists the kind of attention they want. The ResTeCo hopes to inform best practices by using extreme-scale text analytic methods to extract information from more than 70 years of terrorism-related media coverage from around the world and across 5 languages. Our goal is to expand the available data on media responses to terrorism and enable the development of empirically-validated models for socially responsible, effective news organizations. This particular dataset contains information extracted from terrorism-related stories in the Foreign Broadcast Information Service (FBIS) published between 1995 and 2013. It includes variables that measure the relative share of terrorism-related topics, the valence and intensity of emotional language, as well as the people, places, and organizations mentioned. This dataset contains 3 files: 1. "ResTeCo Project FBIS Dataset Variable Descriptions.pdf" A detailed codebook containing a summary of the Responsible Terrorism Coverage (ResTeCo) Project Foreign Broadcast Information Service (FBIS) Dataset and descriptions of all variables. 2. "resteco-fbis.csv" This file contains the data extracted from terrorism-related media coverage in the Foreign Broadcast Information Service (FBIS) between 1995 and 2013. It includes variables that measure the relative share of topics, sentiment, and emotion present in this coverage. There are also variables that contain metadata and list the people, places, and organizations mentioned in these articles. There are 53 variables and 750,971 observations. The variable "id" uniquely identifies each observation. Each observation represents a single news article. 
Please note that care should be taken when using "resteco-fbis.csv". The file may not be suitable for use in a spreadsheet program like Excel because some of the values are quite large. Excel cannot handle some of these large values, which may cause the data to appear corrupted within the software. Users are encouraged to work with the data in a statistical package or language such as Stata, R, or Python so that the structure and quality of the data are preserved. 3. "README.md" This file contains useful information for the user about the dataset. It is a text file written in Markdown. Citation Guidelines: 1) To cite this codebook please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project Foreign Broadcast Information Service (FBIS) Dataset Variable Descriptions. Responsible Terrorism Coverage (ResTeCo) Project Foreign Broadcast Information Service (FBIS) Dataset. Cline Center for Advanced Social Research. December 16. University of Illinois Urbana-Champaign. doi: https://doi.org/10.13012/B2IDB-6360821_V1 2) To cite the data please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project Foreign Broadcast Information Service (FBIS) Dataset. Cline Center for Advanced Social Research. December 16. University of Illinois Urbana-Champaign. doi: https://doi.org/10.13012/B2IDB-6360821_V1
keywords: Terrorism, Text Analytics, News Coverage, Topic Modeling, Sentiment Analysis
published: 2020-12-16
 
The Cline Center Global News Index is a searchable database of textual features extracted from millions of news stories, specifically designed to provide comprehensive coverage of events around the world. In addition to searching documents for keywords, users can query metadata and features such as named entities extracted using Natural Language Processing (NLP) methods and variables that measure sentiment and emotional valence. Archer is a web application purpose-built by the Cline Center to enable researchers to access data from the Global News Index. Archer provides a user-friendly interface for querying the Global News Index (with the back-end indexing still handled by Solr). By default, queries are built using icons and drop-down menus. More technically-savvy users can use Lucene/Solr query syntax via a ‘raw query’ option. Archer allows users to save and iterate on their queries, and to visualize faceted query results, which can be helpful for users as they refine their queries. <b>Additional Resources:</b> - Access to Archer and the Global News Index is limited to account-holders. If you are interested in signing up for an account, you can fill out the <a href="https://forms.gle/oaUWRSSCkqKxyY5T7"><b>Archer User Information Form</b></a>. - Current users who would like to provide feedback, such as reporting a bug or requesting a feature, can fill out the <a href="https://forms.gle/6eA2yJUGFMtj5swY7"><b>Archer User Feedback Form</b></a>. - The Cline Center sends out periodic email newsletters to the Archer Users Group. Please fill out this form to <a href="https://groups.webservices.illinois.edu/subscribe/123172"><b>subscribe to Archer Users Group</b></a>. <b>Citation Guidelines:</b> 1) To cite the GNI codebook (or any other documentation associated with the Global News Index and Archer) please use the following citation: Cline Center for Advanced Social Research. 2020. Global News Index and Extracted Features Repository [codebook], v1.0.1. 
Champaign, IL: University of Illinois. Dec. 16. doi:10.13012/B2IDB-5649852_V2 2) To cite data from the Global News Index (accessed via Archer or otherwise) please use the following citation (filling in the correct date of access): Cline Center for Advanced Social Research. 2020. Global News Index and Extracted Features Repository [database], v1.0.1. Champaign, IL: University of Illinois. Dec. 16. Accessed Month, DD, YYYY. doi:10.13012/B2IDB-5649852_V2
keywords: Cline Center; Cline Center for Advanced Social Research; political; social; political science; Global News Index; Archer; news; mass communication; journalism
published: 2020-12-15
 
The dataset consists of results and various input data used in the GAMS model for the publication "Repeal of the Clean Power Plan: Social Cost and Distributional Implications". All data are either Excel files or .inc files, which can be read in GAMS or in a text editor such as Notepad. The main data sources cover agriculture, transportation, and electricity. Model details can be found in the paper and the GAMS model package.
keywords: carbon abatement; welfare cost; electricity sector; partial equilibrium model
published: 2020-12-14
 
Femoral skeletal traits (cross-sectional properties, maximum distal metaphyseal breadth of the femur, and maximum superior/inferior femoral head diameter) of 219 Taiwanese subadult individuals (aged 0 to 17) as used in the manuscript "Allometric scaling and growth: evaluation and applications in subadult body mass estimation."
keywords: femur; cross-sectional geometry; osteometrics; subadult
published: 2020-04-22
 
Data on Croatian restaurant allergen disclosures on restaurant websites, on-line menus and social media comments
keywords: restaurant; allergen; disclosure; tourism
published: 2020-12-12
 
Dataset associated with Jones et al FE-2019-01175 submission: Does the size and developmental stage of traits at fledging reflect juvenile flight ability among songbirds? Excel CSV files with all of the data used in analyses and file with descriptions of each column. The flight ability variable in this dataset was derived from fledgling drop tests, examples of which can be found in the related dataset: Jones, Todd M.; Benson, Thomas J.; Ward, Michael P. (2019): Flight Ability of Juvenile Songbirds at Fledgling: Examples of Fledgling Drop Tests. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2044905_V1.
keywords: body condition; fledgling; flight ability; locomotor ability; post-fledging; songbirds; wing development; wing emergence
published: 2020-12-07
 
This page contains the data for the publication "Regulation of growth and cell fate during tissue regeneration by the two SWI/SNF chromatin-remodeling complexes of Drosophila" published in Genetics, 2020
published: 2020-12-03
 
This small dataset contains raw anthropometric and dietary intake data.
keywords: Obesity treatment; weight management; high protein; high fiber; nonrestrictive; data visualization; self-empowerment; informed decision making
published: 2020-12-01
 
This is the data set from the published manuscript 'Vertebrate scavenger guild composition and utilization of carrion in an East Asian temperate forest' by Inagaki et al.
keywords: Japan; Sika Deer
published: 2020-11-20
 
This data set explores the effect of the cyanobacterial gene ictB on photosynthesis in sorghum, under both normal greenhouse growing temperatures (32 C / 25 C) and during and after an 8 day chilling stress (10 C / 5 C). IctB is a cyanobacterial gene of unknown function, which was initially thought to be involved in inorganic carbon transport into cells. While ictB is known now not to be an independently active carbon transporter in its own right, it may play a role in passive diffusion of metabolites. This transgene was introduced into sorghum by the lab of Thomas Clemente, through Agrobacterium mediated transformation, alone and in combination with the tomato sedoheptulose-1,7-bisphosphatase (SBPase) gene. Eleven events (six double construct and five single construct ictB) were involved in this study. SBPase was included because some previous experiments in C3 species and some previous modeling work, as well as its position at a metabolic branch point, indicates it plays a role as a control point for photosynthesis. A chilling treatment was included because chilling is one of the most serious ecological factors limiting the range of C4 species. Data includes gene expression, metabolomics (at normal growing temperature), SBPase enzyme activity, biomass and photosynthetic traits at both warm temperature and during and after chilling stress. ----------------- EXPLANATORY NOTES FOR ICTB/SBPASE SORGHUM MANUSCRIPT Data are organized into 10 worksheets, representing an expected 10 tables that will serve a supplementary role in the final publication. These include data on gene expression, metabolomics (at normal growing temperature), SBPase enzyme activity, biomass and photosynthetic traits at both warm temperature and during and after chilling stress. <i><b>Tables are as follows:</i></b> 1. Event_Code: for Table S1. Event codes for events and constructs. Two constructs were generated for this study, and numerous transgenic “events” (i.e. 
independent transformations) were carried out for each construct. A construct represents the actual vector which was introduced into the plants (complete with promoter, gene of interest, marker gene, etc.) while an event represents a single successful introduction of the transgene. Events are uniquely labeled with letter and number strings but also with a four-digit number for ease of reference, this table explains which event corresponds to each four-digit number. 2. Photosynthetic_Data: for Table S2. Photosynthetic data at greenhouse growing temperature, for ictB single construct, ictB/SBPase double construct, and wild type lines. Five ictB and six ictB/SBPase events were included. Greenhouse growing temperature was approximately 32 °C and 25 °C night. Photosynthetic parameters were measured using a Licor 6400-XT, and included parameters related to carbon dioxide uptake, water loss, and chlorophyll fluorescence. 3. Chilling_Treatment: for Table S3. Photosynthetic response to chilling treatment, for ictB single construct, and wild type lines. Four ictB events were included. Chilling treatment lasted approximately 8 days and began either 3.5 or 5.5 weeks after transplanting the plants (chilling was done in two batches). Chilling treatment involved temperatures of 10 °C day / 7 °C night in growth chambers. Photosynthetic parameters were measured at several time points during and after the chilling treatment, were measured using a Licor 6400-XT, and included parameters related to carbon dioxide uptake, water loss, and chlorophyll fluorescence. 4. SBPase_Activity: for Table S4. SBPase activity in double construct plants. These data measure in vitro substrate-saturated activity of SBPase in desalted extracts from leaf tissues, at 25 °C. Units are micromoles of SBP processed per second per m2 of leaf tissue. Five ictB/SBPase events were included. 5. 2014_gene_exp: for Table S5. Gene expression in 2014 experiment (units of cycle times). 
These data measure cycle times to threshold, relative to reference genes, for expression of ictB and SBPase. Six ictB single construct events and five ictB/SBPase double construct events were included. Cycle times to threshold relative to reference genes (ΔCT) are inversely related to number of transcripts relative to reference genes, as follows: ΔCT = -log2(N_ictB / N_reference) / (1 + log2(b)), where b = efficiency of replication. 6. 2016_gene_exp: for Table S6. Gene expression in 2016 experiment (units of cycle times). These data measure cycle times to threshold, relative to reference genes, for expression of ictB and SBPase. Six ictB single construct events and five ictB/SBPase double construct events were included. Cycle times to threshold relative to reference genes (ΔCT) are inversely related to number of transcripts relative to reference genes, as follows: ΔCT = -log2(N_ictB / N_reference) / (1 + log2(b)), where b = efficiency of replication. 7. Metabolites: for Table S7. Levels of 267 metabolites in leaf tissue. Four ictB single construct events and four ictB/SBPase double construct events were included in these analyses. Metabolites were measured in methanol-extracted samples, either by liquid chromatography / mass spectrometry or by gas chromatography / mass spectrometry, and were compared between events on a relative basis. As quantification was relative to wild type rather than on an absolute basis, no units are included. 8. Metabolite_F_values: for Table S8. F values for effects of ictB, SBPase (in cases where the model was better with a SBPase effect) and event. These analyses are done for each metabolite included in Table S7, and show effects of the explanatory variables ictB, SBPase, and individual event. 9. Biomass_2020: for Table S9. Biomass and grain yield at harvest, for ictB, ictB/SBPase and wild type sorghum plants in spring 2020. Four ictB/SBPase double construct and four ictB single construct events were included. 10. Biomass_2017: for Table S10. 
Biomass and grain yield at harvest, in chilled and non-chilled sorghum plants containing the ictB transgene (along with wild type controls) in fall 2017. Four ictB single construct events were included. Chilling treatment involved temperatures of 10 °C day / 7 °C night in growth chambers. <i><b>All the variables in the file are explained as below:</i></b> o Type (IctB-SBPase and IctB). This refers to whether a plant is wild type, single construct (contains only the ictB transgene) or double construct (contains both the ictB and SBPase transgenes). o Code: these codes are shorter labels to refer to each transgene event for the sake of convenience. o Alternate_Code: these codes are shorter labels to refer to each transgene event for the sake of convenience. o Event Number: these are unique labels for each transgenic events. o Construct Number: these are labels for each transgenic construct (either the ictB single construct or the ictB/SBPase double construct). o year (i): this refers to the year in which the study was conducted (2014, 2016, 2017, or 2020) o transgene or Transgenic: whether the transgene was present o construct or Type : whether the ictB or the ictB/SBPase construct was present (double, single, wildtype): o temp: leaf temperature during the measurement o A: carbon assimilation rate, in μmol m-2 s-1 o gs: stomatal conductance, in mol m-2 s-1 o CI: intercellular carbon dioxide concentration, in parts per million or μL L-1 o fvfm:FV’/FM’ (maximal potential photosystem II quantum yield under light adapted conditions), dimensionless ratio o phipsill: ΦPSII (maximal potential photosystem II quantum yield under light adapted conditions), dimensionless ratio o qP: photochemical quenching, i.e. ratio of ΦPSII to FV’/FM’ , dimensionless ratio o iwue: intrinsic water use efficiency, i.e. 
ratio of carbon assimilation rate to stomatal conductance, in units of μmol mol-1 o event: individual transgenic / transformation event o Vmax: substrate-saturated in vitro activity of the SBPase enzyme, in μmol m-2 s-1 o ID: identification number of sample o ΔCT1: difference in cycle times to threshold during gene expression (quantitative PCR) assay, between ictB and the reference gene GAPDH, in units of cycles o ΔCT2: cycle times to threshold during gene expression (quantitative PCR) assay, between SBPase and the reference gene GAPDH, in units of cycles o GAPDH: cycle times to threshold for the reference gene GAPDH (glyceraldehyde phosphate dehydrogenase) o IctB: cycle times to threshold for the gene of interest ictB o SBPase: cycle times to threshold for the gene of interest SBPase o v1 to v267 represent individual metabolite (see the heading immediately above the labels v1, v2, etc.). Variables v268-v272 refer to total (summed) metabolite levels for particular pathways of interest. o leaf: Leaf and stem dry biomass (in grams) o seed: Seedhead dry biomass (in grams) o biomass: Total (leaf, stem + seed head) dry biomass (in grams) o harvind: ratio of seed head dry biomass to total dry biomass o treatment (chilled and nonchilled): “Chilled” plants were grown under warm greenhouse conditions (32 °C day / 25 °C night) for 6 or 8 weeks, then switched to chilling temperatures under growth chamber conditions (10 °C / 7 °C night) for 8 days, and were then returned to greenhouse growing conditions. -----------------
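Several of the variables listed above are defined as ratios of other variables (iwue, qP, harvind). The Python sketch below restates those three definitions as functions, which may help when checking the worksheets for internal consistency. The function names and the example numbers are hypothetical; only the definitions and units come from the variable descriptions.

```python
def intrinsic_wue(a, gs):
    """iwue: carbon assimilation A (umol m-2 s-1) divided by
    stomatal conductance gs (mol m-2 s-1), giving umol mol-1."""
    return a / gs

def photochemical_quenching(phi_psii, fv_fm):
    """qP: ratio of PhiPSII to Fv'/Fm' (dimensionless)."""
    return phi_psii / fv_fm

def harvest_index(leaf, seed):
    """harvind: seed head dry biomass over total dry biomass,
    where total = leaf and stem biomass + seed head biomass (grams)."""
    biomass = leaf + seed
    return seed / biomass

# Made-up values, used only to show the arithmetic.
iwue = intrinsic_wue(30.0, 0.25)
qp = photochemical_quenching(0.3, 0.5)
harvind = harvest_index(60.0, 40.0)
```

Recomputing these ratios from the A, gs, leaf, and seed columns and comparing them with the stored iwue, qP, and harvind values is one possible sanity check; small differences could arise from rounding in the spreadsheet.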
keywords: ictB; SBPase; photosynthesis; sorghum; chilling
published: 2020-11-25
 
Video recorded by Louise Barker using a Canon PowerShot camera documents late-season combat behavior in Agkistrodon contortrix. Recorded in Beaufort County, North Carolina, 11.1 km SE of downtown Washington on 21 October 2020.
keywords: Agkistrodon contortrix; combat; mating; reproduction; copperhead; pit viper; Viperidae
published: 2020-11-18
 
These data, obtained from the peer-reviewed literature and a public database, depict the geographic expansion of the black-legged tick (Ixodes scapularis) and human cases of Lyme disease in the midwestern U.S. <b><i>Note</i></b>: An omission in the first version (V1) of the data set required us to update the data. Specifically, we failed to include the data from the article "Caporale DA, Johnson CM, Millard BJ. 2005 Presence of Borrelia burgdorferi (Spirochaetales: Spirochaetaceae) in Southern Kettle Moraine State Forest, Wisconsin, and characterization of strain W97F51. J. Med. Entomol. 42, 457–472". In the second version (V2) of the data, this omission is corrected.
keywords: Lyme disease; Borrelia burgdorferi; Ixodes scapularis; black-legged tick
published: 2020-11-18
 
This is the dataset that accompanies the paper titled "A Dual-Frequency Radar Retrieval of Snowfall Properties Using a Neural Network", submitted for peer review in August 2020. Please see the GitHub repository for the most up-to-date data after the revision process: https://github.com/dopplerchase/Chase_et_al_2021_NN Authors: Randy J. Chase, Stephen W. Nesbitt and Greg M. McFarquhar Corresponding author: Randy J. Chase (randyjc2@illinois.edu) Here we have the data used in the manuscript. Please email me if you have specific questions about units etc. 1) DDA/GMM database of scattering properties: base_df_DDA.csv This is the combined dataset from the following papers: Leinonen & Moisseev, 2015; Leinonen & Szyrmer, 2015; Lu et al., 2016; Kuo et al., 2016; Eriksson et al., 2018. The column names are D: Maximum dimension in meters, M: particle mass in grams kg, sigma_ku: backscatter cross-section at ku in m^2, sigma_ka: backscatter cross-section at ka in m^2, sigma_w: backscatter cross-section at w in m^2. The first column is just an index column. 2) Synthetic data used to train and test the neural network: Unrimed_simulation_wholespecturm_train_V2.nc, Unrimed_simulation_wholespecturm_test_V2.nc This was the result of combining the PSDs and DDA/GMM particles randomly to build the training and test dataset. 3) Notebook for training the network using the synthetic database and Google Colab (TensorFlow): Train_Neural_Network_Chase2020.ipynb This is the notebook used to train the neural network. 4) Trained TensorFlow neural network: NN_6by8.h5 This is the HDF5 TensorFlow model that resulted from the training. You will need this to run the retrieval. 5) Scalers needed to apply the neural network: scaler_X_V2.pkl, scaler_y_V2.pkl These are the sklearn scalers used in training the neural network. You will need these to scale your data if you wish to run the retrieval. 6) <b>New in this version</b> - Example notebook of how to run the trained neural network on Ku- and Ka-band observations. 
We showed this with the 3rd case in the paper: Run_Chase2021_NN.ipynb 7) <b>New in this version</b> - APR data used to show how to run the neural network retrieval: Chase_2021_NN_APR03Dec2015.nc The data for the analysis on the observations are not provided here because of the size of the radar data. Please see the GHRC website (<a href="https://ghrc.nsstc.nasa.gov/home/">https://ghrc.nsstc.nasa.gov/home/</a>) if you wish to download the radar and in-situ data or contact me. We can coordinate transferring the exact datafiles used. The GPM-DPR data are available here: <a href="http://dx.doi.org/10.5067/GPM/DPR/GPM/2A/05">http://dx.doi.org/10.5067/GPM/DPR/GPM/2A/05</a>
published: 2020-11-14
 
Dataset includes temperature data (local average April daily temperatures), first egg dates and reproductive output of Prothonotary Warblers breeding in southernmost Illinois, USA. Also included are arrival dates for warblers returning to breeding grounds from wintering grounds, and global temperature anomaly data for comparison with local temperatures. These data were used in the manuscript entitled "Warmer April Temperatures on Breeding Grounds Promote Earlier Nesting in a Long-Distance Migratory Bird, the Prothonotary Warbler" published in Frontiers in Ecology and Evolution. A rich text file is included with explanations of each variable in the dataset.
keywords: first egg dates; global warming; local temperature effects; long-distance migratory bird; prothonotary warbler; protonotaria citrea; reproductive output
published: 2020-11-06
 
This dataset contains bam files and transcripts for the simulated instances generated for the paper 'JUMPER: Discontinuous Transcript Assembly in SARS-CoV-2' submitted for RECOMB 2021. The folder 'bam' contains the simulated bam files aligned using STAR, while the reads were generated using the polyester method. Note: in the readme file, close to the end of the document, please ignore this sentence: 'Those files can be opened by using [name of software].'
keywords: transcript assembly; SARS-CoV-2; discontinuous transcription; coronaviruses
published: 2020-11-05
 
This version 2 dataset contains 34 files in total, including one (1) additional file called "Culture-dependent Isolate table with taxonomic determination and sequence data.csv". The remaining files (33) are identical to version 1. The following is the information about the new file and its variables: <b>Culture-dependent Isolate table with taxonomic determination and sequence data.csv</b>: Culture table with assigned taxonomy from NCBI. A single-direction sequence for each isolate is included if one could be obtained. The sequence is derived from ITS1F-ITS4 PCR amplicons, with Sanger sequencing in one direction using ITS5. The file contains 20 variables, explained as follows: IsolateNumber: unique number identifying each isolate cultured; Time: season in which the sample was collected; Location: the specific name of the location; Habitat: type of habitat, either stream or peatland; State: state in the USA in which the specific location is located; Incubation_pH ID: pH of the medium during isolation of fungal cultures; Genus: phylogenetic genus of the fungal isolates (determined by sequence similarity); Sequence_quality: base call quality of the entire sequence used for blast analysis, if known; %_coverage: sequence coverage reported from GenBank; %_ID: sequence similarity reported from GenBank; Life_style: ecological life style, if known; Phylum: phylogenetic phylum as indicated by Index Fungorum; Subphylum: phylogenetic subphylum as indicated by Index Fungorum; Class: phylogenetic class as indicated by Index Fungorum; Subclass: phylogenetic subclass as indicated by Index Fungorum; Order: phylogenetic order as indicated by Index Fungorum; Family: phylogenetic family as indicated by Index Fungorum; ITS5_Sequence: single-direction sequence used for sequence similarity match using blastn (primer ITS5); Fasta: sequence with nomenclature in a fasta format for easy cut and paste into phylogenetic software. Note: blank cells mean no data is available or unknown.
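To show what a fasta-formatted entry built from these columns could look like, here is a minimal Python sketch. The header layout (">isolate_<number>_<genus>") is an assumption made for the example; the file's own Fasta column already supplies entries with the authors' nomenclature, which may differ. The isolate number, genus, and sequence are made-up values.

```python
def to_fasta(isolate_number, genus, sequence, width=60):
    """Format one isolate as a FASTA record.

    The sequence is wrapped at `width` characters per line; the
    header naming scheme here is hypothetical.
    """
    header = f">isolate_{isolate_number}_{genus}"
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + lines) + "\n"

# Made-up 80-base sequence, standing in for an ITS5_Sequence value.
record = to_fasta(12, "Penicillium", "ACGT" * 20)
```

Rows with a blank ITS5_Sequence cell (no sequence obtained) would need to be skipped before calling such a function.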
keywords: ITS1 forward reads; Illumina; peatlands; streams; bogs; fens
published: 2020-07-15
 
This repository includes scripts and datasets for Chapter 6 of my PhD dissertation, "Supertree-like methods for genome-scale species tree estimation," that had not been published previously. This chapter is based on the article: Molloy, E.K. and Warnow, T. "FastMulRFS: Fast and accurate species tree estimation under generic gene duplication and loss models." Bioinformatics, In press. https://doi.org/10.1093/bioinformatics/btaa444. The results presented in my PhD dissertation differ from those in the Bioinformatics article because I re-estimated species trees using FastMulRFS and MulRF on the same datasets in the original repository (https://doi.org/10.13012/B2IDB-5721322_V1). To re-estimate species trees, (1) a seed was specified when running MulRF, and (2) a different script (specifically preprocess_multrees_v3.py from https://github.com/ekmolloy/fastmulrfs/releases/tag/v1.2.0) was used for preprocessing gene trees (which were then given as input to MulRF and FastMulRFS). Note that this preprocessing script is a re-implementation of the original algorithm for improved speed (a bug fix was also implemented). Finally, it was brought to my attention that the simulation in the Bioinformatics article differs from prior studies because I scaled the species tree by 10 generations per year (instead of 0.9 years per generation, which is ~1.1 generations per year). I re-simulated datasets (true-trees-with-one-gen-per-year-psize-10000000.tar.gz and true-trees-with-one-gen-per-year-psize-50000000.tar.gz) using 0.9 years per generation to quantify the impact of this parameter change (see my PhD dissertation or the supplementary materials of the Bioinformatics article for discussion).
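The difference between the two generation-time settings can be illustrated with a short calculation. This is only a sketch under the assumption that a branch length in generations equals its length in years multiplied by generations per year; the function names and the example branch length of 1,000,000 years are hypothetical and are not taken from the simulation scripts.

```python
def generations_per_year(years_per_generation):
    """Convert a generation time (years per generation) to a rate."""
    return 1.0 / years_per_generation


def years_to_generations(years, gens_per_year):
    """Scale a branch length in years to a length in generations."""
    return years * gens_per_year


# Setting used in prior studies: 0.9 years per generation (~1.1 gen/year).
prior_rate = generations_per_year(0.9)
# Setting used in the Bioinformatics article: 10 generations per year.
article_rate = 10.0

# Hypothetical branch length, chosen only for illustration.
branch_years = 1_000_000
prior_generations = years_to_generations(branch_years, prior_rate)
article_generations = years_to_generations(branch_years, article_rate)

# The article's setting makes every branch about 9 times longer in generations.
scale_ratio = article_generations / prior_generations
print(round(prior_rate, 2), round(scale_ratio, 2))
```

Under this assumption, the same tree measured in years spans roughly nine times as many generations with the article's setting as with the setting of 0.9 years per generation.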
keywords: Species tree estimation; gene duplication and loss; statistical consistency; MulRF; FastRFS
published: 2020-08-01
 
This data set shows how density effects have an important influence on mixing at a small river confluence. The data consist of results of simulations using a detached eddy simulation model.
keywords: confluence; flow dynamics; density effects
published: 2020-08-25
 
The Allan Lab has published a Fluidigm pipeline online at https://github.com/HPCBio/allan-fluidigm-pipeline. The GitHub repository includes a tutorial for running the pipeline; however, it does not have test datasets yet. This tarball, hosted at the Illinois Data Bank, is the dataset that completes the GitHub tutorial. It includes inputs (a custom database of tick pathogens and Fluidigm raw reads) and output files (tables of samples with taxonomic classifications).
keywords: custom database of tick pathogens; fluidigm pipeline; fluidigm paired reads; fluidigm tutorial
published: 2020-08-31
 
This dataset contains BEPAM model code and input data to replicate the outcomes for "The Economic and Environmental Costs and Benefits of the Renewable Fuel Standard". The dataset consists of: (1) the replication code and data for the BEPAM model (BEPAM-Social cost model-ERL.zip), in which the code file is named output.gms; and (2) simulation results from the BEPAM model (BEPAM_Simulation_Results.csv). Note: item (1) is in GAMS format; item (2) is in text format.
keywords: Social Cost of Carbon; Social Cost of Nitrogen; Cost-Benefit Analysis; Indirect Land-Use Change