planned publication date: 2022-01-01

The file “Fla.fasta”, comprising 10526 positions, is the concatenated amino acid alignment of 51 orthologues from 182 bacterial strains. It was used for the maximum likelihood and maximum parsimony analyses of Flavobacteriales. Bacterial species names and strains were used as the sequence names; host names of insect endosymbionts are shown in brackets. The file “16S.fasta” is the alignment of 233 bacterial 16S rRNA sequences. It contains 1455 positions and was used for the maximum likelihood analysis of flavobacterial insect endosymbionts. The names of endosymbiont strains were replaced by the names of their hosts. In addition to the species names, National Center for Biotechnology Information (NCBI) accession numbers are included in the sequence names (e.g., sequence “Cicadellidae_Deltocephalinae_Macrostelini_Macrosteles_striifrons_AB795320” is the 16S rRNA of Macrosteles striifrons (Cicadellidae: Deltocephalinae: Macrostelini) with NCBI accession number AB795320). The file “Sulcia_pep.fasta” is the concatenated amino acid alignment of 131 orthologues of “Candidatus Sulcia muelleri” (Sulcia). It contains 41970 positions and represents 101 Sulcia strains and 3 Blattabacterium strains. This file was used for the maximum likelihood analysis of Sulcia. The file “Sulcia_nucleotide.fasta” is the concatenated nucleotide alignment corresponding to the sequences in “Sulcia_pep.fasta”, with the alignment of 16S rRNA added. It has 127339 positions and was used for the maximum likelihood and maximum parsimony analyses of Sulcia. Individual gene alignments (16S rRNA and 131 orthologues of Sulcia and Blattabacterium) are deposited in the compressed file “individual_gene_alignments.zip”; these were used to construct gene trees for the multispecies coalescent analysis. The names of Sulcia strains were replaced by the names of their hosts in “Sulcia_pep.fasta”, “Sulcia_nucleotide.fasta” and the files in “individual_gene_alignments.zip”.
In all the alignment files, gaps are indicated by “-”.
keywords: endosymbiont, “Candidatus Sulcia muelleri”, Auchenorrhyncha, coevolution
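The naming scheme described for “16S.fasta” can be unpacked programmatically. The following is a minimal sketch, not part of the dataset: it assumes that host-derived sequence names are underscore-separated, that the last token is the NCBI accession number, and that the two tokens before it are genus and species. Names that do not follow this pattern (for example, accession numbers that themselves contain an underscore) would need separate handling. The function names are hypothetical.

```python
from typing import Dict, Iterator, List, TextIO, Tuple, Union


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from a FASTA-formatted text handle."""
    name = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_host_name(name: str) -> Dict[str, Union[str, List[str]]]:
    """Split a host-derived sequence name into its parts.

    Assumption: tokens are '<higher taxa...>_<genus>_<species>_<accession>'.
    """
    parts = name.split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected sequence name: {name!r}")
    return {
        "higher_taxa": parts[:-3],
        "genus": parts[-3],
        "species": parts[-2],
        "accession": parts[-1],
    }


def gap_fraction(sequence: str) -> float:
    """Fraction of alignment positions that are gaps ('-')."""
    return sequence.count("-") / len(sequence) if sequence else 0.0
```

Applied to the example name given above, `split_host_name` separates the three higher taxa from the genus, the species, and the accession number AB795320.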
published: 2016-05-16

This dataset contains the protein sequences and trees used to compare Non-Ribosomal Peptide Synthetase (NRPS) condensation domains in the AMB gene cluster and was used to create figure S1 in Rojas et al. 2015. Instead of having to collect representative sequences independently, this set of condensation domain sequences may serve as a quick reference set for coarse classification of condensation domains.
keywords: NRPS; biosynthetic gene cluster; antimetabolite; Pseudomonas; oxyvinylglycine; secondary metabolite; thiotemplate; toxin
published: 2021-06-24

This dataset consists of the secondary ion mass spectrometry (SIMS) depth profiling data that were collected with a Cameca NanoSIMS 50 instrument from a 10 micron by 10 micron region on a Madin-Darby canine kidney (MDCK) cell that had been metabolically labeled so that most of its sphingolipids and cholesterol contained the rare nitrogen-15 and oxygen-18 isotopes, respectively.
keywords: secondary ion mass spectrometry; NanoSIMS; depth profiling; MDCK cell; sphingolipids; cholesterol
published: 2021-06-14

This repository contains the weights for two StyleGAN2 networks trained on two composite T1 and T2 weighted open-source brain MR image datasets, and one StyleGAN2 network trained on the Flickr Face HQ image dataset. Example images sampled from the respective StyleGANs are also included. The datasets themselves are not included in this repository. The weights are stored as .pkl files. The code and instructions to load and use the weights can be found at https://github.com/comp-imaging-sci/pic-recon . Additional details and citations can be found in the file "README.md".
keywords: StyleGAN2; Generative adversarial network (GAN); MRI; Medical imaging
published: 2021-06-16

Thank you for using these datasets. These RNAsim aligned fragmentary sequences were generated from the query sequences selected by Balaban et al. (2019) in their variable-size datasets (https://doi.org/10.5061/dryad.78nf7dq). They were created for use in phylogenetic placement with the multiple sequence alignments and backbone trees provided by Balaban et al. (2019). The file structure included here also corresponds to the data Balaban et al. (2020) provided. It includes: directories for five backbone tree sizes, named 5000, 10000, 50000, 100000, and 200000 (these directory names are also used by Balaban et al. (2019) and indicate the size of the backbone tree included in their data); and, within each size directory, subdirectories for each replicate, labelled 0 through 4. For the four smaller backbone tree sizes there are five replicates, and for the largest there is one replicate. Each replicate contains 200 text files, each with one aligned query sequence fragment in FASTA format.
keywords: Fragmentary Sequences; RNAsim
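The directory layout described above can be checked after download with a short script. This is only a sketch under stated assumptions: the root folder name is up to the user, the single replicate of the largest backbone size is assumed to be labelled 0, and every regular file inside a replicate directory is counted as a fragment file because the description does not give a file extension. The function names are hypothetical.

```python
from pathlib import Path
from typing import List

BACKBONE_SIZES = [5000, 10000, 50000, 100000, 200000]
FRAGMENTS_PER_REPLICATE = 200


def replicate_dirs(root: Path) -> List[Path]:
    """List the replicate directories implied by the description."""
    dirs = []
    for size in BACKBONE_SIZES:
        # Five replicates for the four smaller sizes, one for the largest
        # (assumed here to be labelled 0).
        n_replicates = 1 if size == BACKBONE_SIZES[-1] else 5
        for rep in range(n_replicates):
            dirs.append(root / str(size) / str(rep))
    return dirs


def check_layout(root: Path) -> List[str]:
    """Return human-readable problems; an empty list means the layout matches."""
    problems = []
    for directory in replicate_dirs(root):
        if not directory.is_dir():
            problems.append(f"missing directory: {directory}")
            continue
        n_files = sum(1 for p in directory.iterdir() if p.is_file())
        if n_files != FRAGMENTS_PER_REPLICATE:
            problems.append(
                f"{directory}: found {n_files} files, "
                f"expected {FRAGMENTS_PER_REPLICATE}"
            )
    return problems
```

Under these assumptions the layout has 21 replicate directories and 4200 fragment files in total.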
published: 2021-06-17

Model output dataset (6-hourly) from the Weather Research and Forecasting (WRF) model simulations over South America with the added capability of water vapor tracers to track the moisture that originates over the Amazon and the La Plata river basins. The simulations were performed for the period 2003-2013 at 20-km horizontal resolution, fully coupled with the Noah-MP land surface model. A limited number of the original output variables, sufficient for reproducing the analyses in papers that cite this dataset, is included here. The attached wrfout_southamerica_readme.txt contains detailed information about the file format and variables. For the complete model dataset, contact francina@illinois.edu.
keywords: WRF; Amazon; La Plata; South America; Numerical tracers
published: 2020-05-04

The Cline Center Historical Phoenix Event Data covers the period 1945-2019 and includes 8.2 million events extracted from 21.2 million news stories. This data was produced using the state-of-the-art PETRARCH-2 software to analyze content from the New York Times (1945-2018), the BBC Monitoring's Summary of World Broadcasts (1979-2019), the Wall Street Journal (1945-2005), and the Central Intelligence Agency’s Foreign Broadcast Information Service (1995-2004). It documents the agents, locations, and issues at stake in a wide variety of conflict, cooperation and communicative events in the Conflict and Mediation Event Observations (CAMEO) ontology. The Cline Center produced these data with the generous support of Linowes Fellow and Faculty Affiliate Prof. Dov Cohen and help from our academic and private sector collaborators in the Open Event Data Alliance (OEDA). For details on the CAMEO framework, see: Schrodt, Philip A., Omür Yilmaz, Deborah J. Gerner, and Dennis Hermreck. "The CAMEO (conflict and mediation event observations) actor coding framework." In 2008 Annual Meeting of the International Studies Association. 2008. http://eventdata.parusanalytics.com/papers.dir/APSA.2005.pdf Gerner, D.J., Schrodt, P.A. and Yilmaz, O., 2012. Conflict and mediation event observations (CAMEO) Codebook. http://eventdata.parusanalytics.com/cameo.dir/CAMEO.Ethnic.Groups.zip For more information about PETRARCH and OEDA, see: http://openeventdata.org/
keywords: OEDA; Open Event Data Alliance (OEDA); Cline Center; Cline Center for Advanced Social Research; civil unrest; petrarch; phoenix event data; violence; protest; political; conflict; political science
published: 2019-08-29

This is part of the Cline Center’s ongoing Social, Political and Economic Event Database (SPEED) project. Each observation represents an event involving civil unrest, repression, or political violence in Sierra Leone, Liberia, and the Philippines (1979-2009). These data were produced in an effort to describe the relationship between exploitation of natural resources and civil conflict, and to identify policy interventions that might address resource-related grievances and mitigate civil strife. This work is the result of a collaboration between the US Army Corps of Engineers’ Construction Engineering Research Laboratory (ERDC-CERL), the Swedish Defence Research Agency (FOI) and the Cline Center for Advanced Social Research (CCASR). The project team selected case studies focused on nations with a long history of civil conflict, as well as lucrative natural resources. The Cline Center extracted these events from country-specific articles published in English by the British Broadcasting Corporation (BBC) Summary of World Broadcasts (SWB) from 1979 to 2008 and the CIA’s Foreign Broadcast Information Service (FBIS) from 1999 to 2004. Articles were selected if they mentioned a country of interest and were tagged as relevant by a Cline Center-built machine learning-based classification algorithm. Trained analysts extracted nearly 10,000 events from nearly 5,000 documents. The codebook, available in PDF form below, describes the data and production process in greater detail.
keywords: Cline Center for Advanced Social Research; civil unrest; Social Political Economic Event Dataset (SPEED); political; event data; war; conflict; protest; violence; social; SPEED; Cline Center; Political Science
published: 2021-02-23

Coups d'état are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup D'état Project (CDP) as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e. realized or successful coups, unrealized coup attempts, or thwarted conspiracies), the type of actor(s) who initiated the coup (i.e. military, rebels, etc.), as well as the fate of the deposed leader. This is version 2.0.0 of this dataset. The first version, <a href="https://clinecenter.illinois.edu/project/research-themes/democracy-and-development/coup-detat-project-cdp">v.1.0.0</a>, was released in 2013. Since then, the Cline Center has taken several steps to improve on the previously released data.
These changes include: <ol> <li>Filling in missing event data values</li> <li>Removing events with no identifiable dates</li> <li>Reconciling event dates from sources that have conflicting information</li> <li>Removing events with insufficient sourcing (each event now has at least two sources)</li> <li>Removing events that were inaccurately coded and did not meet our definition of a coup event</li> <li>Extending the time period covered from 1945-2005 to 1945-2019</li> <li>Removing certain variables that fell below the threshold of inter-coder reliability required by the project</li> <li>Removing the spreadsheet ‘CoupInventory.xls’ because of inadequate attribution and citation in the event summaries</li></ol> <b>Items in this Dataset</b> 1. <i>CDP v.2.0.2 Codebook.pdf</i> <ul><li>This 14-page document provides a description of the Cline Center Coup D’état Project Dataset. The first section of this codebook provides a succinct definition of a coup d’état used by the CDP and an overview of the categories used to differentiate the wide array of events that meet the CDP definition. It also defines coup outcomes. The second section describes the methodology used to produce the data. <i>Created November 2020. Revised February 2021 to add some additional information about how the Cline Center edited some values in the COW country codes.</i> </li></ul> 2. <i>Coup_Data_v2.0.0.csv</i> <ul><li>This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup D’état Project. It contains 29 variables and 943 observations. <i>Created November 2020</i></li></ul> 3. <i>Source Document v2.0.0.pdf</i> <ul><li>This 305-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify each particular event. <i>Created November 2020</i> </li></ul> 4. 
<i>README.md</i> <ul><li>This file contains useful information for the user about the dataset. It is a text file written in Markdown. <i>Created November 2020</i> </li></ul> <br> <b> Citation Guidelines</b> 1) To cite this codebook please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, and Jonathan Bonaguro. 2021. “Cline Center Coup D’état Project Dataset Codebook”. Cline Center Coup D’état Project Dataset. Cline Center for Advanced Social Research. V.2.0.2. February 23. University of Illinois Urbana-Champaign. doi: <a href="https://doi.org/10.13012/B2IDB-9651987_V2">10.13012/B2IDB-9651987_V3</a> 2) To cite the data please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, and Jonathan Bonaguro. 2020. Cline Center Coup D’état Project Dataset. Cline Center for Advanced Social Research. V.2.0.0. November 16. University of Illinois Urbana-Champaign. doi: <a href="https://doi.org/10.13012/B2IDB-9651987_V2">10.13012/B2IDB-9651987_V3</a>
keywords: Coup d'état; event data; Cline Center; Cline Center for Advanced Social Research; political science
published: 2020-12-16

Terrorism is among the most pressing challenges to democratic governance around the world. The Responsible Terrorism Coverage (or ResTeCo) project aims to address a fundamental dilemma facing 21st century societies: how to give citizens the information they need without giving terrorists the kind of attention they want. The ResTeCo project hopes to inform best practices by using extreme-scale text analytic methods to extract information from more than 70 years of terrorism-related media coverage from around the world and across 5 languages. Our goal is to expand the available data on media responses to terrorism and enable the development of empirically-validated models for socially responsible, effective news organizations. This particular dataset contains information extracted from terrorism-related stories in the Foreign Broadcast Information Service (FBIS) published between 1995 and 2013. It includes variables that measure the relative share of terrorism-related topics, the valence and intensity of emotional language, as well as the people, places, and organizations mentioned. This dataset contains 3 files: 1. "ResTeCo Project FBIS Dataset Variable Descriptions.pdf" A detailed codebook containing a summary of the Responsible Terrorism Coverage (ResTeCo) Project Foreign Broadcast Information Service (FBIS) Dataset and descriptions of all variables. 2. "resteco-fbis.csv" This file contains the data extracted from terrorism-related media coverage in the Foreign Broadcast Information Service (FBIS) between 1995 and 2013. It includes variables that measure the relative share of topics, sentiment, and emotion present in this coverage. There are also variables that contain metadata and list the people, places, and organizations mentioned in these articles. There are 53 variables and 750,971 observations. The variable "id" uniquely identifies each observation. Each observation represents a single news article. 
Please note that care should be taken when using "resteco-fbis.csv". The file may not be suitable for use in a spreadsheet program like Excel, as some of the values are quite large. Excel cannot handle some of these large values, which may cause the data to appear corrupted within the software. Users are encouraged to work with the data in a statistical package or language such as Stata, R, or Python to ensure that the structure and quality of the data are preserved. 3. "README.md" This file contains useful information for the user about the dataset. It is a text file written in Markdown. Citation Guidelines 1) To cite this codebook please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project Foreign Broadcast Information Service (FBIS) Dataset Variable Descriptions. Responsible Terrorism Coverage (ResTeCo) Project Foreign Broadcast Information Service (FBIS) Dataset. Cline Center for Advanced Social Research. December 16. University of Illinois Urbana-Champaign. doi: https://doi.org/10.13012/B2IDB-6360821_V1 2) To cite the data please use the following citation: Althaus, Scott, Joseph Bajjalieh, Marc Jungblut, Dan Shalmon, Subhankar Ghosh, and Pradnyesh Joshi. 2020. Responsible Terrorism Coverage (ResTeCo) Project Foreign Broadcast Information Service (FBIS) Dataset. Cline Center for Advanced Social Research. December 16. University of Illinois Urbana-Champaign. doi: https://doi.org/10.13012/B2IDB-6360821_V1
keywords: Terrorism, Text Analytics, News Coverage, Topic Modeling, Sentiment Analysis
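As an illustration of the advice to avoid spreadsheet software for "resteco-fbis.csv", the sketch below streams a CSV with Python's standard `csv` module and keeps every value as text, so nothing is silently rounded or reformatted. It also checks that the "id" column is unique, as the description states. The function name is hypothetical, and the raised field-size limit is a precaution based on the assumption that some cells may be very long; the description only says that some values are "quite large".

```python
import csv
from typing import TextIO

# Precaution (assumption): allow very long cells, e.g. long entity lists.
csv.field_size_limit(10**9)


def count_rows_with_unique_ids(handle: TextIO, id_column: str = "id") -> int:
    """Stream a CSV, keep all values as strings, and verify id uniqueness.

    Returns the number of data rows; raises ValueError on a duplicate id.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or id_column not in reader.fieldnames:
        raise KeyError(f"column {id_column!r} not found in header")
    seen = set()
    for row in reader:
        value = row[id_column]  # stays a string; no numeric coercion
        if value in seen:
            raise ValueError(f"duplicate {id_column}: {value}")
        seen.add(value)
    return len(seen)
```

A typical call would open the file with `open("resteco-fbis.csv", newline="", encoding="utf-8")` and pass the handle to the function; the UTF-8 encoding is an assumption that should be checked against the README. If the file matches the description, the returned count would be 750,971.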
published: 2020-12-16

The Cline Center Global News Index is a searchable database of textual features extracted from millions of news stories, specifically designed to provide comprehensive coverage of events around the world. In addition to searching documents for keywords, users can query metadata and features such as named entities extracted using Natural Language Processing (NLP) methods and variables that measure sentiment and emotional valence. Archer is a web application purpose-built by the Cline Center to enable researchers to access data from the Global News Index. Archer provides a user-friendly interface for querying the Global News Index (with the back-end indexing still handled by Solr). By default, queries are built using icons and drop-down menus. More technically-savvy users can use Lucene/Solr query syntax via a ‘raw query’ option. Archer allows users to save and iterate on their queries, and to visualize faceted query results, which can be helpful for users as they refine their queries. <b>Additional Resources:</b> - Access to Archer and the Global News Index is limited to account-holders. If you are interested in signing up for an account, you can fill out the <a href="https://docs.google.com/forms/d/1Vx_PpkIV4U1mt2FrPlfmdTC19VSidWQ8OC3D0lLNnvs/edit"><b>Archer User Information Form</b></a>. - Current users who would like to provide feedback, such as reporting a bug or requesting a feature, can fill out the <a href="https://forms.gle/6eA2yJUGFMtj5swY7"><b>Archer User Feedback Form</b></a>. - The Cline Center sends out periodic email newsletters to the Archer Users Group. Please fill out this form to <a href="https://groups.webservices.illinois.edu/subscribe/123172"><b>subscribe to Archer Users Group</b></a>. <b>Citation Guidelines:</b> 1) To cite the GNI codebook (or any other documentation associated with the Global News Index and Archer) please use the following citation: Cline Center for Advanced Social Research. 2020. 
Global News Index and Extracted Features Repository [codebook], v1.0.1. Champaign, IL: University of Illinois. Dec. 16. doi:10.13012/B2IDB-5649852_V2 2) To cite data from the Global News Index (accessed via Archer or otherwise) please use the following citation (filling in the correct date of access): Cline Center for Advanced Social Research. 2020. Global News Index and Extracted Features Repository [database], v1.0.1. Champaign, IL: University of Illinois. Dec. 16. Accessed Month, DD, YYYY. doi:10.13012/B2IDB-5649852_V2
keywords: Cline Center; Cline Center for Advanced Social Research; political; social; political science; Global News Index; Archer; news; mass communication; journalism;
published: 2021-06-14

Chronic contact exposure to realistic soil concentrations (0, 7.5, 15, and 100 ppb) of the neonicotinoid pesticide imidacloprid had species- and sex-specific effects on adult bee movement characteristics, but not on adult female bee brain development. This dataset contains two data files. The first contains information about adult bee movement characteristics for female Osmia lignaria and female and male Megachile rotundata over a 10-minute trial (total distance traveled and average movement speed). The second contains information about female Osmia lignaria and Megachile rotundata adult brain morphology. Detected effects included: female Osmia lignaria adults moved faster as they aged in the 0 and 7.5 ppb, but not in the 15 or 100 ppb, groups; young male Megachile rotundata adults moved more quickly (7.5 and 100 ppb) and farther (100 ppb) when treated with imidacloprid compared to the control group (0 ppb); and, while there was no impact of imidacloprid on adult female neuropil:Kenyon cell volume (N:K), N:K decreased with Osmia lignaria adult age and increased with Megachile rotundata adult age.
keywords: neonicotinoid; imidacloprid; bee; movement
published: 2021-02-28

This dataset contains the RegCM4 simulations used in the article "Implementation of dynamic ageing of carbonaceous aerosols in regional climate model RegCM". This dataset was used to investigate the impact of a new ageing parameterisation scheme implemented in the regional climate model RegCM4. The dataset contains two sets of simulations: Expt_fix and Expt_dyn. It consists of the seasonal mean and daily mean values of the variables that were used to create the visualizations of this study. The Expt_fix and Expt_dyn datasets contain 34 and 38 NetCDF files, respectively. The CERES_vs_2expts_new.mat file is the comparison between the CERES shortwave downward flux at the surface and the same model outputs from the two experiments for clear-sky and all-sky conditions. -------------------------------------------------- The following information about the dataset was generated on 2021-01-08 by SUDIPTA GHOSH <b>GENERAL INFORMATION</b> <i>1. Date of data collection (single date, range, approximate date):</i> 2019-01-01 to 2019-12-31 <i>2. Geographic location of data collection:</i> Urbana-Champaign, Illinois, USA <i>3. Information about funding sources that supported the collection of the data:</i> This work is supported by the MoEFCC under the NCAP-COALESCE project [Grant No. 14/10/2014-CC]. The first author acknowledges DST-INSPIRE fellowship [IF150055] and Fulbright-Kalam Climate Doctoral fellowship. N. R. acknowledges funding from NSF AGS-1254428 and DOE grant DE-SC0019192. Department of Science and Technology, Funds for Improvement of Science and Technology infrastructure in universities and higher educational institutions (DST-FIST) grant (SR/FST/ESII-016/2014) are acknowledged for the computing support. <b>DATA & FILE OVERVIEW</b> <i>1. File List:</i> Expt_fix and Expt_dyn datasets contain the analysed seasonal means and daily means of the variables that have been used to create the visualizations of this study. 
The Expt_fix and Expt_dyn datasets contain 34 and 38 NetCDF files, respectively. <i>2. Relationship between files, if important:</i> NA <i>3. Additional related data collected that was not included in the current data package:</i> No <b>METHODOLOGICAL INFORMATION</b> <i>1. Description of methods used for collection/generation of data: </i> The RegCM4 model code is freely available online from <a href="http://gforge.ictp.it/gf/project/regcm/">http://gforge.ictp.it/gf/project/regcm/</a>. The anthropogenic aerosol emissions considered for the simulations are taken from the IIASA inventory. The data used can be accessed online at <a href="http://clima-dods.ictp.it/regcm4/">http://clima-dods.ictp.it/regcm4/</a>. TRMM observed precipitation data can be accessed from <a href="https://giovanni.gsfc.nasa.gov/giovanni/">https://giovanni.gsfc.nasa.gov/giovanni/</a>. CRU temperature data are available at <a href="https://crudata.uea.ac.uk/cru/data/hrg/">https://crudata.uea.ac.uk/cru/data/hrg/</a>. CERES satellite surface shortwave downward fluxes are available at <a href="https://ceres.larc.nasa.gov/data/">https://ceres.larc.nasa.gov/data/</a>. Input files for the RegCM4 model are archived at <a href="http://clima-dods.ictp.it/regcm4/">http://clima-dods.ictp.it/regcm4/</a>. This dataset contains the RegCM4 simulations used in the article "Implementation of dynamic ageing of carbonaceous aerosols in regional climate model RegCM". The output data consist of two sets of simulations, Expt_fix and Expt_dyn. This dataset only contains the analysed seasonal mean and daily mean of the variables that have been used to create the visualizations of this study. Expt_fix and Expt_dyn contain 34 and 38 NetCDF files, respectively. This dataset was used to investigate the impact of a new ageing parameterisation scheme implemented in the regional climate model RegCM4. <i>2. 
Methods for processing the data:</i> Seasonal mean and daily average values were extracted from 6-hourly model output. <i>3. Instrument- or software-specific information needed to interpret the data:</i> CDO-1.7.1, Grads-2.0.a9, Matlab2016b <i>4. Standards and calibration information, if appropriate:</i> NA <i>5. Environmental/experimental conditions:</i> NA <i>6. Describe any quality-assurance procedures performed on the data:</i> NA <i>7. People involved with sample collection, processing, analysis and/or submission:</i> Sudipta Ghosh, Nicole Riemer, Graziano Giuliani, Filippo Giorgi, Dilip Ganguly, Sagnik Dey <b>DATA-SPECIFIC INFORMATION FOR: Expt_fix_data.tar.gz</b> <i>1. Number of variables:</i> 29 <i>2. Number of cases/rows:</i> NA <i>3. Variable List:</i> Mass concentration (kg m-3) of BC, BC_HB, BC_HL, OC, OC_HB, OC_HL; Columnar burden (mg m-2) of BC, BC_HL, BC_HB, OC; Dry deposition flux (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to washout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to rainout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; AOD (unitless); precipitation (kg m-2 s-1); temperature (K); v-wind (m s-1); u-wind (m s-1); Surface shortwave downward flux (W m-2); Shortwave radiative forcing at the surface and top of atmosphere (W m-2) <b>DATA-SPECIFIC INFORMATION FOR: Expt_dyn_data.tar.gz</b> <i>1. Number of variables:</i> 30 <i>2. Number of cases/rows:</i> NA <i>3. 
Variable List:</i> Mass concentration (kg m-3) of BC, BC_HB, BC_HL, OC, OC_HB, OC_HL; Columnar burden (mg m-2) of BC, BC_HL, BC_HB, OC; Dry deposition flux (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to washout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; Wet deposition flux due to rainout (mg m-2 day-1) of BC_HB, BC_HL, OC_HB, OC_HL; AOD (unitless); precipitation (kg m-2 s-1); temperature (K); v-wind (m s-1); u-wind (m s-1); Surface shortwave downward flux (W m-2); Shortwave radiative forcing at the surface and top of atmosphere (W m-2); ageingscale (s-1) <b>DATA-SPECIFIC INFORMATION FOR: CERES_vs_2expts_new.mat</b> <i>1. Number of variables:</i> 12 <i>2. Number of cases/rows:</i> NA <i>3. Variable List:</i> Surface shortwave downward flux for clear-sky conditions (W m-2) for CERES, Expt_fix, Expt_dyn (for winter JF and monsoon JJAS seasons); Surface shortwave downward flux for all-sky conditions (W m-2) for CERES, Expt_fix, Expt_dyn (for winter JF and monsoon JJAS seasons). <b>NOTE:</b> The following information applies to all three (3) files: <i> Missing data codes:</i> NA <i>Specialized formats or other abbreviations used:</i> NA
keywords: Carbonaceous aerosols; ageing parameterisation scheme; regional climate model; NetCDF
published: 2021-06-08

Dataset associated with Jones and Ward JAE-2020-0031.R1 submission: Pre-to post-fledging carryover effects and the adaptive significance of variation in wing development for juvenile songbirds. Excel CSV files with data used in analyses and file with descriptions of each column. The flight ability variable in this dataset was derived from fledgling drop tests, examples of which can be found in the related dataset: Jones, Todd M.; Benson, Thomas J.; Ward, Michael P. (2019): Flight Ability of Juvenile Songbirds at Fledgling: Examples of Fledgling Drop Tests. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2044905_V1.
keywords: fledgling; wing development; life history; adaptive significance; post-fledging; songbirds
published: 2021-05-21

Data sets from "Inferring Species Trees from Gene-Family with Duplication and Loss using Multi-Copy Gene-Family Tree Decomposition." It contains trees and sequences simulated with gene duplication and loss under a variety of different conditions. <b>Note:</b> - trees.tar.gz contains the simulated gene-family trees used in our experiments (both true trees from SimPhy and trees estimated from alignments). - sequences.tar.gz contains the simulated sequence data used for estimating the gene-family trees and for the concatenation analysis. - biological.tar.gz contains the gene trees used as inputs for the experiments we ran on empirical data sets, as well as the species trees output by the methods we tested on those data sets. - stats.txt lists statistics (such as AD, MGTE, and average size) for our simulated model conditions.
keywords: gene duplication and loss; species-tree inference; simulated data;
published: 2021-05-26

Steady-state and dynamic gas exchange data for maize (B73), sugarcane (CP88-1762) and sorghum (Tx430)
keywords: C4 plants; gas exchange
published: 2021-05-14

This is the complete dataset for the "Anomalous density fluctuations in a strange metal" Proceedings of the National Academy of Sciences publication (https://doi.org/10.1073/pnas.1721495115). This is an integration of the Zenodo dataset which includes raw M-EELS data. <b>METHODOLOGICAL INFORMATION</b> 1. Description of methods used for collection/generation of data: Data were collected with an M-EELS instrument according to the data acquisition protocol described in the original PNAS publication and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026). 2. Methods for processing the data: Raw data were collected with a channeltron-based M-EELS apparatus described in the reference PNAS publication and analyzed according to the procedure outlined both in the PNAS paper and in SciPost Phys. 3, 026 (2017) (doi: 10.21468/SciPostPhys.3.4.026). The raw M-EELS spectra at each momentum have been subject to minor data processing involving: (a) averaging of different acquisitions at the same conditions, (b) energy binning, (c) division by an effective Coulomb matrix element (which yields a structure factor S(q,\omega)), (d) antisymmetrization (which yields the imaginary chi). All these procedures are described in the PNAS paper. 3. Instrument- or software-specific information needed to interpret the data: These data are simple .txt or .dat files which can be read with any standard data analysis software, notably Python notebooks, MatLab, Origin, IgorPro, and others. We do not include scripts in order to provide maximum flexibility. 4. Relationship between files, if important: We divided raw data, structure factors and imaginary chi into different folders. <b>DATA-SPECIFIC INFORMATION</b> There are 8 folders within the Data_public_deposition_v1.zip. Each folder contains the data needed to create the corresponding figure in the publication. <b>1. 
Fig1:</b> This folder contains 21 DAT files needed to plot the theory data in panels C and D, following this naming convention: [chiA]or[chiB]or[Pi]_q_number.dat, where chiA is the imaginary RPA charge susceptibility with a Coulomb interaction of electronically weakly coupled layers; chiB is the imaginary RPA charge susceptibility with the usual 4\pi e^2/q^2 Coulomb interaction; Pi is the imaginary Lindhard polarizability; q is momentum in reciprocal lattice units; number is the numerical momentum value in reciprocal lattice units. <b>2. Fig2:</b> Files needed to plot Fig. 2 of the PNAS paper. Contains 3 folders as listed below. The files in this folder are named following this convention: Bi2212_295K_(1,-1)_50eV_161107_q_number_2.16_avg.dat, where 295K is the sample temperature; (1,-1) is the momentum direction in reciprocal lattice units; 50eV is the incident electron beam energy; 161107 is the start date of the experiment in yymmdd format; q is the momentum; number is the momentum in reciprocal lattice units; 2.16 is the energy range covered by the data in eV; avg identifies averaged data. ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra. <b>3. Fig3:</b> Files needed to plot Fig. 3 of the PNAS paper. The OP/OD prefix identifies optimally doped or overdoped sample data, respectively. ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra. <b>4. Fig4:</b> Files needed to plot Fig. 4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted at all momenta according to the fit procedure described in the manuscript. 
ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra. <b>5. FigS1:</b> Files needed to plot Fig. S1 of the PNAS paper. There are 5 files in this folder. DAT files are M-EELS data following the prior naming convention, while the two .txt files are digitized data from N. Nücker, U. Eckern, J. Fink, and P. Müller, Long-Wavelength Collective Excitations of Charge Carriers in High-Tc Superconductors, Phys. Rev. B 44, 7155(R) (1991), and K. H. G. Schulte, The interplay of Spectroscopy and Correlated Materials, Ph.D. thesis, University of Groningen (2002). <b>6. FigS2:</b> Files needed to plot Fig. S2 of the PNAS paper. ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra. <b>7. FigS3:</b> Files needed to plot Fig. S3 of the PNAS paper. There are 2 files in this folder: 20K_phi_0_q_0.dat is an M-EELS raw intensity at zero momentum transfer on Bi2212 at 20 K; 295K_phi_0_q_0.dat is an M-EELS raw intensity at zero momentum transfer on Bi2212 at 295 K. <b>8. FigS4:</b> Files needed to plot Fig. S4 of the PNAS paper. The _fit_parameters.dat file contains the fit parameters extracted at all momenta according to the fit procedure described in the manuscript. ImChi: the imaginary susceptibility obtained by antisymmetrizing the structure factor. Raw_avg_data: raw averaged M-EELS spectra. Sqw: structure factors derived from the M-EELS spectra.
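Processing step (d), the antisymmetrization that turns a structure factor into the imaginary susceptibility, can be sketched in Python as follows. This is a minimal illustration and not the authors' script: the function name `antisymmetrize`, the linear interpolation onto the mirrored energy grid, and the default prefactor of -π (one common fluctuation-dissipation convention) are all assumptions, and the PNAS paper should be consulted for the exact normalization. The spectrum used here is synthetic, not taken from the deposit.

```python
import numpy as np


def antisymmetrize(energy, s_qw, prefactor=-np.pi):
    """Return prefactor * [S(q, w) - S(q, -w)] on the input energy grid.

    The prefactor is an assumed convention; change it to match the paper.
    Points whose mirror energy -w lies outside the measured range become NaN.
    """
    energy = np.asarray(energy, dtype=float)
    s_qw = np.asarray(s_qw, dtype=float)
    order = np.argsort(energy)  # np.interp needs increasing x values
    s_mirror = np.interp(-energy, energy[order], s_qw[order],
                         left=np.nan, right=np.nan)
    return prefactor * (s_qw - s_mirror)


# Synthetic example (not data from the deposit): an even background plus an
# odd part chosen so that the expected result is known in advance.
energy = np.linspace(-2.0, 2.0, 401)        # energy loss in eV
chi_expected = energy / (1.0 + energy**2)   # odd in energy
s_synthetic = np.exp(-energy**2) - chi_expected / (2.0 * np.pi)
chi_recovered = antisymmetrize(energy, s_synthetic)
```

With a real file, the two arrays would come from something like `np.loadtxt`, assuming the file holds an energy column and an intensity column; that layout is a guess and should be checked against the files themselves.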
keywords: Momentum resolved electron energy loss spectroscopy (M-EELS); cuprates; plasmons; strange metal
published: 2021-05-14

- The aim of this research was to evaluate a novel dietary fiber source, miscanthus grass, in comparison with traditional fiber sources, and their effects on the microbiota of healthy adult cats. Four dietary treatments, cellulose (CO), miscanthus grass fiber (MF), a blend of miscanthus fiber and tomato pomace (MF+TP), or beet pulp (BP), were evaluated.<br /><br />- The study was conducted using a completely randomized design with twenty-eight neutered adult domestic shorthair cats (19 females and 9 males; mean age 2.2 ± 0.03 yr; mean body weight 4.6 ± 0.7 kg; mean body condition score 5.6 ± 0.6). Total DNA from fresh fecal samples was extracted using Mo-Bio PowerSoil kits (MO BIO Laboratories, Inc., Carlsbad, CA). Amplification of the 292-bp fragment of the V4 region of the 16S rRNA gene was completed using a Fluidigm Access Array (Fluidigm Corporation, South San Francisco, CA). Paired-end Illumina sequencing was performed on a MiSeq using v3 reagents (Illumina Inc., San Diego, CA) at the Roy J. Carver Biotechnology Center at the University of Illinois. <br />- Filenames are composed of the animal name identifier and the diet code (BP = beet pulp; CO = cellulose; MF = miscanthus grass fiber; TP = blend of miscanthus fiber and tomato pomace).
keywords: cats; dietary fiber; fecal microbiota; miscanthus grass; nutrient digestibility; postbiotics
published: 2021-05-13

Data files and R code to replicate the econometric analysis in the journal article: B Chen, BM Gramig and SD Yun. “Conservation Tillage Mitigates Drought Induced Soybean Yield Losses in the US Corn Belt.” Q Open. https://doi.org/10.1093/qopen/qoab007
keywords: R, Conservation Tillage, Drought, Yield, Corn, Soybeans, Resilience, Climate Change
published: 2020-10-28

We examined the role of stream flow on environmental DNA (eDNA) concentrations and detectability of an invasive clam (Corbicula fluminea), while also accounting for other abiotic and biotic variables. These data include the eDNA concentrations, quadrat estimates of clam density, and abiotic variables.
keywords: Corbicula; detection probability; eDNA; invasive species; lotic; occupancy modeling
published: 2021-05-07

Prepared by Vetle Torvik 2021-05-07. The dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters). • How was the dataset created? The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018). Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only. However, MapAffil 2018 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g., 5833220). Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records. All affiliation strings were processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in: Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p • Look for Fig. 4 in the following article for coverage statistics over time: Palmblad, M., Torvik, V.I. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Trop Med Health 45, 33 (2017). <a href="https://doi.org/10.1186/s41182-017-0073-6">https://doi.org/10.1186/s41182-017-0073-6</a> Expect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014. • The code and back-end data are periodically updated and made available for query by PMID at http://abel.ischool.illinois.edu/cgi-bin/mapaffil/search.py • What is the format of the dataset? The dataset contains 52,931,957 rows (plus a header row). Each row (line) in the file has a unique PMID and author order, and contains the following eighteen columns, tab-delimited. 
All columns are ASCII, except city, which contains Latin-1. 1. PMID: positive non-zero integer; int(10) unsigned 2. au_order: positive non-zero integer; smallint(4) 3. lastname: varchar(80) 4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed 5. initial_2: middle name initial 6. orcid: from the 2019 ORCID Public Data File https://orcid.org/ and from PubMed XML 7. year: year of the publication 8. journal: name of the journal in which the publication is published 9. affiliation: author's affiliation 10. disciplines: extracted from departments, divisions, schools, laboratories, centers, etc. that occur in at least 100 unique affiliations across the dataset, some with standardization (e.g., 1770799), English translations (e.g., 2314876), or spelling corrections (e.g., 1291843) 11. grid: inferred using a high-recall technique focused on educational institutions (but, for experimental purposes, includes a few select hospitals, national institutes/centers, international companies, governmental agencies, and 200+ other IDs [RINGGOLD, Wikidata, ISNI, VIAF, http] for institutions not in GRID). Based on the 2019 GRID version https://www.grid.ac/ 12. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK 13. city: varchar(200); typically 'city, state, country' but could include further subdivisions; unresolved ambiguities are concatenated by '|' 14. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA) 15. country 16. lat: at most 3 decimals (only available when city is not a country or state) 17. lon: at most 3 decimals (only available when city is not a country or state) 18. fips: varchar(5); for USA only, retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find
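As a hedged illustration of the format described above, the following Python sketch reads a tab-delimited, Latin-1 encoded file with these eighteen columns using only the standard library. The function name `read_mapaffil` is hypothetical, and several points are assumptions rather than documented facts: that the header row spells the column names exactly as listed, and that unavailable latitude/longitude values appear as empty fields. The sample row is invented for the demonstration and is not taken from the dataset.

```python
import csv
import io

# Column names as listed in the description; that the header row spells them
# exactly this way is an assumption.
COLUMNS = [
    "PMID", "au_order", "lastname", "firstname", "initial_2", "orcid",
    "year", "journal", "affiliation", "disciplines", "grid", "type",
    "city", "state", "country", "lat", "lon", "fips",
]


def read_mapaffil(handle):
    """Yield one dict per row from an open text handle (tab-delimited)."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        row["PMID"] = int(row["PMID"])
        row["au_order"] = int(row["au_order"])
        # Unresolved place-name ambiguities are concatenated by '|'.
        row["city_candidates"] = row["city"].split("|") if row["city"] else []
        # Assumed: lat/lon are empty fields when unavailable.
        for key in ("lat", "lon"):
            row[key] = float(row[key]) if row[key] else None
        yield row


# Invented sample row, encoded as Latin-1 to mimic the file on disk.
sample_row = [
    "1", "1", "Doe", "Jane", "Q", "", "2018", "Example Journal",
    "Dept. of Examples, Example University, Zurich, Switzerland",
    "Examples", "", "EDU", "Zürich, Switzerland", "", "Switzerland",
    "47.377", "8.540", "",
]
raw_bytes = ("\t".join(COLUMNS) + "\n" + "\t".join(sample_row) + "\n").encode("latin-1")
handle = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="latin-1", newline="")
rows = list(read_mapaffil(handle))
```

For the real file, the in-memory handle would be replaced by `open(path, encoding="latin-1", newline="")`, where `path` is wherever the downloaded file lives. With roughly 53 million rows, iterating lazily as above avoids loading everything into memory at once.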
keywords: PubMed, MEDLINE, Digital Libraries, Bibliographic Databases; Author Affiliations; Geographic Indexing; Place Name Ambiguity; Geoparsing; Geocoding; Toponym Extraction; Toponym Resolution; institution name disambiguation
published: 2021-05-12

These are the data sets associated with our publication "Field borders provide winter refuge for beneficial predators and parasitoids: a case study on organic farms." For this project, we compared the communities of overwintering arthropod natural enemies in organic cultivated fields and wildflower-strip field borders at five different sites in central Illinois. Abstract: Semi-natural field borders are frequently used in midwestern U.S. sustainable agriculture. These habitats are meant to help diversify otherwise monocultural landscapes and provision them with ecosystem services, including biological control. Predatory and parasitic arthropods (i.e., potential natural enemies) often flourish in these habitats and may move into crops to help control pests. However, the capacity of semi-natural field borders to provide overwintering refuge for these arthropods is poorly understood. In this study, we used soil emergence tents to characterize potential natural enemy communities (i.e., predacious beetles, wasps, spiders, and other arthropods) overwintering in cultivated organic crop fields and adjacent field borders. We found a greater abundance, species richness, and unique community composition of predatory and parasitic arthropods in field borders compared to arable crop fields, which were generally poorly suited as overwintering habitat. Furthermore, potential natural enemies tended to be positively associated with forb cover and negatively associated with grass cover, suggesting that grassy field borders with less forb cover are less well-suited as winter refugia. These results demonstrate that semi-natural habitats like field borders may act as a source for many natural enemies on a year-to-year basis and are important for conserving arthropod diversity in agricultural landscapes.
keywords: Natural enemy; wildflower strips; conservation biological control; semi-natural habitat; field border; organic farming
published: 2021-05-10

This dataset contains the emulated global multi-model urban daily temperature projections under the RCP 8.5 scenario. The dataset is derived from the study "Large model structural uncertainty in global projections of urban heat waves" (XXXX). Details about this dataset and the local urban climate emulator are described in the article. This dataset documents the global urban daily temperatures of 17 CMIP5 Earth system models for 2006-2015 and 2061-2070. This dataset may be useful for multiple communities working on urban climate change, heat waves, impacts, vulnerability, risks, and adaptation applications.
keywords: Urban heat waves; CMIP; urban warming; heat stress; urban climate change
published: 2021-05-10

UAV-based high-resolution multispectral time-series orthophotos used to understand the relationship between growth dynamics, imagery temporal resolution, and end-of-season biomass productivity of biomass sorghum as a bioenergy crop. The sensor used was a MicaSense RedEdge flown at 40 meters above ground level at the Energy Farm, UIUC, in 2019.
keywords: Unmanned aerial vehicles; High throughput phenotyping; Machine learning; Bioenergy crops
published: 2021-05-10

This dataset contains data used in the publication "Institutional Data Repository Development, a Moving Target" submitted to Code4Lib Journal. It is a tabular data file describing attributes of data files in datasets published in the Illinois Data Bank between 2016-04-01 and 2021-04-01.