Illinois Data Bank Dataset Search Results
Results
published:
2020-06-03
Zachwieja, Alexandra
(2020)
This dataset provides files for use in analysis of human land preference across Australasia, and in a localized analysis of land preference in Laos and Vietnam. All files can be imported into ArcGIS for visualization, and re-analyzed using the open source Maxent species distribution modeling program. CSV files contain known human presence sites for model validation. ASC files contain geographically coded environmental data for mean annual temperature and mean annual precipitation during the Last Glacial Maximum, as well as downward slope data. All ASC files are in the WGS 1984 Mercator map projection for visualization in ArcGIS and can be opened as text files in text editors supporting large file sizes.
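Because the ASC files can be opened as plain text, their grid dimensions can be checked before loading them into ArcGIS or Maxent. The sketch below assumes the files follow the common Esri ASCII grid layout (a few `key value` header lines such as `ncols` and `nrows`, followed by rows of cell values); that layout and the sample values are assumptions, not details taken from the dataset.

```python
import io


def read_asc_header(lines):
    """Collect the leading 'key value' header lines of an ASCII grid."""
    header = {}
    for line in lines:
        parts = line.split()
        # Header lines have exactly two tokens and start with a letter;
        # the first line that does not fit is treated as grid data.
        if len(parts) != 2 or not parts[0][0].isalpha():
            break
        header[parts[0].lower()] = float(parts[1])
    return header


# Tiny in-memory stand-in for a real .asc file (hypothetical values).
sample = io.StringIO(
    "ncols 4\nnrows 2\nxllcorner 100.0\nyllcorner 10.0\n"
    "cellsize 0.5\nNODATA_value -9999\n"
    "1 2 3 4\n5 6 -9999 8\n"
)
header = read_asc_header(sample)
```

For a real file, pass an open file handle instead of the in-memory sample.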
keywords:
human dispersal; ecological niche modeling; Australasia; Late Pleistocene; land preference
published:
2023-09-01
Chakraborty, Sulagna; Steckler, Teresa; Gronemeyer, Peg; Mateus-Pinilla, Nohra; Smith, Rebecca
(2023)
An online and paper knowledge, attitudes, and practices survey on ticks and tick-borne diseases (TBD) was distributed to farmers in Illinois during summer 2020 to spring 2022 (paper version titled Final Draft Farmer KAP_v.SoftCopy_Revised.docx). These are the raw data associated with that survey and the survey questions used (FarmerTickKAPdata.csv, data dictionary in Data Description.docx). We have added calculated values (columns 286 to end, code for calculation in FarmerKAPvariableCalculation.R), including: the tick knowledge score, TBD knowledge score, and total knowledge score, which are the total numbers of correct answers in each category, and score percent, which is the proportion of correct answers in each category.
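The score definitions above can be illustrated with a minimal sketch. The dataset's own calculation is in FarmerKAPvariableCalculation.R; the Python function, question names, and answers below are hypothetical and only show the arithmetic (count of correct answers, and that count divided by the number of questions).

```python
def knowledge_score(responses, answer_key):
    """Return (number correct, proportion correct) for one respondent."""
    correct = sum(
        1 for question, answer in answer_key.items()
        if responses.get(question) == answer
    )
    return correct, correct / len(answer_key)


# Hypothetical answer key and one respondent's answers.
answer_key = {"q1": "yes", "q2": "no", "q3": "yes"}
responses = {"q1": "yes", "q2": "yes", "q3": "yes"}
score, percent = knowledge_score(responses, answer_key)
```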
keywords:
ticks; survey; tick-borne disease; farmer
published:
2021-06-17
Dominguez, Francina; Yang, Zhao
(2021)
Model output dataset (6-hourly) from the Weather Research and Forecasting (WRF) model simulations over South America with the added capability of water vapor tracers to track the moisture that originates over the Amazon and the La Plata river basins. The simulations were performed for the period 2003-2013 at 20-km horizontal resolution fully coupled with the Noah-MP land surface model. A limited number of original output variables, sufficient for reproducing the analyses in papers that cite this dataset, is included here. The attached wrfout_southamerica_readme.txt contains detailed information about the file format and variables. For the complete model dataset, contact francina@illinois.edu.
keywords:
WRF; Amazon; La Plata; South America; Numerical tracers
published:
2019-07-08
Kehoe, Adam K.; Torvik, Vetle I.
(2019)
# Overview
These datasets were created in conjunction with the dissertation "Predicting Controlled Vocabulary Based on Text and Citations: Case Studies in Medical Subject Headings in MEDLINE and Patents," by Adam Kehoe.
The datasets consist of the following:
* twin_not_abstract_matched_complete.tsv: a tab-delimited file consisting of pairs of MEDLINE articles with identical titles, authors and years of publication. This file contains the PMIDs of the duplicate publications, as well as their medical subject headings (MeSH) and three measures of their indexing consistency.
* twin_abstract_matched_complete.tsv: the same as above, except that the MEDLINE articles also have matching abstracts.
* mesh_training_data.csv: a comma-separated file containing the training data for the model discussed in the dissertation.
* mesh_scores.tsv: a tab-delimited file containing a pairwise similarity score based on word embeddings, and MeSH hierarchy relationship.
## Duplicate MEDLINE Publications
Both the twin_not_abstract_matched_complete.tsv and twin_abstract_matched_complete.tsv have the same structure. They have the following columns:
1. pmid_one: the PubMed unique identifier of the first paper
2. pmid_two: the PubMed unique identifier of the second paper
3. mesh_one: A list of medical subject headings (MeSH) from the first paper, delimited by the "|" character
4. mesh_two: a list of medical subject headings from the second paper, delimited by the "|" character
5. hoopers_consistency: The calculation of Hooper's consistency between the MeSH of the first and second paper
6. nonhierarchicalfree: a word embedding based consistency score described in the dissertation
7. hierarchicalfree: a word embedding based consistency score additionally limited by the MeSH hierarchy, described in the dissertation.
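As an illustration of the hoopers_consistency column, the sketch below computes Hooper's consistency in its usual form: the number of terms shared by both indexers divided by the number of distinct terms used by either. It assumes the "|"-delimited strings are compared as exact heading strings; whether the dataset's own calculation treats qualifiers or major-topic markers differently is not stated here, and the example headings are made up.

```python
def hoopers_consistency(mesh_one, mesh_two, sep="|"):
    """Shared terms divided by all distinct terms across both lists."""
    terms_one = {t.strip() for t in mesh_one.split(sep) if t.strip()}
    terms_two = {t.strip() for t in mesh_two.split(sep) if t.strip()}
    union = terms_one | terms_two
    if not union:
        return 0.0  # no headings on either record
    return len(terms_one & terms_two) / len(union)


# Two shared headings out of four distinct headings.
value = hoopers_consistency("Humans|Female|Neoplasms", "Humans|Male|Neoplasms")
```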
## MeSH Training Data
The mesh_training_data.csv file contains the training data for the model discussed in the dissertation. It has the following columns:
1. pmid: the PubMed unique identifier of the paper
2. term: a candidate MeSH term
3. cit_count: the log of the frequency of the term in the citation candidate set
4. total_cit: the log of the total number of the paper's citations
5. citr_count: the log of the frequency of the term in the citations of the paper's citations
6. total_citofcit: the log of the total number of the citations of the paper's citations
7. absim_count: the log of the frequency of the term in the AbSim candidate set
8. total_absim_count: the log of the total number of AbSim records for the paper
9. absimr_count: the log of the frequency of the term in the citations of the AbSim records
10. total_absimr_count: the log of the total number of citations of the AbSim record
11. log_medline_frequency: the log of the frequency of the candidate term in MEDLINE.
12. relevance: a binary indicator (True/False) of whether the candidate term was assigned to the target paper
## Cosine Similarity
The mesh_scores.tsv file contains a pairwise list of all MeSH terms including their cosine similarity based on the word embedding described in the dissertation. Because the MeSH hierarchy is also used in many of the evaluation measures, the relationship of the term pair is also included. It has the following columns:
1. mesh_one: a string of the first MeSH heading.
2. mesh_two: a string of the second MeSH heading.
3. cosine_similarity: the cosine similarity between the terms
4. relationship_type: a string identifying the relationship type, consisting of none, parent/child, sibling, ancestor and direct (terms are identical, i.e. a direct hierarchy match).
The mesh_model.bin file contains a binary word2vec C format file containing the MeSH term embeddings. It was generated using version 3.7.2 of the Python gensim library (https://radimrehurek.com/gensim/).
For an example of how to load the model file, see https://radimrehurek.com/gensim/models/word2vec.html#usage-examples, specifically the directions for loading the "word2vec C format."
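A minimal sketch of loading the embeddings and comparing two vectors is shown below. It assumes gensim is installed and uses `KeyedVectors.load_word2vec_format` with `binary=True`, which reads the word2vec C binary format. How MeSH headings are spelled as vocabulary keys (for example, spaces versus underscores) is not specified here, so check the model's vocabulary before looking terms up. The cosine function is plain Python so it can be tried without the model file.

```python
import math


def load_mesh_vectors(path="mesh_model.bin"):
    """Load the word2vec C binary file as gensim KeyedVectors."""
    # Imported lazily so the cosine helper below works without gensim.
    from gensim.models import KeyedVectors
    return KeyedVectors.load_word2vec_format(path, binary=True)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Toy vectors standing in for two term embeddings.
same = cosine_similarity([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```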
keywords:
MEDLINE;MeSH;Medical Subject Headings;Indexing
published:
2023-07-10
Harmon-Threatt, Alexandra N.; Anderson, Nicholas L.
(2023)
Bee movement between habitat patches in a naturally fragmented ecosystem depended on species, patch, and matrix variables. Using a mark-recapture methodology in the naturally fragmented Ozark glade ecosystem, we assessed the importance of bee size, nesting biology, the distance between patches (e.g., isolation), and nesting and floral resources in habitat patches and the surrounding matrix on bee movement.
This dataset includes seven data files, three R code files, and a QGIS tool. Three of the data files include information collected at the study sites with regard to bees and matrix and patch characteristics. The other four data files are spatial files used to quantify the characteristics of the forest canopy between the study sites and the edge-to-edge distances between the study sites. R code in the R Markdown file recreates the analysis and data presentation for the associated publication. R script files contain processes for calculating some of the explanatory variables used in the analysis. The QGIS tool can be used as the first step to obtaining average values from a raster file where the cells are large relative to the areas of interest (AOI) that you would like to characterize. The second step is contained in one of the aforementioned R scripts.
Detected effects included: Larger bees were more likely to move between patches. Bee movement was less likely as the distance between patches increased. However, relatively short distances (~50 m) inhibited movement more than our a priori expectations. Bees were unlikely to move away from home patches with abundant and diverse floral and below-ground nesting resources. When home patches were less resource-rich, bee movement depended on the characteristics of the away patch or the matrix. In these cases, bees were more likely to move to away patches with greater below-ground nesting and floral resources. Matrix habitats with more available floral and below-ground nesting resources appear to impede movement to neighboring patches, potentially because they already provide supplemental resources for bees.
keywords:
habitat fragmentation; bees; movement; mark-recapture; nesting resources; floral resources; isolation
published:
2024-07-08
Chong, Jer Pin; Minnaert-Grote, Jamie; Zaya, David N.; Ashley, Mary V.; Coons, Janice; Ramp Neal, Jennifer M.; Molano-Flores, Brenda
(2024)
A population genetics study was conducted on three plant taxa in the genus Physaria that are found on the Kaibab Plateau (Arizona, USA). Physaria kingii subsp. kaibabensis is endemic to the Kaibab Plateau, and is of conservation concern because of its rarity, limited range, and potential threats to its long-term persistence. Additionally, the taxon is a candidate for federal protection under the Endangered Species Act. It was not clear how genetically isolated P. k. subsp. kaibabensis was from Physaria kingii subsp. latifolia, which is a widespread subspecies found throughout the southwestern USA, including on the Kaibab Plateau. Additionally, other authors have suggested that P. k. subsp. kaibabensis may hybridize with Physaria arizonica, a different species that is also widespread and found on and off the Kaibab Plateau. We conducted a population genetics study of all three groups to better determine the conservation status of P. k. subsp. kaibabensis. Genetic data are in the form of nuclear DNA microsatellites for 13 loci (all apparently diploid). Additionally, we have included location information for the collection sites. We collected tissue samples from on and off the Kaibab Plateau. The overall findings are shared in a manuscript being submitted for peer-review.
keywords:
Physaria kingii; Kaibab Plateau; endemism; conservation genetics; rare species biology
published:
2018-03-01
Chiavacci, Scott J.; Benson, Thomas J.; Ward, Michael P.
(2018)
Data were used to analyze patterns in predator-specific nest predation on shrubland birds in Illinois as related to landscape composition at multiple landscape scales. Data were used in a Journal of Applied Ecology research paper of the same name. Data were collected between 2011 and 2014 at sites in east-central and northeastern Illinois, USA as part of a Ph.D. research project on the relationship between avian nest predation and landscape characteristics, and how nest predation affects adult and nestling bird behavior.
keywords:
nest predation; avian ecology; land cover; landscape composition; landscape scale; nest camera; nest survival; predator-specific mortality; scale-dependence; scrubland; shrub-nesting bird
published:
2019-08-29
This is the published ortholog set derived from whole genome data used for the analysis of members of the B. tabaci complex of whiteflies. It includes the concatenated alignment and individual gene alignments used for analyses (Link to publication: https://www.mdpi.com/1424-2818/11/9/151).
published:
2024-10-12
Langeslay, Blake; Juarez, Gabriel
(2024)
Simulation data used to generate plots in the associated paper ("Strain rate controls alignment in growing bacterial monolayers").
published:
2025-10-08
Kim, Sang Yeol; Stessman, Dan J.; Wright, David A.; Spalding, Martin H.; Huber, Steven; Ort, Donald
(2025)
Rubisco activase (Rca) facilitates the release of sugar‐phosphate inhibitors from the active sites of Rubisco and thereby plays a central role in initiating and sustaining Rubisco activation. In Arabidopsis, alternative splicing of a single Rca gene results in two Rca isoforms, Rca‐α and Rca‐β. Redox modulation of Rca‐α regulates the function of Rca‐α and Rca‐β acting together to control Rubisco activation. Although Arabidopsis Rca‐α alone less effectively activates Rubisco in vitro, it is not known how CO2 assimilation and plant growth are impacted. Here, we show that two independent transgenic Arabidopsis lines expressing Rca‐α in the absence of Rca‐β (“Rca‐α only” lines) grew more slowly in various light conditions, especially under low light or fluctuating light intensity, and in a short day photoperiod compared to wildtype. Photosynthetic induction was slower in the Rca‐α only lines, and they maintained a lower rate of CO2 assimilation during both photoperiod types. Our findings suggest Rca oligomers composed of Rca‐α only are less effective in initiating and sustaining the activation of Rubisco than when Rca‐β is also present. Currently there are no examples of any plant species that naturally express Rca‐α only but numerous examples of species expressing Rca‐β only. That Rca‐α exists in most plant species, including many C3 and C4 food and bioenergy crops, implies its presence is adaptive under some circumstances.
keywords:
Feedstock Production;Biomass Analytics;Phenomics
published:
2025-10-24
Maitra, Shraddha; Singh, Vijay
(2025)
Sweet sorghum is typically cultivated for the food and fodder market. Recently, sweet sorghum varieties are being metabolically transitioned to enhance energy density by accumulating oil droplets in their vegetative tissues for bioenergy applications. Owing to the high biomass yield of sorghum, the transgenic lines can compete with oil-seed crops for biodiesel yield per unit area. In the initial phase of transgenic development, a high-throughput phenotyping method can bridge the gap between the production pipeline and analysis to improve the efficiency of the process. To meet the requirement, the present study extends the application of time-domain 1H-NMR spectroscopy for rapid quantification and characterization of the total in-situ lipids of sweet sorghum ‘ramada’ to lay the groundwork for analyzing the upcoming large quantity of transgenic samples. NMR technology has been successfully established for analyzing lipid contents of vegetative tissues of non-transgenic variety. The multiexponential analysis of spin-lattice (T1) relaxation spectra obtained from TD-NMR aided the investigation of the dynamics of the free and bound lipid fraction with plant development. The total lipid concentration of bagasse and leaves of non-transgenic sweet sorghum remained unchanged throughout the plant development. Leaves displayed a higher percentage of bound lipids as compared to bagasse. A significant variation in the lipid concentration of juice was observed at the different growth stages with a maximum lipid accumulation of 1.21 ± 0.04% w/w at the boot stage that decreased with further maturity of the plant.
keywords:
Conversion;Biomass Analytics;Lipidomics;Metabolomics
published:
2019-11-12
Rezapour, Rezvaneh
(2019)
We are sharing the tweet IDs of four social movements: #BlackLivesMatter, #WhiteLivesMatter, #AllLivesMatter, and #BlueLivesMatter. The tweets were collected between May 1, 2015 and May 30, 2017. We limited the location to the United States and extracted only original tweets, excluding retweets.
Recommended citations for the data:
Rezapour, R. (2019). Data for: How do Moral Values Differ in Tweets on Social Movements?. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9614170_V1
and
Rezapour, R., Ferronato, P., and Diesner, J. (2019). How do moral values differ in tweets on social movements?. In 2019 Computer Supported Cooperative Work and Social Computing Companion Publication (CSCW’19 Companion), Austin, TX.
keywords:
Twitter; social movements; black lives matter; blue lives matter; all lives matter; white lives matter
published:
2020-10-01
Strickland, Lynette
(2020)
These data were collected to assess whether color pattern phenotypes of the polymorphic tortoise beetle, Chelymorpha alternans, mate randomly with one another, and whether there are any reproductive differences between assortative and disassortative pairings.
keywords:
mate choice, color polymorphisms, random mating
published:
2021-10-15
Atomic oxygen densities in the MLT, averaged over 2002-2018 for 26 14-day periods beginning January 1.
keywords:
SABER data
published:
2025-04-04
Fang, Liri; Salami, Malik Oyewale; Weber, Griffin M.; Torvik, Vetle I.
(2025)
This dataset, uCite, is the union of nine large-scale open-access PubMed citation datasets, separated by reliability. There are 20 files, including the reliable and unreliable citation PMID pairs, non-PMID identifiers to PMID mapping (for DOIs, Lens, MAG, and Semantic Scholar), original PMID pairs from the nine resources, some metadata for PMIDs, duplicate PMIDs, some redirected PMID pairs, and PMC OA Patci citation matching results.
The short description of each data file is listed as follows. A detailed description can be found in the README.txt.
<strong>DATASET DESCRIPTION</strong>
<ol>
<li>PPUB.tsv.gz - tsv format file containing reliable citation pairs in uCite.</li>
<li>PUNR.tsv.gz - tsv format file containing unreliable citation pairs in uCite.</li>
<li>DOI2PMID.tsv.gz - tsv format file containing results mapping DOI to PMID. </li>
<li> LEN2PMID.tsv.gz - tsv format file containing results mapping LensID pairs to PMID pairs. </li>
<li> MAG2PMIDsorted.tsv.gz - tsv format file containing results mapping MAG ID to PMID. </li>
<li>SEM2PMID.tsv.gz - tsv format file containing results mapping Semantic Scholar ID to PMID. </li>
<li>JVNPYA.tsv.gz - tsv format file containing metadata of papers with PMID, journal name, volume, issue, pages, publication year, and first author's last name. </li>
<li>TiLTyAlJVNY.tsv.gz - tsv format file containing metadata of papers. </li>
<li> PMC-OA-patci.tsv.gz - tsv format file containing PubMed Central Open Access subset reference strings extracted by \cite{} processed by Patci.</li>
<li>REDIRECTS.gz - txt file containing unreliable PMID pairs mapped to reliable PMID pairs. </li>
<li>REMAP - file containing pairs of duplicate PubMed records (lhs PMID mapped to rhs PMID).</li>
<li> ami_pair.tsv.gz - tsv format file containing all citation pairs from Aminer (2015 version). </li>
<li> dim_pair.tsv.gz - tsv format file containing all citation pairs from Dimensions. </li>
<li> ice_pair.tsv.gz - tsv format file containing all citation pairs from iCite (April 2019 version, version 1). </li>
<li> len_pair.tsv.gz - tsv format file containing all citation pairs from Lens.org (harvested through Oct 2021). </li>
<li>mag_pair.tsv.gz - tsv format file containing all citation pairs from Microsoft Academic Graph (2015 version). </li>
<li> oci_pair.tsv.gz - tsv format file containing all citation pairs from Open Citations (Nov. 2021 dump, csv version ). </li>
<li> pat_pair.tsv.gz - tsv format file containing all citation pairs from Patci (i.e., from "PMC-OA-patci.tsv.gz"). </li>
<li> pmc_pair.tsv.gz - tsv format file containing all citation pairs from PubMed Central (harvest through Dec 2018 via e-Utilities).</li>
<li> sem_pair.tsv.gz - tsv format file containing all citation pairs from Semantic Scholar (2019 version) . </li>
</ol>
<strong>COLUMN DESCRIPTION</strong>
<strong>FILENAME</strong> : <em>PPUB.tsv.gz, PUNR.tsv.gz</em>
(1) fromPMID - PubMed ID of the citing paper.
(2) toPMID - PubMed ID of the cited paper.
(3) sources - citation sources, in which the citation pairs are identified.
(4) fromYEAR - Publication year of the citing paper.
(5) toYEAR - Publication year of the cited paper.
<strong>FILENAME</strong> : <em>DOI2PMID.tsv.gz</em>
(1) DOI - Digital Object Identifier of paper records.
(2) PMID - PubMed ID of paper records.
(3) PMID2 - Digital Object Identifier of paper records, “-” if the paper doesn't have DOIs.
<strong>FILENAME</strong> : <em>SEMID2PMID.tsv.gz</em>
(1) SemID - Semantic Scholar ID of paper records.
(2) PMID - PubMed ID of paper records.
(3) DOI - Digital Object Identifier of paper records, “-” if the paper doesn't have DOIs.
<strong>FILENAME</strong> : <em>JVNPYA.tsv.gz</em>
- Each row refers to a publication record.
(1) PMID - PubMed ID.
(2) journal - Journal name.
(3) volume - Journal volume.
(4) issue - Journal issue.
(5) pages - The first and last page numbers of the publication, separated by '-' (the last page is given without its leading digits).
(6) year - Publication year.
(7) lastname - Last name of the first author.
<strong>FILENAME</strong> : <em>TiLTyAlJVNY.tsv.gz</em>
(1) PMID - PubMed ID.
(2) title_tokenized - Paper title after tokenization.
(3) languages - Language that paper is written in.
(4) pub_types - Types of the publication.
(5) length(authors) - String length of author names.
(6) journal - Journal name.
(7) volume - Journal volume.
(8) issue - Journal issue.
(9) year - Publication year of print (not necessarily epub).
<strong>FILENAME</strong> : <em> PMC-OA-patci.tsv.gz</em>
(1) pmcid - PubMed Central identifier.
(2) pos -
(3) fromPMID - PubMed ID of the citing paper.
(4) toPMID - PubMed ID of the cited paper.
(5) SRC - citation sources, in which the citation pairs are identified.
(6) MatchDB - PubMed, ADS, DBLP.
(7) Probability - Matching probability predicted by Patci.
(8) toPMID2 - PubMed ID of the cited paper, extracted from OA xml file
(9) SRC2 - citation sources, in which the citation pairs are identified.
(10) intxt_id -
(11) jounal - First character of the journal name.
(12) same_ref_string - Y if patci and xml reference string match, otherwise N.
(13) DIFF -
(14) bestSRC - Citation sources, in which the citation pairs are identified.
(15) Match - Matching strings annotated by Patci.
<strong>FILENAME</strong> : <em>REDIRECTS.gz</em>
Each row in REDIRECTS.gz is a string in one of the following formats.
- "REDIRECTED FROM: source PMID_i PMID_j -> PMID_i' PMID_j "
- "REDIRECTED TO: source PMID_i PMID_j -> PMID_i PMID_j' "
Note: source is the name of the source(s) that PMID_i and PMID_j come from.
<strong>FILENAME</strong> : <em>REMAP</em>
Each row remaps one duplicate PubMed record to another (the left-hand PMID is mapped to the right-hand PMID).
The format of each row is "$REMAP{PMID_i} = PMID_j".
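The row format above can be read into a lookup table with a short regular expression. This is a sketch only: it assumes PMIDs are plain integers and tolerates extra characters (such as a trailing semicolon) after the right-hand PMID, neither of which is confirmed by the description. The sample rows are invented.

```python
import re

# Matches rows of the form: $REMAP{PMID_i} = PMID_j
REMAP_LINE = re.compile(r"\$REMAP\{(\d+)\}\s*=\s*(\d+)")


def parse_remap(lines):
    """Build a {left PMID: right PMID} dictionary from REMAP rows."""
    mapping = {}
    for line in lines:
        match = REMAP_LINE.search(line)
        if match:
            mapping[int(match.group(1))] = int(match.group(2))
    return mapping


# Hypothetical rows; the last one is ignored because it does not match.
sample = ["$REMAP{111} = 222", "$REMAP{333} = 444;", "not a remap line"]
remap = parse_remap(sample)
```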
<strong>FILENAME</strong> : <em>ami_pair.tsv.gz, dim_pair.tsv.gz, ice_pair.tsv.gz, len_pair.tsv.gz, mag_pair.tsv.gz, oci_pair.tsv.gz, pat_pair.tsv.gz, pmc_pair.tsv.gz, sem_pair.tsv.gz</em>
(1) fromPMID - PubMed ID of the citing paper.
(2) toPMID - PubMed ID of the cited paper.
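The two-column pair files can be streamed without decompressing them to disk. The sketch below uses only the Python standard library; whether the files carry a header row is not stated here, so non-numeric rows are simply skipped. The in-memory sample stands in for a real file such as pmc_pair.tsv.gz.

```python
import csv
import gzip
import io


def iter_citation_pairs(fileobj):
    """Yield (fromPMID, toPMID) integer pairs from a tab-separated stream."""
    for row in csv.reader(fileobj, delimiter="\t"):
        # Skip a possible header row and any malformed rows.
        if len(row) < 2 or not (row[0].isdigit() and row[1].isdigit()):
            continue
        yield int(row[0]), int(row[1])


# Hypothetical gzip-compressed content with a header row.
raw = gzip.compress(b"fromPMID\ttoPMID\n100\t200\n100\t300\n")
with gzip.open(io.BytesIO(raw), mode="rt", newline="") as handle:
    pairs = list(iter_citation_pairs(handle))
```

For a file on disk, pass its path to `gzip.open` in place of the `io.BytesIO` object.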
keywords:
Citation data; PubMed; Social Science;
published:
2018-06-18
Clark, Lindsay V.; Jin, Xiaoli; Petersen, Karen K.; Anzoua, Kossanou G.; Bagmet, Larissa; Chebukin, Pavel; Deuter, Martin; Dzyubenko, Elena; Dzyubenko, Nicolay; Heo, Kweon; Johnson, Douglas A.; Jørgensen, Uffe; Kjeldsen, Jens B.; Nagano, Hironori; Peng, Junhua; Sabitov, Andrey; Yamada, Toshihiko; Yoo, Ji Hye; Yu, Chang Yeon; Long, Stephen P.; Sacks, Erik J.
(2018)
This repository contains datasets and R scripts that were used in a study of the population structure of Miscanthus sacchariflorus in its native range across East Asia. Notably, genotypes of 764 individuals at 34,605 SNPs, called from reduced-representation DNA sequencing using a non-reference bioinformatics pipeline, are provided. Two similar SNP datasets, used for identifying clonal duplicates and for determining the ancestry of ornamental and hybrid Miscanthus plants identified in previous studies respectively, are also provided. There is also a spreadsheet listing the provenance and ploidy of all individuals along with their plastid (chloroplast) haplotypes. Software output for Structure, Treemix, and DIYABC is also included. See README.txt for more information about individual files. Results of this study are described in a manuscript in revision in Annals of Botany by the same authors, "Population structure of Miscanthus sacchariflorus reveals two major polyploidization events, tetraploid-mediated unidirectional introgression from diploid Miscanthus sinensis, and diversity centered around the Yellow Sea."
keywords:
Miscanthus; restriction site-associated DNA sequencing (RAD-seq); single nucleotide polymorphism (SNP); population genetics; Miscanthus xgiganteus; Miscanthus sacchariflorus; R scripts; germplasm; plastid haplotype
published:
2020-05-15
Mishra, Shubhanshu
(2020)
Trained models for multi-task multi-dataset learning for sequence prediction in tweets
Tasks include POS, NER, Chunking, and SuperSenseTagging
Models were trained using: https://github.com/napsternxg/SocialMediaIE/blob/master/experiments/multitask_multidataset_experiment.py
See https://github.com/napsternxg/SocialMediaIE for details.
keywords:
twitter; deep learning; machine learning; trained models; multi-task learning; multi-dataset learning;
published:
2022-05-13
Yan, Bin; Dietrich, Christopher; Yu, Xiaofei; Dai, Renhuai; Maofa, Yang
(2022)
The files are plain text and contain the original data used in phylogenetic analyses of Typhlocybinae (Bin, Dietrich, Yu, Meng, Dai and Yang 2022: Ecology & Evolution, in press). The three files with extension .phy are text files with aligned DNA sequences in the standard PHYLIP format and correspond to Matrix 1 (amino acid alignment), Matrix 2 (nucleotide alignment of first two codon positions of protein-coding genes) and Matrix 3 (nucleotide alignment of protein-coding genes plus 2 ribosomal genes) described in the Methods section. An additional text file in NEXUS format (.nex extension) contains the morphological character data used in the ancestral state reconstruction (ASCR) analysis described in the Methods. NEXUS is a standard format used by various phylogenetic analysis software. For more information on data file content, see the included "readme" files.
keywords:
Hemiptera; phylogeny; mitochondrial genome; morphology; leafhopper
published:
2025-12-08
Li, Shuai; Moller, Christopher; Mitchell, Noah G.; Martin, Duncan; Sacks, Erik; Saikia, Sampurna; Labonte, Nicholas R.; Baldwin, Brian S.; Morrison, Jesse; Ferguson, John; Leakey, Andrew; Ainsworth, Elizabeth
(2025)
The leaf economics spectrum (LES) describes multivariate correlations in leaf structural, physiological and chemical traits, originally based on diverse C3 species grown under natural ecosystems. However, the specific contribution of C4 species to the global LES is studied less widely. C4 species have a CO2 concentrating mechanism which drives high rates of photosynthesis and improves resource use efficiency, thus potentially pushing them towards the edge of the LES. Here, we measured foliage morphology, structure, photosynthesis, and nutrient content for hundreds of genotypes of the C4 grass Miscanthus × giganteus grown in two common gardens over two seasons. We show substantial trait variations across M. × giganteus genotypes and robust genotypic trait relationships. Compared to the global LES, M. × giganteus genotypes had higher photosynthetic rates, lower stomatal conductance, and less nitrogen content, indicating greater water and photosynthetic nitrogen use efficiency in the C4 species. Additionally, tetraploid genotypes produced thicker leaves with greater leaf mass per area and lower leaf density than triploid genotypes. By expanding the LES relationships across C3 species to include C4 crops, these findings highlight that M. × giganteus occupies the boundary of the global LES and suggest the potential for ploidy to alter LES traits.
keywords:
Feedstock Production;Biomass Analytics;Field Data
published:
2019-03-13
Ando, Amy; Fraterrigo, Jennifer; Guntenspergen, Glenn; Howlader, Aparna; Mallory, Mindy; Olker, Jennifer; Stickley, Samuel
(2019)
keywords:
climate change; conservation; diversification; environmental investments; MPT; portfolio; risk; uncertainty
published:
2025-09-15
Cheng, Ming-Hsun; Dien, Bruce; Lee, D. K.; Singh, Vijay
(2025)
Chemical-free pretreatments are attracting increased interest because they generate less inhibitor in hydrolysates. In this study, pilot-scale continuous hydrothermal (PCH) pretreatment followed by disk refining was evaluated and compared to laboratory-scale batch hot water (LHW) pretreatment. Bioenergy sorghum bagasse (BSB) was pretreated at 160-190 °C for 10 min with and without subsequent disk milling. Hydrothermal pretreatment and disk milling synergistically improved glucose and xylose release by 10-20% compared to hydrothermal pretreatment alone. Maximum yields of glucose and xylose of 82.55% and 70.78%, respectively, were achieved when BSB was pretreated at 190 °C and 180 °C followed by disk milling. LHW pretreated BSB had 5-15% higher sugar yields compared to PCH for all pretreatment conditions. The improvement in surface area was also measured. PCH pretreatment combined with disk milling increased BSB surface area by 31.80-106.93%, which was greater than observed using LHW pretreatment.
keywords:
Conversion;Sustainability;Genomics;Hydrolysate
published:
2017-08-11
Schiffer, Peter; Le, Brian L.
(2017)
Enclosed in this dataset are transport data of kagome connected artificial spin ice networks composed of permalloy nanowires. The data herein are reproductions of the data seen in Appendix B of the dissertation titled "Magnetotransport of Connected Artificial Spin Ice". Field sweeps with the magnetic field applied in-plane were performed in 5 degree increments for armchair orientation kagome artificial spin ice and zigzag orientation kagome artificial spin ice.
keywords:
Magnetotransport; artificial spin ice; nanowires
published:
2022-03-01
Cao, Yanghui; Dietrich, Christopher H.; Zahniser, James N.; Dmitriev, Dmitry A.
(2022)
The following files were used to reconstruct the phylogeny of the leafhopper subfamily Deltocephalinae, using IQ-TREE v1.6.12 and ASTRAL v 4.10.5.
<b>1) taxon_sampling.csv:</b> contains the sequencing ids (1st column) and the taxonomic information (2nd column) of each sample. Sequencing ids were used in the alignment files and partition files.
<b>2) concatenated_nt.phy:</b> concatenated nucleotide alignment used for the maximum likelihood analysis of Deltocephalinae by IQ-TREE v1.6.12. The file lists the sequences of 163,365 nucleotide positions from 429 genes in 730 samples. Hyphens are used to represent gaps.
<b>3) concatenated_nt_partition.nex:</b> the partitions for the concatenated nucleotide alignment. The file partitions the 163,365 nucleotide characters into 429 character sets, and defines the best substitution model for each character set.
<b>4) concatenated_aa.phy:</b> concatenated amino acid alignment used for the maximum likelihood analysis of Deltocephalinae by IQ-TREE v1.6.12. The file gives the sequences of 53,969 amino acids from 429 genes in 730 samples. Hyphens are used to represent gaps.
<b>5) concatenated_aa_partition.nex:</b> the partitions for the concatenated amino acid alignment. The file partitions the 53,969 characters into 429 character sets, and defines the best substitution model for each character set.
<b>6) concatenated_nt_106taxa.phy:</b> a reduced concatenated nucleotide alignment representing 107 samples x 86 genes. This alignment is used to estimate the divergence times of Deltocephalinae using MCMCTree in PAML v4.9. The file lists the sequences of 79,239 nucleotide positions from 86 genes in 107 samples. Hyphens are used to represent gaps.
<b>7) concatenated_nt_106taxa_partition.nex:</b> the partitions for the nucleotide alignment concatenated_nt_106taxa.phy. The file partitions the 79,239 nucleotide characters into 86 character sets, and defines the best substitution model for each character set.
<b>8) individual_gene_alignment.zip:</b> contains 429 FAS files, one for each of the partitioned nucleotide character sets in the concatenated_nt_partition.nex file. Hyphens are used to represent gaps. These files were used to construct gene trees using IQ-TREE v1.6.12, followed by multispecies coalescent analysis using ASTRAL v 4.10.5.
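The sample and position counts quoted for the .phy files can be checked against the files themselves. The sketch below assumes a PHYLIP layout with a header line of two integers (number of sequences, alignment length) and one sequence per line; interleaved or multi-line PHYLIP files would need a different reader. The two-sequence sample is made up.

```python
def summarize_phylip(text):
    """Return (ntax, nchar, consistent) for a one-line-per-taxon PHYLIP text."""
    lines = [line for line in text.splitlines() if line.strip()]
    ntax, nchar = (int(token) for token in lines[0].split()[:2])
    lengths = {}
    for line in lines[1:]:
        name, sequence = line.split(None, 1)
        lengths[name] = len(sequence.replace(" ", ""))
    # Consistent when the header matches both the row count and every length.
    consistent = len(lengths) == ntax and all(
        length == nchar for length in lengths.values()
    )
    return ntax, nchar, consistent


# Hypothetical alignment: 2 sequences of 6 positions, hyphens as gaps.
sample = "2 6\nsampleA ACGT-A\nsampleB AC--TA\n"
summary = summarize_phylip(sample)
```

Under these assumptions, concatenated_nt.phy would be expected to report 730 sequences and 163,365 positions.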
published:
2025-04-23
Gonzalez Mozo, Laura C; Dietrich, Christopher
(2025)
These data files were used for phylogenomic analyses of Darnini and related Membracidae (Hemiptera: Auchenorrhyncha) in the referenced article by Gonzalez-Mozo et al.
- The "mem_50p_alignment.fas" file contains the aligned, concatenated nucleotide sequence data for 51 species and 492 genetic loci included in the phylogenetic analyses ("N" indicates missing data and "-" indicates an alignment gap).
- The file "Table1.rtf" lists the included species, country of origin and GenBank accession number. Species newly sequenced for this study have a Sample ID with prefix "DAR"; previously sequenced species for which data were downloaded from GenBank have "NCBI" indicated in the same column of the table.
- The file "partition_def.txt" lists the 492 genetic loci included in the alignment with their exact positions indicated by the range of numbers given at the end of each line (e.g., locus "uce-1" occupies positions 1-280 in the alignment).
- The substitution model file "mem_50p.model" contains information on the substitution models used in the partitioned maximum likelihood analysis, including the models used for different data partitions and parameter values, as output by the phylogenetic software IQ-TREE.
- Individual tree files in Newick format (plain text) are provided for the phylogeny from concatenated analysis with the best likelihood score ("mem_50p_bestLikelihoodScore"), concatenated likelihood analysis with gene concordance factors ("mem_50p_gcf") and site concordance factors ("mem_50p_scf").
- The tree file from the ASTRAL analysis is "mem_50p_astral".
- The zip archive entitled “IQ-TREE analysis results.zip” includes output from the maximum likelihood analysis of the concatenated nucleotide sequence data, including the following: (1) main output file “mem_50p.iqtree” summarizing model selection, partitioning schemes, likelihood scores, and run parameters; (2) “mem_50p.mldist” including pairwise ML distances between taxa; (3) “mem_50p.best_scheme.nex” with the best partitioning scheme identified by ModelFinder in NEXUS format and (4) “mem_50p.best_scheme” the RAxML-compatible version of the same file.
- The “Ultrafast bootstrap results.zip” zip archive contains: (1) “mem_50p.ufboot” with the bootstrap replicate trees; (2) “mem_50p.contree” with the majority-rule consensus tree with support values; (3) “mem_50p.splits.nex”, with split support values across the replicates; (4) “mem_50p.log” is the log file.
- The “gene_trees.zip” zip archive contains the individual gene trees as input for subsequent coalescent gene tree analysis in the phylogenetic program ASTRAL.
- The file "DarniniAHE_Character Matrix.csv" contains the data for 6 morphological characters for which the ancestral states were reconstructed using the phylogenetic results from analysis of anchored-hybrid data (see article text for details).
- The file "scriptACRDarnini.txt" contains the commands used to reconstruct ancestral morphological character states using the corHMM 2.8 R package. See the Methods section of the article for more details.
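As a sketch of how the position ranges in "partition_def.txt" could be read, the function below extracts the range found at the end of a line and derives the locus length. The exact line layout of the file is not given here, so the sample line is hypothetical; only the "range of numbers at the end of each line" convention and the "uce-1" example (positions 1-280) come from the description above.

```python
import re

# A start-end range at the very end of the line, optionally followed by ';'.
RANGE_AT_END = re.compile(r"(\d+)\s*-\s*(\d+)\s*;?\s*$")


def parse_partition_range(line):
    """Return (start, end, length) for the range ending a partition line."""
    match = RANGE_AT_END.search(line)
    if match is None:
        return None
    start, end = int(match.group(1)), int(match.group(2))
    return start, end, end - start + 1


# Hypothetical line layout; the locus name and range follow the example above.
uce_1 = parse_partition_range("DNA, uce-1 = 1-280")
```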
keywords:
Insecta; Hemiptera; anchored-hybrid enrichment; phylogeny; treehopper
published:
2019-08-15
Simulation data related to the paper "Mastitis risk effect on the economic consequences of paratuberculosis control in dairy cattle: A stochastic modeling study"
keywords:
paratuberculosis;simulation;dairy