Datasets

published: 2019-03-22
 
This data publication provides example video clips related to research on associations between the flight ability of juvenile songbirds at fledging and juvenile morphological traits (wing emergence, wing length, body condition, mass, and tarsus length). File names reflect the species dropped in each video. These videos are supplemental material for scientific publications by the authors and reflect an example subset of all videos collected from 2017-2018 as part of a larger study on the post-fledging ecology of grassland and shrubland birds in east-central Illinois, USA. No birds were harmed/injured in the production of these videos and procedures were approved by the Illinois Institutional Animal Care and Use Committee (IACUC), protocol no. 18221. Individuals depicted in the videos have given consent for the videos to be shared (talent/model release form; https://publicaffairs.illinois.edu/resources/release/).
keywords: songbirds; flight ability; wing development; wing length; wing emergence; nestling development; post-fledging
published: 2022-10-13
 
The text file contains the original DNA nucleotide sequence data used in the phylogenetic analyses of Xue et al. (in review), comprising the 13 protein-coding genes and 2 ribosomal gene subunits of the mitochondrial genome. The text file is marked up according to the standard NEXUS format commonly used by various phylogenetic analysis software packages. The file will be parsed automatically by a variety of programs that recognize NEXUS as a standard bioinformatics file format. The first six lines of the file identify the file as NEXUS, indicate that the file contains data for 30 taxa (species) and 13078 characters, indicate that the characters are DNA sequence, that gaps inserted into the DNA sequence alignment are indicated by a dash, and that missing data are indicated by a question mark. The positions of data partitions are indicated in the mrbayes block of commands for the phylogenetic program MrBayes (version 3.2.6) beginning near the end of the file. The mrbayes block also contains instructions for MrBayes on various non-default settings for that program. These are explained in the Methods section of the submitted manuscript. Two supplementary tables in the provided PDF file provide additional information on the species in the dataset, including the GenBank accession numbers for the sequence data (Table S1) and the DNA substitution models used for each of the individual mitochondrial genes and for different codon positions of the protein-coding genes used for analyses in the programs MrBayes and IQ-Tree (version 1.6.8) (Table S2). Full citations for references listed in Table S1 can be found by searching GenBank using the corresponding accession number. The supplemental tables will also be linked to the article upon publication at the journal website.
keywords: Hemiptera; phylogeny; mitochondrial genome; morphology; leafhopper
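The header settings described for the NEXUS file above (number of taxa, number of characters, data type, gap and missing-data symbols) can be read programmatically. The following is a minimal sketch, not the authors' tooling: `SAMPLE_HEADER` is a hypothetical header assembled from the values stated in the description, and `parse_nexus_header` is an illustrative helper name; the real file's layout may differ.

```python
import re

# Hypothetical header assembled from the values stated in the description;
# the actual file's wording and layout may differ.
SAMPLE_HEADER = """#NEXUS
begin data;
dimensions ntax=30 nchar=13078;
format datatype=dna gap=- missing=?;
matrix
"""


def parse_nexus_header(text):
    """Return the key=value settings found in the DIMENSIONS and FORMAT commands."""
    if not text.lstrip().upper().startswith("#NEXUS"):
        raise ValueError("not a NEXUS file")
    settings = {}
    # NEXUS commands end with a semicolon and keywords are case-insensitive.
    for command in re.findall(r"(?is)\b(?:dimensions|format)\b(.*?);", text):
        for key, value in re.findall(r"(\w+)\s*=\s*(\S+)", command):
            settings[key.lower()] = value
    return settings


header = parse_nexus_header(SAMPLE_HEADER)
print(header)
```

For real analyses, an established NEXUS reader in a phylogenetics package is likely a better choice than a hand-written parser; the sketch only shows where the header values live.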
published: 2019-02-19
 
The organizations that contribute to the longevity of 67 long-lived molecular biology databases published in Nucleic Acids Research (NAR) between 1991 and 2016 were identified to address two research questions: 1) which organizations fund these databases? and 2) which organizations maintain these databases? Funders were determined by examining funding acknowledgements in each database's most recent NAR Database Issue update article (published prior to 2017), and organizations operating the databases were determined through review of database websites.
keywords: databases; research infrastructure; sustainability; data sharing; molecular biology; bioinformatics; bibliometrics
published: 2022-10-10
 
Aerial imagery utilized as input in the manuscript "Deep convolutional neural networks exploit high spatial and temporal resolution aerial imagery to predict key traits in miscanthus". Data were collected over Miscanthus sacchariflorus and M. sinensis breeding trials at the Energy Farm, UIUC in 2020. Flights were performed using a DJI M600 mounted with a MicaSense RedEdge multispectral sensor at 20 m altitude around solar noon. Imagery is available as tif files by field trial and date (10). The post-processing of raw images into orthophotos was performed in Agisoft Metashape software. Each crop surface model and multispectral orthophoto was stacked into a unique raster stack by date and uploaded here. Each raster stack includes 6 layers in the following order: Layer 1 = crop surface model, Layer 2 = Blue, Layer 3 = Green, Layer 4 = Red, Layer 5 = Rededge, and Layer 6 = NIR multispectral bands. Msa raster stacks were resampled to 1.67 cm spatial resolution and Msi raster stacks were resampled to 1.41 cm spatial resolution to ease their integration into further analysis. In the file names, 'MMDDYYYY' is the date of data collection, 'MSA' is the M. sacchariflorus trial, 'MSI' is the M. sinensis trial, 'CSM' is the crop surface model layer, and 'MULTSP' are the five multispectral bands.
keywords: convolutional neural networks; miscanthus; perennial grasses; bioenergy; field phenotyping; remote sensing; UAV
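The layer order given for the raster stacks above matters when bands are addressed by position. The sketch below records that order and, purely as an illustration, computes NDVI for a single pixel. NDVI is not mentioned in the dataset description, and the pixel values, `layer_index`, and `ndvi` are made up for this example; reading the actual tif files would need a raster library, which is left out here.

```python
# Band order as listed in the dataset description (1-based layer numbers).
LAYER_ORDER = {
    "CSM": 1,      # crop surface model
    "Blue": 2,
    "Green": 3,
    "Red": 4,
    "Rededge": 5,
    "NIR": 6,
}


def layer_index(name):
    """Return the 0-based index of a named layer, as most array libraries expect."""
    return LAYER_ORDER[name] - 1


def ndvi(pixel):
    """NDVI for one pixel given as a 6-value sequence in the stack's layer order."""
    red = pixel[layer_index("Red")]
    nir = pixel[layer_index("NIR")]
    if nir + red == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / (nir + red)


# Made-up values for one pixel: CSM height, then the five spectral bands.
example_pixel = [1.8, 0.03, 0.06, 0.04, 0.20, 0.36]
print(round(ndvi(example_pixel), 3))
```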
published: 2023-07-10
 
Bee movement between habitat patches in a naturally fragmented ecosystem depended on species, patch, and matrix variables. Using a mark-recapture methodology in the naturally fragmented Ozark glade ecosystem, we assessed the importance of bee size, nesting biology, the distance between patches (i.e., isolation), and nesting and floral resources in habitat patches and the surrounding matrix on bee movement. This dataset includes seven data files, three R code files, and a QGIS tool. Three of the data files include information collected at the study sites with regard to bees and matrix and patch characteristics. The other four data files are spatial files used to quantify the characteristics of the forest canopy between the study sites and the edge-to-edge distances between the study sites. R code in the R Markdown file recreates the analysis and data presentation for the associated publication. R script files contain processes for calculating some of the explanatory variables used in the analysis. The QGIS tool can be used as the first step to obtaining average values from a raster file where the cells are large relative to the areas of interest (AOI) that you would like to characterize. The second step is contained in one of the aforementioned R scripts. Detected effects included: Larger bees were more likely to move between patches. Bee movement was less likely as the distance between patches increased. However, relatively short distances (~50 m) inhibited movement more than our a priori expectations. Bees were unlikely to move away from home patches with abundant and diverse floral and below-ground nesting resources. When home patches were less resource-rich, bee movement depended on the characteristics of the away patch or the matrix. In these cases, bees were more likely to move to away patches with greater below-ground nesting and floral resources. Matrix habitats with more available floral and below-ground nesting resources appear to impede movement to neighboring patches, potentially because they already provide supplemental resources for bees.
keywords: habitat fragmentation; bees; movement; mark-recapture; nesting resources; floral resources; isolation
published: 2019-05-16
 
This repository includes scripts and datasets for the paper, "Statistically consistent divide-and-conquer pipelines for phylogeny estimation using NJMerge." All data files in this repository are for analyses using the logdet distance matrix computed on the concatenated alignment. Data files for analyses using the average gene-tree internode distance matrix can be downloaded from the Illinois Data Bank (https://doi.org/10.13012/B2IDB-1424746_V1). The latest version of NJMerge can be downloaded from GitHub (https://github.com/ekmolloy/njmerge).
List of Changes:
• Updated timings for NJMerge pipelines to include the time required to estimate distance matrices; this impacted files in the following folder: data.zip
• Replaced "Robinson-Foulds" distance with "Symmetric Difference"; this impacted files in the following folders: tools.zip; data.zip; scripts.zip
• Added some additional information about the java command used to run ASTRAL-III; this impacted files in the following folders: data.zip; astral64-trees.tar.gz (new)
keywords: divide-and-conquer; statistical consistency; species trees; incomplete lineage sorting; phylogenomics
published: 2024-03-25
 
The accompanying study is published under the title "Estimating soil N2O emissions induced by organic and inorganic fertilizer inputs using a Tier-2, regression-based meta-analytic approach for U.S. agricultural lands" in Science of the Total Environment. The study is authored by Dr. Yushu Xia, Dr. Hoyoung Kwon, and Dr. Michelle Wander. The DOI for this study is (TBD). Please refer to the study for detailed data extraction and processing methods.
keywords: soil; nitrous oxide; agriculture; fertilizers; meta-analysis
planned publication date: 2025-01-23
 
These are the responses to an open, convenience sample survey of residents of Illinois to understand their interactions with wild deer. The survey was available on REDCap between December 19, 2022 and December 19, 2023, and was publicized through listserves, Facebook groups, and media reporting. The file "COVID Deer Survey _ REDCap.pdf" contains the codebook for the survey, including the questions; all factor variables have ".factor" added to their name in the dataset. The file "DeerSurveyData.csv" contains the dataset. The file "Score_calculation_for_sharing.R" is the code to create the cleaned dataset used for analysis from the raw survey responses. Throughout, NA is used to represent null/not available/not applicable; this is most likely either a failure to answer the question or, in some cases, a question that was not presented as it is not relevant based on answers to previous questions.
keywords: deer; survey
published: 2023-12-20
 
Important Note: the raw transient files need to be downloaded through this separate link: https://uofi.box.com/s/oagdxhea1wi8tvfij4robj0z0w8wq7j4. Once downloaded, place the file within the .d folder in the unzipped 20210930_ShortTransient_S3_5 folder to perform the reconstruction step. This deposit contains the minimal datasets needed to run the computational pipeline MEISTER introduced in the manuscript titled "Integrative Multiscale Biochemical Mapping of the Brain via Deep-Learning-Enhanced High-Throughput Mass Spectrometry". The key steps of our computational pipeline include (1) tissue mass spectrometry imaging (MSI) reconstruction; (2) multimodal image registration and 3D reconstruction; (3) regional analysis; and (4) single-cell and tissue data integration. Detailed protocols to reproduce the results in the manuscript are provided, along with an example data set for learning the protocols. Our computational processing code is implemented mostly in Python, with MATLAB used for image registration.
keywords: deep learning; mass spectrometry; single cells
published: 2019-05-31
 
The data are provided to illustrate methods in evaluating systematic transactional data reuse in machine learning. A library account-based recommender system was developed using machine learning processing over transactional data of 383,828 transactions (or check-outs) sourced from a large multi-unit research library. The machine learning process utilized the FP-growth algorithm over the subject metadata associated with physical items that were checked-out together in the library. The purpose of this research is to evaluate the results of systematic transactional data reuse in machine learning. The analysis herein contains a large-scale network visualization of 180,441 subject association rules and corresponding node metrics.
keywords: evaluating machine learning; network science; FP-growth; WEKA; Gephi; personalization; recommender systems
published: 2024-02-21
 
Data associated with the manuscript "Niche conservatism and spread explain hybridization and introgression between native and invasive fish" by Jordan H. Hartman, Joel B. Corush, Eric R. Larson, Jeremy S. Tiemann, Philip Willink, and Mark A. Davis. For this project, we combined results of ecological niche models (ENMs) and next-generation restriction site-associated DNA sequencing (RADseq) to test theories of niche conservatism and biotic resistance on the success of invasion, hybridization, and extent of introgression between native Western Banded Killifish and non-native Eastern Banded Killifish. This dataset provides the sampling locations and number of Banded Killifish in each population, accession numbers for RADseq from the National Center for Biotechnology Information Sequence Read Archive and the assignment of each Banded Killifish, the habitat associations of each population from the ENMs, and the occurrence points used to build the ENMs.
keywords: Banded Killifish; ecological niche model; Fundulus diaphanus; hybrid swarm; invasive species; Laurentian Great Lakes
published: 2023-06-29
 
This database provides estimates of agricultural and food commodity flows [in both tons and $US] between the US and China for the year 2017. Pairwise information is provided between US states and Chinese provinces, and US counties and Chinese provinces for 7 Standardized Classification of Transported Goods (SCTG) commodity categories. Additionally, crosswalks are provided to match Harmonized System (HS) codes and China's Multi-Regional Input Output (MRIO) commodity sectors to their corresponding SCTG commodity codes. The included SCTG commodities are:
- SCTG 01: live animals and fish
- SCTG 02: cereal grains
- SCTG 03: agricultural products (except for animal feed, cereal grains, and forage products)
- SCTG 04: animal feed, eggs, honey, and other products of animal origin
- SCTG 05: meat, poultry, fish, seafood, and their preparations
- SCTG 06: milled grain products and preparations, and bakery products
- SCTG 07: other prepared foodstuffs, fats and oils
For additional information, please see the related paper by Pandit et al. (2022) in Environmental Research Letters. ADD DOI WHEN RECEIVED
keywords: Food flows; High-resolution; County-scale; Bilateral; United States; China
published: 2024-03-25
 
This is the dataset for the manuscript titled "Differing physiological performance of coexisting cool- and warmwater fish species under heatwaves in the Midwestern United States".
keywords: climate change; heat wave; metabolic rate; swimming; predator-prey interaction; thermal tolerance; Sander vitreus; walleye; largemouth bass; species distributions
published: 2018-09-06
 
The XSEDE program manages the database of allocation awards for the portfolio of advanced research computing resources funded by the National Science Foundation (NSF). The database holds data for allocation awards dating from the start of the TeraGrid program in 2004 to the present, with awards continuing through the end of the second XSEDE award in 2021. The project data include lead researcher and affiliation, title and abstract, field of science, and the start and end dates. Along with the project information, the data set includes resource allocation and usage data for each award associated with the project. The data show the transition of resources over a fifteen-year span along with the evolution of researchers, fields of science, and institutional representation.
keywords: allocations; cyberinfrastructure; XSEDE
published: 2024-01-19
 
This data set is related to a SoyFACE experiment conducted in 2004, 2006, 2007, and 2008 with the soybean cultivars Loda and HS93-4118. The experiment looked at how seed elements were affected by elevated CO2 and yield. In this V2, 2 new files were added per journal requirement. In total, there are 5 data files in text format within digrado_et_al_gcb_data_V2 and 1 readme file. The file names are listed below. Details about headers are explained in the readme.txt file.
1. ionomic_data.txt contains the ionomic data (mg/kg) for the two cultivars. The file contains all six technical replicates for each plot. The cultivar, year, treatment, and the plot from which the samples were collected are given for each entry.
2. yield_data.txt contains the yield data for the two cultivars (seed yield in kg/ha, seed yield in bu/a, Protein (%), Oil (%)). The file contains yield data for every plot. The cultivar, year, treatment, and the plot from which the samples were collected are given for each entry.
3. mineral_pro_oil_yield.txt contains the yield per hectare for each mineral (g/ha) along with the yield per hectare for protein and oil (t/ha). This was obtained by multiplying the seed content of each element (minerals, protein, and oil) by the total seed yield. The file contains yield data for every plot. The cultivar, year, treatment, and the plot from which the samples were collected are given for each entry.
4. economic_assessment.txt contains data used to assess the financial impact of altered seed oil content on soybean oil production.
5. meteorological_data.txt contains the meteorological data recorded by a weather station located ~3 km from the experimental site (Willard Airport Champaign). Data covering the period between May 28 and September 24 were used for 2004; between May 25 and September 24 for 2006; between May 23 and September 17 for 2007; and between June 16 and October 24 for 2008.
keywords: protein; oil; mineral; SoyFACE; nutrient; Glycine max; soybean; yield; CO2; agriculture; climate change
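The per-hectare yields described for mineral_pro_oil_yield.txt above are the product of seed content and seed yield, with a unit conversion. The sketch below works through that arithmetic under the units stated in the description (mg/kg for minerals, % for protein and oil, kg/ha for seed yield). The function names and the input numbers are made up for illustration; the authors' exact calculation may include steps not shown here.

```python
def mineral_yield_g_per_ha(content_mg_per_kg, seed_yield_kg_per_ha):
    """Mineral yield in g/ha: (mg/kg) * (kg/ha) gives mg/ha, then mg -> g."""
    return content_mg_per_kg * seed_yield_kg_per_ha / 1000.0


def fraction_yield_t_per_ha(percent_of_seed, seed_yield_kg_per_ha):
    """Protein or oil yield in t/ha: percent -> fraction, then kg -> t."""
    return percent_of_seed / 100.0 * seed_yield_kg_per_ha / 1000.0


# Made-up example values, not taken from the dataset.
iron_mg_per_kg = 80.0
protein_percent = 36.0
seed_yield_kg_per_ha = 3500.0

print(mineral_yield_g_per_ha(iron_mg_per_kg, seed_yield_kg_per_ha))
print(fraction_yield_t_per_ha(protein_percent, seed_yield_kg_per_ha))
```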
published: 2024-03-25
 
Diversity - PubMed dataset
Contact: Apratim Mishra (March 22, 2024)
This dataset presents article-level (pmid) and author-level (auid) diversity data for PubMed articles. The selection chosen includes articles retrieved from Authority 2018 [1], a total of 228 040 papers and 440 310 authors. The sample of papers is based on the top 40 journals in the dataset, limited to 2-10 authors published between 1990 – 2010, and stratified on paper count per year. Additionally, this dataset is limited to papers where the lead author is affiliated with one of the four countries: the US, the UK, Canada, and Australia. Files are encoded with ‘utf-8’.
File1: auids_plos.csv (Important columns defined, 7 in total)
• AUID: a unique ID for each author
• Ethnea: ethnicity prediction
• Genni: gender prediction
File2: pmids_plos.csv (Important columns defined, 33 in total)
• pmid: unique paper ID
• year: Year of paper publication
• no_authors: Author count
• journal: Journal name
• years: first year of publication for every author
• age_bin: Binned age for every author
• Country-temporal: Country of affiliation for every author
• h_index: Journal h-index
• TimeNovelty: Paper Time novelty [2]
• nih_funded: Binary variable indicating NIH funding for any author
• prior_cit_mean: Mean of all authors’ prior citation rate
• Insti_impact_all: All authors’ respective institutions’ citation count
• Insti_impact: Maximum of all institutions’ citation count
• mesh_vals: Top MeSH values for every author for that paper
• outer_mesh_vals: MeSH qualifiers for every author for that paper
• relative_citation_ratio: RCR
The ‘Readme’ includes a description for all columns.
[1] Torvik, Vetle; Smalheiser, Neil (2021): Author-ity 2018 - PubMed author name disambiguated dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2273402_V1
[2] Mishra, Shubhanshu; Torvik, Vetle I. (2018): Conceptual novelty scores for PubMed articles. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5060298_V1
keywords: Diversity; PubMed; Citation
published: 2024-02-27
 
Coups d'État are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup d’État Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e., realized, unrealized, or conspiracy), the type of actor(s) who initiated the coup (i.e., military, rebels, etc.), as well as the fate of the deposed leader.
Version 2.1.3 adds 19 additional coup events to the data set, corrects the date of a coup in Tunisia, and reclassifies an attempted coup in Brazil in December 2022 to a conspiracy.
Version 2.1.2 added 6 additional coup events that occurred in 2022 and updated the coding of an attempted coup event in Kazakhstan in January 2022.
Version 2.1.1 corrected a mistake in version 2.1.0, where the designation of “dissident coup” had been dropped in error for coup_id: 00201062021. Version 2.1.1 fixed this omission by marking the case as both a dissident coup and an auto-coup.
Version 2.1.0 added 36 cases to the data set and removed two cases from the v2.0.0 data. This update also added actor coding for 46 coup events and added executive outcomes to 18 events from version 2.0.0. A few other changes were made to correct inconsistencies in the coup ID variable and the date of the event.
Version 2.0.0 improved several aspects of the previous version (v1.0.0) and incorporated additional source material to include:
• Reconciling missing event data
• Removing events with irreconcilable event dates
• Removing events with insufficient sourcing (each event needs at least two sources)
• Removing events that were inaccurately coded as coup events
• Removing variables that fell below the threshold of inter-coder reliability required by the project
• Removing the spreadsheet ‘CoupInventory.xls’ because of inadequate attribution and citations in the event summaries
• Extending the period covered from 1945-2005 to 1945-2019
• Adding events from Powell and Thyne’s Coup Data (Powell and Thyne, 2011)
Items in this Dataset
1. Cline Center Coup d'État Codebook v.2.1.3 Codebook.pdf - This 15-page document describes the Cline Center Coup d’État Project dataset. The first section of this codebook provides a summary of the different versions of the data. The second section provides a succinct definition of a coup d’état used by the Coup d'État Project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The third section describes the methodology used to produce the data. Revised February 2024
2. Coup Data v2.1.3.csv - This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup d’État Project. It contains 29 variables and 1000 observations. Revised February 2024
3. Source Document v2.1.3.pdf - This 325-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify that particular event. Revised February 2024
4. README.md - This file contains useful information for the user about the dataset. It is a text file written in markdown language. Revised February 2024
Citation Guidelines
1. To cite the codebook (or any other documentation associated with the Cline Center Coup d’État Project Dataset) please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Scott Althaus. 2024. “Cline Center Coup d’État Project Dataset Codebook”. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.3. February 27. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V7
2. To cite data from the Cline Center Coup d’État Project Dataset please use the following citation (filling in the correct date of access): Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Emilio Soto. 2024. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.3. February 27. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V7
published: 2024-03-28
 
Read me file for the data repository
*******************************************************************************
This repository has raw data for the publication "Enhancing Carrier Mobility In Monolayer MoS2 Transistors With Process Induced Strain". We arrange the data following the figure in which it first appeared. For all electrical transfer measurements, we provide the up-sweep and down-sweep data, with voltage in units of V and conductance in units of S. All Raman modes have units of cm^-1.
*******************************************************************************
How to use this dataset
All data in this dataset are stored in binary NumPy array format as .npy files. To read a .npy file, use the NumPy module for Python and its np.load() command. Example: suppose the filename is example_data.npy. To load it into a Python program, open a Jupyter notebook or a Python script and run:
    import numpy as np
    data = np.load("example_data.npy")
The contents of the example file are then stored in the data object.
*******************************************************************************
published: 2016-12-13
 
BAM files for founding strain (MG1655-motile) as well as evolved strains from replicate motility selection experiments in low-viscosity agar plates containing either rich medium (LB) or minimal medium (M63+0.18mM galactose)
published: 2022-03-25
 
This upload includes the 16S.B.ALL in 100-HF condition (referred to as 16S.B.ALL-100-HF) used in Experiment 3 of the WITCH paper (currently accepted in principle by the Journal of Computational Biology). The 100-HF condition refers to making sequences fragmentary with an average length of 100 bp and a standard deviation of 60 bp. Additionally, we required all fragmentary sequences to have lengths > 50 bp. Thus, the final average length of the fragments is slightly higher than 100 bp (~120 bp). In this case (i.e., 16S.B.ALL-100-HF), 1,000 sequences with lengths within 25% of the median length are retained as "backbone sequences", while the remaining sequences are considered "query sequences" and made fragmentary using the "100-HF" procedure. Backbone sequences are aligned using MAGUS (or we extract their reference alignment). Then, the fragmentary versions of the query sequences are added back to the backbone alignment using either MAGUS+UPP or WITCH. More details of the tar.gz file are described in README.txt.
keywords: MAGUS; UPP; Multiple Sequence Alignment; eHMMs
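The "100-HF" procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' script: it assumes fragment lengths are drawn from a normal distribution (mean 100, standard deviation 60) and redrawn until they exceed 50 bp, and that the fragment start is chosen uniformly at random. The names `draw_fragment_length`, `make_fragment`, and `split_backbone` are hypothetical, and the sketch does not sample exactly 1,000 backbone sequences as the dataset does.

```python
import random
import statistics


def draw_fragment_length(rng, mean=100, sd=60, min_len=50):
    """Draw a length from N(mean, sd), redrawing until it exceeds min_len."""
    while True:
        length = int(round(rng.gauss(mean, sd)))
        if length > min_len:
            return length


def make_fragment(seq, rng):
    """Cut one random contiguous fragment out of seq."""
    # Sequences shorter than the drawn length are returned whole.
    length = min(draw_fragment_length(rng), len(seq))
    start = rng.randrange(len(seq) - length + 1)
    return seq[start:start + length]


def split_backbone(seqs, tolerance=0.25):
    """Split sequences into those within tolerance of the median length and the rest."""
    median = statistics.median(len(s) for s in seqs.values())
    low, high = (1 - tolerance) * median, (1 + tolerance) * median
    backbone = {name: s for name, s in seqs.items() if low <= len(s) <= high}
    queries = {name: s for name, s in seqs.items() if name not in backbone}
    return backbone, queries


rng = random.Random(0)  # fixed seed so the sketch is repeatable
full_length = "ACGT" * 100
fragment = make_fragment(full_length, rng)
print(len(fragment))
```

Discarding draws of 50 bp or less shifts the mean fragment length above 100 bp, which is consistent with the ~120 bp average reported in the description.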
published: 2018-05-06
 
This deposit contains all raw data and analysis from the paper "In-cell titration of small solutes controls protein stability and aggregation". Data is collected into several types:
1) analysis*.tar.gz are the analysis scripts and the resulting data for each cell. The numbers correspond to the numbers shown in Fig. S1 (in publication).
2) scripts.tar.gz contains helper scripts to create the dataset in bash format.
3) input.tar.gz contains headers and other information that is fed into bash scripts to create the dataset.
4) All rawData*.tar.gz are tarballs of the data of cells in different solutes in .mat files readable by MATLAB, as follows:
- Each experiment included in the publication is represented by two MATLAB files: (1) a calibration jump under amber illumination (_calib.mat suffix) and (2) a full jump under blue illumination (FRET data).
- Each file contains the following fields:
    coordleft - coordinates of cropped and aligned acceptor channel on the original image
    coordright - coordinates of cropped and aligned donor channel on the original image
    dataleft - a 3d 12-bit integer matrix containing acceptor channel fluorescence for each pixel and time step. Not available in _calib files
    dataright - a 3d 12-bit integer matrix containing donor channel fluorescence for each pixel and time step. This will be mCherry in _calib files and AcGFP in data files.
    frame1 - original image size
    imgstd - cropped dimensions
    numFrames - number of frames in dataleft and dataright
    videos - a structure file containing camera data. Specifically, videos.TimeStamp includes the time from each frame.
keywords: Live cell; FRET microscopy; osmotic challenge; intracellular titrations; protein dynamics
published: 2016-06-06
 
These datasets represent first-time collaborations between first and last authors (with mutually exclusive publication histories) on papers with 2 to 5 authors in years [1988,2009] in PubMed. Each record of each dataset captures aspects of the similarity, nearness, and complementarity between two authors about the paper marking the formation of their collaboration.
published: 2022-05-13
 
The files are plain text and contain the original data used in phylogenetic analyses of Typhlocybinae (Bin, Dietrich, Yu, Meng, Dai and Yang 2022: Ecology & Evolution, in press). The three files with extension .phy are text files with aligned DNA sequences in the standard PHYLIP format and correspond to Matrix 1 (amino acid alignment), Matrix 2 (nucleotide alignment of first two codon positions of protein-coding genes) and Matrix 3 (nucleotide alignment of protein-coding genes plus 2 ribosomal genes) described in the Methods section. An additional text file in NEXUS format (.nex extension) contains the morphological character data used in the ancestral state reconstruction (ASCR) analysis described in the Methods. NEXUS is a standard format used by various phylogenetic analysis software. For more information on data file content, see the included "readme" files.
keywords: Hemiptera; phylogeny; mitochondrial genome; morphology; leafhopper
published: 2024-03-21
 
Impact assessment is an evolving area of research that aims at measuring and predicting the potential effects of projects or programs. Measuring the impact of scientific research is a vibrant subdomain, closely intertwined with impact assessment. A recurring obstacle pertains to the absence of an efficient framework which can facilitate the analysis of lengthy reports and text labeling. To address this issue, we propose a framework for automatically assessing the impact of scientific research projects by identifying pertinent sections in project reports that indicate the potential impacts. We leverage a mixed-method approach, combining manual annotations with supervised machine learning, to extract these passages from project reports. This is a repository to save datasets and code related to this project. Please read and cite the following paper if you would like to use the data: Becker M., Han K., Werthmann A., Rezapour R., Lee H., Diesner J., and Witt A. (2024). Detecting Impact Relevant Sections in Scientific Research. The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING).
This folder contains the following files:
- evaluation_20220927.ods: Annotated German passages (Artificial Intelligence, Linguistics, and Music) - training data
- annotated_data.big_set.corrected.txt: Annotated German passages (Mobility) - training data
- incl_translation_all.csv: Annotated English passages (Artificial Intelligence, Linguistics, and Music) - training data
- incl_translation_mobility.csv: Annotated German passages (Mobility) - training data
- ttparagraph_addmob.txt: German corpus (unannotated passages)
- model_result_extraction.csv: Extracted impact-relevant passages from the German corpus based on the model we trained
- rf_model.joblib: The random forest model we trained to extract impact-relevant passages
Data processing code can be found at: https://github.com/khan1792/texttransfer
keywords: impact detection; project reports; annotation; mixed-methods; machine learning
published: 2024-01-31
 
Data associated with the manuscript "Stable isotopes and diet metabarcoding reveal trophic overlap between native and invasive Banded Killifish (Fundulus diaphanus) subspecies." by Jordan H. Hartman, Mark A. Davis, Nicholas J. Iacaruso, Jeremy S. Tiemann, Eric R. Larson. For this project, we sampled six locations in Michigan and Illinois for Eastern and Western Banded Killifish and primary consumers. Using stable isotope analysis, we found that Eastern Banded Killifish had higher variance in littoral dependence and trophic position than Western Banded Killifish, but both stable isotope and gut content metabarcoding analyses revealed an overlap in the diet composition and trophic position between the subspecies. This dataset provides the sampling locations, accession numbers for gut content metabarcoding data from the National Center for Biotechnology Information Sequence Read Archive, the assignment of each family used in the gut content metabarcoding analysis as littoral, pelagic, terrestrial, or parasite, and the raw stable isotope data from University of California Davis.
keywords: non-game fish; invasive species; imperiled species; stable isotope analysis; gut content metabarcoding