Illinois Data Bank Dataset Search Results

published: 2020-08-18

Althaus, Scott; Berenbaum, May; Jordan, Jenna; Shalmon, Dan (2020): Replication Data for "No buzz for bees: Media coverage of pollinator decline". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4237085_V1

These data and code enable replication of the findings and robustness checks in "No buzz for bees: Media coverage of pollinator decline," published in Proceedings of the National Academy of Sciences of the United States of America (2020). In this paper, we note that although widespread declines in insect biomass and diversity are increasing concern within the scientific community, it remains unclear whether attention to pollinator declines has also increased within information sources serving the general public. Examining patterns of journalistic attention to the pollinator population crisis can also inform efforts to raise awareness about the importance of declines of insect species providing ecosystem services beyond pollination. We used the Global News Index developed by the Cline Center for Advanced Social Research at the University of Illinois at Urbana-Champaign to track news attention to pollinator topics in nearly 25 million news items published by two American national newspapers and four international wire services over the past four decades. We provide a link to documentation of the Global News Index in the "relationships with articles, code, o. We found vanishingly low levels of attention to pollinator population topics relative to coverage of climate change, which we use as a comparison topic. In the most recent subset of ~10 million stories published from 2007 to 2019, 1.39% (137,086 stories) refer to climate change/global warming, while only 0.02% (1,780) refer to pollinator populations in all contexts and just 0.007% (679) refer to pollinator declines. Substantial increases in news attention were detectable only in U.S. national newspapers. We also find that while climate change stories appear primarily in newspaper “front sections”, pollinator population stories remain largely marginalized in “science” and “back section” reports. At the same time, news reports about pollinator populations increasingly link the issue to climate change, which might ultimately help raise public awareness to effect needed policy changes.
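The reported percentages can be cross-checked against the story counts. The description gives the size of the 2007 to 2019 subset only as "~10 million", so the sketch below back-computes an implied total from the climate-change figures. That implied total is an assumption made for illustration (the 1.39% figure is itself rounded), not a number taken from the paper.

```python
# Story counts and the climate-change share as quoted in the description.
climate_stories = 137_086
climate_share_pct = 1.39
pollinator_all_stories = 1_780
pollinator_decline_stories = 679

# Assumption: the subset size is implied by the climate-change count and share.
implied_total = climate_stories / (climate_share_pct / 100)


def share_pct(count: int, total: float = implied_total) -> float:
    """Return count as a percentage of the (implied) total number of stories."""
    return 100 * count / total


pollinator_all_pct = share_pct(pollinator_all_stories)
pollinator_decline_pct = share_pct(pollinator_decline_stories)

print(f"implied total stories: {implied_total:,.0f}")
print(f"pollinator populations: {pollinator_all_pct:.2f}%")
print(f"pollinator declines: {pollinator_decline_pct:.3f}%")
```

Under that assumption the two pollinator shares round to the 0.02% and 0.007% quoted above.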

keywords: News Coverage; Text Analytics; Insects; Pollinator; Cline Center; Cline Center for Advanced Social Research; political; social; political science; Global News Index; Archer; news; mass communication; journalism

published: 2021-05-21

Willson, James; Roddur, Mrinmoy Saha; Baqiao, Liu; Zaharias, Paul; Warnow, Tandy (2021): Data from: "Inferring Species Trees from Gene-Family with Duplication and Loss using Multi-Copy Gene-Family Tree Decomposition". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4050038_V1

Data sets from "Inferring Species Trees from Gene-Family with Duplication and Loss using Multi-Copy Gene-Family Tree Decomposition." The dataset contains trees and sequences simulated with gene duplication and loss under a variety of different conditions. Note:
- trees.tar.gz contains the simulated gene-family trees used in our experiments (both true trees from SimPhy as well as trees estimated from alignments).
- sequences.tar.gz contains simulated sequence data used for estimating the gene-family trees as well as the concatenation analysis.
- biological.tar.gz contains the gene trees used as inputs for the experiments we ran on empirical data sets as well as species trees outputted by the methods we tested on those data sets.
- stats.txt lists statistics (such as AD, MGTE, and average size) for our simulated model conditions.

keywords: gene duplication and loss; species-tree inference; simulated data

published: 2021-06-25

Szydlowski, Daniel; Daniels, Melissa; Larson, Eric (2021): Data for Do rusty crayfish (Faxonius rusticus) invasions affect water clarity in north temperate lakes?. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4293962_V1

Data associated with the manuscript "Do rusty crayfish invasions affect water clarity in north temperate lakes?" by Daniel K. Szydlowski, Melissa K. Daniels, and Eric R. Larson.

keywords: chlorophyll a; crayfish; Faxonius rusticus; invasive species; lakes; LandSat; remote sensing; rusty crayfish; Secchi disc; water clarity

published: 2021-06-24

Kraft, Mary L.; Yeager, Ashley N.; Weber, Peter K. (2021): NanoSIMS depth profiling data of an MDCK cell. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3927212_V1

This dataset consists of the secondary ion mass spectrometry (SIMS) depth profiling data that was collected with a Cameca NanoSIMS 50 instrument from a 10 micron by 10 micron region on a Madin-Darby canine kidney (MDCK) cell that had been metabolically labeled so most of its sphingolipids and cholesterol contained the rare nitrogen-15 and oxygen-18 isotopes, respectively.

keywords: secondary ion mass spectrometry; NanoSIMS; depth profiling; MDCK cell; sphingolipids; cholesterol

published: 2023-10-26

Maffeo, Christopher; Aksimentiev, Aleksei (2023): Simulation trajectories for "A DNA turbine powered by a transmembrane potential across a nanopore". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3458097_V1

Simulation trajectory data and scripts for Nature Nanotechnology manuscript "A DNA turbine powered by a transmembrane potential across a nanopore" that demonstrates a rationally designed nanoscale DNA-origami turbine with three chiral blades that uses a transmembrane electrochemical potential across a nanopore to drive a DNA bundle into sustained unidirectional rotations of up to 10 revolutions/s. Driven by the asymmetric mobility of a DNA duplex, the rotation direction of the turbine is set by its designed chirality and the salinity of the solvent.

keywords: All-atom MD simulation; DNA; nanotechnology; motors and rotors

published: 2020-06-26

Gasparik, Jessica T.; Ye, Qing; Curtis, Jeffrey H.; Presto, Albert A.; Donahue, Neil M.; Sullivan, Ryan C.; West, Matthew; Riemer, Nicole (2020): Data from: Quantifying Errors in the Aerosol Mixing-State Index Based on Limited Particle Sample Size. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2774261_V1

This dataset contains the PartMC-MOSAIC simulations used in the article "Quantifying Errors in the Aerosol Mixing-State Index Based on Limited Particle Sample Size". The output data from the 1000 simulations are organized into a series of archived folders, each containing 100 scenarios. Within each scenario directory are 25 NetCDF files, which are the hourly output of a PartMC-MOSAIC simulation containing all information regarding the environment, particle and gas state. This dataset was used to investigate the impact of sample size on determining aerosol mixing state. The data may also be useful as a data set for applying different types of estimators.

keywords: Atmospheric aerosols; single-particle measurements; sampling uncertainty; NetCDF

published: 2021-07-10

Xie, Jiayang; Fernandes, Samuel; Mayfield-Jones, Dustin; Erice, Gorka; Choi, Min; Lipka, Alexander; Leakey, Andrew (2021): Optical topometry and machine learning to rapidly phenotype stomatal patterning traits for maize QTL mapping. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-8275554_V1

This dataset contains the images of the B73xMS71 RIL population used in QTL linkage mapping for maize epidermal traits in 2016 and 2017.
2016RIL_all_mns.rar and 2017RIL_all_mns.rar: contain raw images produced by a Nanofocus μsurf Explorer Optical Topometer (Oberhausen, Germany) at 20X magnification with 0.6 numerical aperture. Files were processed in Nanofocus μsurf analysis extended software (Oberhausen, Germany).
2016RIL_all_TIF.rar and 2017RIL_all_TIF.rar: contain images processed from the Topology layer in each nms file to strengthen the edges of cell outlines, and used in downstream cell detection.
2016RIL_all_detection_result.rar and 2017RIL_all_detection_result.rar: contain images with epidermal cells predicted using the Mask R-CNN model.
training data.rar: contains images used for Mask R-CNN model training and validation.

keywords: stomata; Mask R-CNN; cell segmentation; water use efficiency

published: 2022-07-19

Parmar, Dharmeshkumar; Jia, Jin; Shrout, Joshua; Sweedler, Jonathan; Bohn, Paul (2022): Effect of Micro-patterned Mucin on Quinolone and Rhamnolipid Profiles of Mucoid Pseudomonas aeruginosa under Antibiotic Stress. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0382919_V1

#### Details of Pseudomonas aeruginosa biofilm dataset ####
----------------*Folder Structure*-------------------------------------
This dataset contains peak intensity tables extracted from mass spectrometry imaging (MSI) data using the tools SCiLS and MSI reader. There are 2 folders in "MSI-Data-Paeruginosa-biofilms-UIUC-DP-JVS-July2022.zip"; each folder contains 3 sub-folders as listed below.
1. PellicleBiofilms-and-Supernatant [Pellicle biofilms collected from the air-liquid interface and spent supernatant medium after a 96 h incubation period]: (1) Full-Scan-Data-96h; (2) MSMS-data-from-C7-Quinolones-96h; and (3) MSMS-data-from-C9-Quinolones-96h
2. StaticBiofilms [Static biofilms grown on mucin surface]: (1) Full-Scan-Data; (2) MSMS-data-from-C7-Quinolones; and (3) MSMS-data-from-C9-Quinolones
----------------*File name*----------------------------------------------
Sample information is included in the file names for easy identification and processing. Attributes covered in file names are explained in the example below.
*Example file name "Rep1-Stat-FRD1-mPat-48-FS"*
~ Each unit of information is separated by "-"
~ Unit 1 - "Rep1" - Biological replicate (Rep1, Rep2, and Rep3)
~ Unit 2 - "Stat" - Sample type (Stat = Static biofilm, Pel = Pellicle biofilm, Sup = Supernatant)
~ Unit 3 - "FRD1" - Strain (FRD1 = Mucoid strain, PAO1C = Non-mucoid strain)
~ Unit 4 - "mPat" - Type of mucin surface used (mPat = patterned mucin surface, mUni = uniform mucin surface)
~ Unit 5 - "48" - Sample time point (hours = 48, 72, 96)
~ Unit 6 - "FS" - Scan type used in MSI (FS = high resolution full-scan, 260 = targeted MS/MS of C7 quinolones (m/z 260), 288 = targeted MS/MS of C9 quinolones (m/z 288))
----------------*File structure*------------------------------------------
All MSI data have been exported to CSV format. Each CSV file contains information about scan number, coordinates (x,y,z), m/z values, extraction window (absolute), and corresponding intensities in the form of a matrix.
----------------*End of Information*--------------------------------------
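The file-name convention described above lends itself to a small parser. The following is a minimal Python sketch, not part of the dataset: the function and class names are hypothetical, and it assumes every file name carries all six units in the documented order (the description does not say whether pellicle and supernatant files include the mucin-surface unit) and that an optional ".csv" extension may be present.

```python
from dataclasses import dataclass

# Code tables copied from the dataset description.
SAMPLE_TYPES = {"Stat": "Static biofilm", "Pel": "Pellicle biofilm", "Sup": "Supernatant"}
STRAINS = {"FRD1": "Mucoid strain", "PAO1C": "Non-mucoid strain"}
MUCIN_SURFACES = {"mPat": "patterned mucin surface", "mUni": "uniform mucin surface"}
SCAN_TYPES = {
    "FS": "high resolution full-scan",
    "260": "targeted MS/MS of C7 quinolones (m/z 260)",
    "288": "targeted MS/MS of C9 quinolones (m/z 288)",
}
TIME_POINTS_H = {48, 72, 96}


@dataclass(frozen=True)
class SampleInfo:
    replicate: int
    sample_type: str
    strain: str
    mucin_surface: str
    time_point_h: int
    scan_type: str


def parse_sample_name(name: str) -> SampleInfo:
    """Split a name such as 'Rep1-Stat-FRD1-mPat-48-FS' into its six units."""
    # Assumption: a trailing '.csv' extension, if present, is not part of the units.
    stem = name[:-4] if name.lower().endswith(".csv") else name
    units = stem.split("-")
    if len(units) != 6:
        raise ValueError(f"expected 6 units separated by '-', got {len(units)}: {name!r}")
    rep, sample, strain, surface, hours, scan = units
    if not (rep.startswith("Rep") and rep[3:].isdigit()):
        raise ValueError(f"unrecognized replicate unit: {rep!r}")
    if not hours.isdigit() or int(hours) not in TIME_POINTS_H:
        raise ValueError(f"unrecognized time point: {hours!r}")
    # Unknown codes raise KeyError, which flags names outside the documented scheme.
    return SampleInfo(
        replicate=int(rep[3:]),
        sample_type=SAMPLE_TYPES[sample],
        strain=STRAINS[strain],
        mucin_surface=MUCIN_SURFACES[surface],
        time_point_h=int(hours),
        scan_type=SCAN_TYPES[scan],
    )


info = parse_sample_name("Rep1-Stat-FRD1-mPat-48-FS")
print(info)
```

Failing loudly on unknown codes is a deliberate choice here: it surfaces file names that depart from the documented scheme instead of silently mislabeling them.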

keywords: mass spectrometry imaging (MSI); biofilm; antibiotic resistance; Pseudomonas aeruginosa; quorum sensing; rhamnolipids

published: 2019-05-22

Lao, Yuyang; Schiffer, Peter (2019): Isolated artificial spin ice kinetics. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0214000_V1

This is the experimental data of isolated nanomagnet islands with or without the presence of large nanomagnet islands. The small islands are made of Permalloy and measure 170 nm by 470 nm by 2.5 nm. The systems are measured around room temperature, at a temperature where the small islands are fluctuating. The data is recorded as photoemission electron microscopy intensity. More details about the data can be found in the note.txt and Spe_2016.xlsx files. Note: The raw data folders were split into five volumes during compression. All five volumes are needed in order to recover the original folder.

keywords: artificial spin ice; magnetism

published: 2021-11-03

Liu, Baqiao; Warnow, Tandy (2021): Data from Scalable Species Tree Inference with External Constraints. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2566000_V1

This dataset contains re-estimated gene trees from the ASTRAL-II [1] simulated datasets. The re-estimated variants of the datasets are called MC6H and MC11H -- they are derived from the MC6 and MC11 conditions from the original data (the MC6 and MC11 names are given by ASTRID [2]). The uploaded files contain the sequence alignments (truncated to half the length of the original alignments) and the gene trees re-estimated using FastTree2. Note:
- "mc6h.tar.gz" and "mc11h.tar.gz" contain the sequence alignments and the re-estimated gene trees for the two conditions
- the sequence alignments are in the format "all-genes.phylip.splitted.[i].half", where i means that this alignment is derived from the i-th alignment of the original dataset, truncated to half its length
- "g1000.trees" under each replicate contains the newline-separated re-estimated gene trees. The gene trees were estimated from the above-described alignments using the FastTree2 (version 2.1.11) command "FastTree -nt -gtr"
[1]: Mirarab, S., & Warnow, T. (2015). ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics, 31(12), i44-i52.
[2]: Vachaspati, P., & Warnow, T. (2015). ASTRID: accurate species trees from internode distances. BMC genomics, 16(10), 1-13.
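To make the "half-length" alignments and the newline-separated tree file concrete, here is a minimal Python sketch. It is an illustration only, not the authors' procedure: the description does not say which half of each alignment was kept or how odd lengths were handled, so keeping the first half and rounding down are assumptions, and both function names are hypothetical.

```python
def halve_alignment(records: dict) -> dict:
    """Truncate every aligned sequence to half the alignment length.

    Assumption: the first half of the columns is kept and odd lengths round down.
    """
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences in an alignment must have the same length")
    half = lengths.pop() // 2
    return {name: seq[:half] for name, seq in records.items()}


def split_trees(text: str) -> list:
    """Return one Newick string per non-empty line, as in a newline-separated tree file."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Toy inputs, not taken from the dataset.
toy_alignment = {"taxonA": "ACGTAC", "taxonB": "A-GTTC"}
halved = halve_alignment(toy_alignment)
trees = split_trees("(taxonA,taxonB);\n\n(taxonB,taxonA);\n")
print(halved)
print(len(trees))
```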

keywords: simulated data; ASTRAL; alignments; gene trees

published: 2022-08-31

Chen, Wenxiang; Zhan, Xun; Yuan, Renliang; Pidaparthy, Saran; Yong, Adrian Xiao Bin; An, Hyosung; Tang, Zhichu; Yin, Kaijun; Patra, Arghya; Jeong, Heonjae; Zhang, Cheng; Ta, Kim; Riedel, Zachary; Stephens, Ryan; Shoemaker, Daniel; Yang, Hong; Gewirth, Andrew; Braun, Paul; Ertekin, Elif; Zuo, Jian-Min; Chen, Qian (2022): Data for Formation and impact of nanoscopic oriented phase domains in electrochemical crystalline electrodes. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4717991_V1

These datasets are for the four-dimensional scanning transmission electron microscopy (4D-STEM) and electron energy loss spectroscopy (EELS) experiments for cathode nanoparticles at different cutoff voltages and in different electrolytes. The raw 4D-STEM experiment datasets were collected by TEM image & analysis software (FEI) and were saved as SER files. The raw 4D-STEM datasets of SER files can be opened and viewed in MATLAB using our analysis software package, imToolBox, available at https://github.com/flysteven/imToolBox. The raw EELS datasets were collected by DigitalMicrograph software and were saved as DM4 files. The raw EELS datasets can be opened and viewed in DigitalMicrograph software or using our analysis codes available at https://github.com/chenlabUIUC/OrientedPhaseDomain. All the datasets are from the work "Formation and impact of nanoscopic oriented phase domains in electrochemical crystalline electrodes" (2022).
The 4D-STEM experiment data include four example datasets for cathode nanoparticles collected at different cutoff voltages and in different electrolytes as described below. Each dataset contains a stack of diffraction patterns collected at different probe positions scanned across the cathode nanoparticle.
1. Pristine cathode particle: "Pristine particle 4D-STEM.ser"
2. Cathode particle at the cutoff voltage of 0.09V during discharge at C/10 in the aqueous electrolyte: "Intermediate cutoff0_09V discharge (aqueous) 4D-STEM.ser"
3. Fully discharged cathode particle at C/10 in the aqueous electrolyte: "Fully discharged particle 4D-STEM.ser"
4. Fully discharged cathode particle at C/10 in the dry organic electrolyte: "Fully discharge particle (dry organic electrolyte).ser"
The EELS experiment data include three example datasets for cathode nanoparticles collected at different cutoff voltages during discharge in the aqueous electrolyte (in "EELS datasets.zip") as described below. Each EELS dataset contains the zero-loss and core-loss EELS spectra collected at different probe positions scanned across the cathode nanoparticle.
1. Pristine cathode particle: "Pristine particle EELS.zip"
2. Cathode particle at the cutoff voltage of 0.09V during discharge at C/10 in the aqueous electrolyte: "intermediate discharge (aqueous) EELS.zip"
3. Fully discharged cathode particle at C/10 in the aqueous electrolyte: "fully discharge (aqueous) EELS.zip"
The details of the software package and codes that can be used to analyze the 4D-STEM datasets and EELS datasets are available at: https://github.com/chenlabUIUC/OrientedPhaseDomain. Once our paper is formally published, we will update the relationship of these datasets with our paper.

keywords: 4D-STEM; microstructure; phase transformation; strain; cathode; nanoparticle; energy storage

published: 2021-03-06

Lim, Teck Yian; Markowitz, Spencer Abraham; Do, Minh (2021): RaDICaL: A Synchronized FMCW Radar, Depth, IMU and RGB Camera Data Dataset with Low-Level FMCW Radar Signals (ROS bag format). University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3289560_V1

This dataset consists of raw ADC readings from a 3 transmitter 4 receiver 77GHz FMCW radar, together with synchronized RGB camera and depth (active stereo) measurements. The data is grouped into 4 distinct radar configurations:
- "indoor" configuration with range <14m
- "30m" with range <38m
- "50m" with range <63m
- "high_res" with doppler resolution of 0.043m/s
# Related code https://github.com/moodoki/radical_sdk
# Hardware Project Page https://publish.illinois.edu/radicaldata

keywords: radar; FMCW; sensor-fusion; autonomous driving; dataset; RGB-D; object detection; odometry

published: 2021-02-26

Bauder, Javan M; Allen, Maximilian L. (2021): Translocated nuisance American black bear capture histories. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5471143_V1

These data were used in the survival and cause-specific mortality analyses of translocated nuisance American black bear in Wisconsin published in Animal Conservation (Bauder, J.M., N.M. Roberts, D. Ruid, B. Kohn, and M.L. Allen. Accepted. Lower survival of nuisance American black bears (Ursus americanus) is not due to translocation. Animal Conservation). Included are CSV files including each bear's capture history and associated covariates and meta-data for each CSV file. Also included is an example R script of how to conduct the analyses (this R script is also included as supporting information with the published paper).

keywords: black bear; survival; translocation; nuisance wildlife management

published: 2021-03-08

Mickalide, Harry (Avery); Kuehn, Seppe (2021): Data for: Higher-order interaction between species inhibits bacterial invasion of a phototroph-predator microbial community. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0946028_V2

These are abundance dynamics data and simulations for the paper "Higher-order interaction between species inhibits bacterial invasion of a phototroph-predator microbial community". In this V2, the data were converted for use in Python in addition to MATLAB, and more information on how to work with the data was added to the Readme.

keywords: Microbial community; Higher order interaction; Invasion; Algae; Bacteria; Ciliate

published: 2022-02-10

Sharma, Bijay P.; Zhang, Na; DoKyoung, Lee; Heaton, Emily; Delucia, Evan H.; Sacks, Erik J.; Kantola, Ilsa B.; Boersma, Nicholas N.; Long, Stephen P.; Voigt, Thomas B.; Khanna, Madhu (2022): Data for Responsiveness of Miscanthus and Switchgrass Yields to Stand Age and Nitrogen Fertilization: A Meta-regression Analysis. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3580461_V1

The compiled datasets include plot level observations of energy crops (miscanthus and switchgrass) from recent experimental field trials in the US including dry biomass yield, location, state, region, harvest year, growing season degree days (GDD), winter season heating degree days (HDD), growing season cumulative precipitation, annual nitrogen application rate, age of the plant when harvested, National Commodity Crop Productivity Index (NCCPI) values, and cultivar type (switchgrass) from various published and unpublished sources. The Stata code includes estimation procedures for four different specifications, i.e., Model A includes deterministic effect without interaction terms; Model B includes deterministic effect with interaction terms (N², age², N × age, GDD², precip², N × NCCPI); Model C includes deterministic effect with interaction terms, study, and location random effect; Model D includes deterministic effect with interaction terms, harvest year augmented study, and location random effect.
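As an illustration of the squared and interaction terms listed for Model B, the Python sketch below derives them from a single plot-level observation. It is only a sketch under stated assumptions: the function name, argument names, and example values are hypothetical, units are left unspecified, and the authors' Stata code remains the authoritative specification.

```python
def model_b_terms(n_rate: float, age: float, gdd: float, precip: float, nccpi: float) -> dict:
    """Return the squared and interaction terms named in the Model B description."""
    return {
        "N^2": n_rate ** 2,
        "age^2": age ** 2,
        "N x age": n_rate * age,
        "GDD^2": gdd ** 2,
        "precip^2": precip ** 2,
        "N x NCCPI": n_rate * nccpi,
    }


# Made-up observation, not taken from the dataset.
terms = model_b_terms(n_rate=100.0, age=3.0, gdd=2000.0, precip=500.0, nccpi=0.5)
for name, value in terms.items():
    print(f"{name}: {value}")
```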

keywords: Age; Miscanthus; Nitrogen; Switchgrass; Yield; Center for Advanced Bioenergy and Bioproducts Innovation

published: 2021-02-10

Stickley, Samuel; Fraterrigo, Jennifer (2021): Microclimatic Temperature and Vegetation Structure in Great Smoky Mountains National Park. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0897344_V1

This dataset consists of microclimatic temperature and vegetation structure maps at a 3-meter spatial resolution across the Great Smoky Mountains National Park. Included are raster models for sub-canopy, near-surface, minimum and maximum temperature averaged across the study period, season, and month during the growing season months of March through November from 2006-2010. Also available are the topographic and vegetation inputs developed for the microclimate models, including LiDAR-derived vegetation height, LiDAR-derived vegetation structure within four height strata, solar insolation, distance-to-stream, and topographic convergence index (TCI).

keywords: microclimate buffering; forest vegetation structure; temperature; Appalachian Mountains; climate downscaling; understory; LiDAR

published: 2021-03-17

Imker, Heidi J; Luong, Hoa; Mischo, William H; Schlembach, Mary C; Wiley, Chris (2021): Data for: An Examination of Data Reuse Practices within Highly Cited Articles of Faculty at a Research University. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2087785_V1

This dataset was developed as part of a study that assessed data reuse. First, through bibliometric analysis, corresponding authors of highly cited papers published in 2015 at the University of Illinois at Urbana-Champaign in nine STEM disciplines were identified and then surveyed to determine if data were generated for their article and their knowledge of reuse by other researchers. Second, the corresponding authors who cited those 2015 articles were identified and surveyed to ascertain whether they reused data from the original article and how that data was obtained. The project goal was to better understand data reuse in practice and to explore if research data from an initial publication was reused in subsequent publications.

keywords: data reuse; data sharing; data management; data services; Scopus API

published: 2023-05-02

Larsen, Ryan; Stanke, Kayla L.; Rund, Laurie; Leyshon, Brian; Louie, Allison; Steelman, Andrew (2023): Dataset for "Automated identification of piglet brain tissue from MRI images using Region-Based Convolutional Neural Networks". University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5784165_V1

This dataset includes structural MRI head scans of 32 piglets, at 28 days of age, scanned at the University of Illinois. The dataset also includes manually drawn brain masks of each of the piglets. The dataset also includes brain masks that were generated automatically using Region-Based Convolutional Neural Networks (Mask R-CNN), trained on the manually drawn brain masks.

keywords: Brain extraction; Machine learning; MRI; Piglet; neural networks

published: 2021-10-10

Detmer, Thomas (2021): Temperature, dissolved oxygen, and Secchi depth of Illinois Reservoirs. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1187851_V1

This data set describes temperature, dissolved oxygen, and Secchi depth in 1-m interval profiles at the deepest point of 10 Illinois reservoirs between the years 1995 and 2016.

keywords: Water temperature; dissolved oxygen; secchi depth; climate change

published: 2022-09-01

Di Giovanni, Alexander; Ward, Michael (2022): Data and code for investigating embryonic death in wild bird eggs. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5498544_V1

These data and code are associated with a study on differences in the rate of hatching failure of eggs across 14 free-living grassland and shrubland birds. We used a device to measure the embryonic heart rate of eggs and found there was variation across species related to factors such as nest type and nest safety. This work is to be published in Ornithology.

keywords: embryonic death; grassland birds; egg mortality; heart rate

published: 2021-08-12

Ferguson, John; Fernandes, Samuel; Monier, Brandon; Miller, Nathan; Allen, Dylan; Dmitrieva, Anna; Schmuker, Peter; Lozano, Roberto; Valluru, Ravi; Buckler, Edward; Gore, Michael; Brown, Patrick; Spalding, Edgar; Leakey, Andrew (2021): Machine learning enabled phenotyping for GWAS and TWAS of WUE traits in 869 field-grown sorghum accessions. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5565022_V2

This dataset contains the images of a photoperiod-sensitive sorghum accession population used for a GWAS/TWAS study of leaf traits related to water use efficiency in 2016 and 2017.
*Note: new in this second version is that JPG images outputted from the nms files were added.
Accessions_2016.zip and Accessions_2017.zip: contain raw images produced by Optical Topometer (nms files) for all sorghum accessions. Images can be opened with Nanofocus μsurf analysis extended software (Oberhausen, Germany).
Accessions_2016_jpg.zip and Accessions_2017_jpg.zip: contain jpg images outputted from the nms files and used in the machine learning phenotyping.

keywords: stomata; segmentation; water use efficiency

published: 2021-05-14

Cattai de Godoy, Maria (2021): Miscanthus grass as a novel functional fiber source in extruded feline diets. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-3595148_V1

- The aim of this research was to evaluate the novel dietary fiber source, miscanthus grass, in comparison to traditional fiber sources, and their effects on the microbiota of healthy adult cats. Four dietary treatments, cellulose (CO), miscanthus grass fiber (MF), a blend of miscanthus fiber and tomato pomace (MF+TP), or beet pulp (BP) were evaluated.
- The study was conducted using a completely randomized design with twenty-eight neutered adult, domesticated shorthair cats (19 females and 9 males, mean age 2.2 ± 0.03 yr; mean body weight 4.6 ± 0.7 kg, mean body condition score 5.6 ± 0.6). Total DNA from fresh fecal samples was extracted using Mo-Bio PowerSoil kits (MO BIO Laboratories, Inc., Carlsbad, CA). Amplification of the 292 bp-fragment of V4 region from the 16S rRNA gene was completed using a Fluidigm Access Array (Fluidigm Corporation, South San Francisco, CA). Paired-end Illumina sequencing was performed on a MiSeq using v3 reagents (Illumina Inc., San Diego, CA) at the Roy J. Carver Biotechnology Center at the University of Illinois.
- Filenames are composed of animal name identifier, diet (BP= beet pulp; CO= cellulose; MF= miscanthus grass fiber; TP= blend of miscanthus fiber and tomato pomace).

keywords: cats; dietary fiber; fecal microbiota; miscanthus grass; nutrient digestibility; postbiotics

published: 2021-05-07

Torvik, Vetle (2021): ORCIDs mapped to PubMed authors. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9246015_V1

The dataset is based on a snapshot of PubMed taken in December 2018 (NLM's baseline 2018 plus updates throughout 2018), and for ORCIDs, primarily, the 2019 ORCID Public Data File https://orcid.org/. Matching an ORCID to an individual author name on a PMID is a non-trivial process. Anyone can create an ORCID and claim to have contributed to any published work. Many records claim too many articles and most claim too few. Even though ORCID records are (most?) often populated by author name searches in popular bibliographic databases, there is no confirmation that the person's name is listed on the article. This dataset is the product of mapping ORCIDs to individual author names on PMIDs, even when the ORCID name does not match any author name on the PMID, and when there are multiple (good) candidate author names. The algorithm avoids assigning the ORCID to an article when there are no good candidates and when there are multiple equally good matches. For some ORCIDs that clearly claim too much, it triggers a very strict matching procedure (for ORCIDs that claim too much but the majority appear correct, e.g., 0000-0002-2788-5457), and sometimes deletes ORCIDs altogether when all (or nearly all) of its claimed PMIDs appear incorrect. When an individual clearly has multiple ORCIDs it deletes the least complete of them (e.g., 0000-0002-1651-2428 vs 0000-0001-6258-4628). It should be noted that the ORCIDs that claim too much are not necessarily due to nefarious or trolling intentions, even though a few appear so. Certainly many are due to laziness, such as claiming everything with a particular last name.
Some cases appear to be due to test engineers (e.g., 0000-0001-7243-8157; 0000-0002-1595-6203), or librarians assisting faculty (e.g., ; 0000-0003-3289-5681), or group/laboratory IDs (0000-0003-4234-1746), or having contributed to an article in capacities other than authorship such as an Investigator, an Editor, or part of a Collective (e.g., 0000-0003-2125-4256 as part of the FlyBase Consortium on PMID 22127867), or as a "Reply To" in which case the identity of the article and authors might be conflated. The NLM has, in the past, limited the total number of authors indexed too. The dataset certainly has errors but I have taken great care to fix some glaring ones (individuals who claim too much), while still capturing authors who have published under multiple names and not explicitly listed them in their ORCID profile. The final dataset provides a "matchscore" that could be used for further clean-up.
Four files:
person.tsv: 7,194,692 rows, including header
1. orcid 2. lastname 3. firstname 4. creditname 5. othernames 6. otherids 7. emails
employment.tsv: 2,884,981 rows, including header
1. orcid 2. putcode 3. role 4. start-date 5. end-date 6. id 7. source 8. dept 9. name 10. city 11. region 12. country 13. affiliation
education.tsv: 3,202,253 rows, including header
1. orcid 2. putcode 3. role 4. start-date 5. end-date 6. id 7. source 8. dept 9. name 10. city 11. region 12. country 13. affiliation
pubmed2orcid.tsv: 13,133,065 rows, including header
1. PMID 2. au_order (author name position on the article) 3. orcid 4. matchscore (see below) 5. source: orcid (2019 ORCID Public Data File https://orcid.org/), pubmed (NLM's distributed XML files), or patci (an earlier version of ORCID with citations processed through the Patci tool)
12,037,375 from orcid; 1,065,892 from PubMed XML; 29,797 from Patci
matchscore:
000: lastname, firstname and middle init match (e.g., Eric T MacKenzie vs
00: lastname, firstname match (e.g., Keith Ward)
0: lastname, firstname reversed match (e.g., Conde Santiago vs Santiago Conde)
1: lastname, first and middle init match (e.g., L. F. Panchenko)
11: lastname and partial firstname match (e.g., Mike Boland vs Michael Boland or Mel Ziman vs Melanie Ziman)
12: lastname and first init match
15: 3 part lastname and firstname match (e.g., David Grahame Hardie vs D Grahame Hardie)
2: lastname match and multipart firstname initial match (e.g., Maria Dolores Suarez Ortega vs M. D. Suarez)
22: partial lastname match and firstname match (e.g., Erika Friedmann vs Erika Friedman)
23: e.g., Antonio Garcia Garcia vs A G Garcia
25: e.g., Allan Downie vs J A Downie
26: e.g., Oliver Racz vs Oliver Bacz
27: e.g., Rita Ostrovskaya vs R U Ostrovskaia
29: e.g., Andrew Staehelin vs L A Staehlin
3: e.g., M Tronko vs N D Tron'ko
4: e.g., Sharon Dent (Also known as Sharon Y.R. Dent; Sharon Y Roth; Sharon Yoder) vs Sharon Yoder
45: e.g., Okulov Aleksei vs A B Okulov
48: e.g., Maria Del Rosario Garcia De Vicuna Pinedo vs R Garcia-Vicuna
49: e.g., Anatoliy Ivashchenko vs A Ivashenko
5: lastname match only (weak match but sometimes captures alternative first name for better subsequent matches); e.g., Bill Hieb vs W F Hieb
6: first name match only (weak match but sometimes captures alternative first name for better subsequent matches); e.g., Maria Borawska vs Maria Koscielak
7: last or first name match on "other names"; e.g., Hromokovska Tetiana (Also known as Gromokovskaia, T. S., Громоковська Тетяна) vs T Gromokovskaia
77: e.g., Siva Subramanian vs Kolinjavadi N. Sivasubramanian
88: no name in orcid but match caught by uniqueness of name across paper (at least 90% and 2 more than next most common name)
prefix:
C = ambiguity reduced (possibly eliminated) using city match (e.g., H Yang on PMID 24972200)
I = ambiguity eliminated by excluding investigators (i.e., one author and one or more investigators with that name)
T = ambiguity eliminated using PubMed pos (T for tie-breaker)
W = ambiguity resolved by authority2018
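When working with the matchscore column, the codes are best kept as strings, because "000", "00", and "0" are distinct codes that collapse to the same value if parsed as integers. The Python sketch below illustrates this and also checks that the three per-source counts sum to the data rows of pubmed2orcid.tsv. It assumes that a prefix letter, when present, is written directly in front of the numeric code (for example "C12"); the description does not spell out that encoding, and the function name is hypothetical.

```python
# Prefix meanings as listed in the description.
PREFIX_MEANINGS = {
    "C": "ambiguity reduced using city match",
    "I": "ambiguity eliminated by excluding investigators",
    "T": "ambiguity eliminated using PubMed position (tie-breaker)",
    "W": "ambiguity resolved by authority2018",
}


def split_matchscore(value: str) -> tuple:
    """Split a matchscore into (prefix letters, numeric code kept as a string).

    Assumption: prefix letters, if any, directly precede the numeric code.
    """
    value = value.strip()
    i = 0
    while i < len(value) and value[i] in PREFIX_MEANINGS:
        i += 1
    prefix, code = value[:i], value[i:]
    if not code.isdigit():
        raise ValueError(f"unrecognized matchscore: {value!r}")
    return prefix, code


# Leading zeros matter: these three codes are different match types.
codes = [split_matchscore(v)[1] for v in ("000", "00", "0")]

# Per-source row counts quoted in the description.
rows_by_source = {"orcid": 12_037_375, "pubmed": 1_065_892, "patci": 29_797}
data_rows = sum(rows_by_source.values())

print(split_matchscore("C12"))
print(codes)
print(data_rows)
```

The per-source counts sum to 13,133,064, which is the stated 13,133,065 rows minus the header line.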

published: 2021-06-28

Shen, Chengze; Zaharias, Paul; Warnow, Tandy (2021): MAGUS+eHMMs: Improved Multiple Sequence Alignment Accuracy for Fragmentary Sequences. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2419626_V1

This dataset contains 1) the cleaned version of 11 CRW datasets, 2) RNASim10k dataset in high fragmentation and 3) three CRW datasets (16S.3, 16S.T, 16S.B.ALL) in high fragmentation.

keywords: MAGUS; UPP; Multiple Sequence Alignment; PASTA; eHMMs

published: 2021-08-15

Felix, Hanau; Hannes, Rost; Ochoa, Idoia (2021): mspack-data. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1396774_V2

This data set contains mass spectrometry data used for the publication "mspack: efficient lossless and lossy mass spectrometry data compression".

keywords: mass-spectrometry data; compression; proteomics