published: 2018-04-26
GBS data from soybean lines carrying introgressions from Glycine tomentella. This project is led by Dr. Randy Nelson, USDA scientist at the University of Illinois. Fastq files contain raw Illumina data. Txt files are keyfiles containing barcodes for each genetic entity.
published: 2018-05-01
GBS data for G. max x G. soja crosses, a project led by Dr. Randy Nelson.
published: 2018-05-06
This deposit contains all raw data and analysis from the paper "In-cell titration of small solutes controls protein stability and aggregation". Data are grouped into several types:
1) analysis*.tar.gz contain the analysis scripts and the resulting data for each cell. The numbers correspond to the numbers shown in Fig. S1 of the publication.
2) scripts.tar.gz contains helper scripts to create the dataset in bash format.
3) input.tar.gz contains headers and other information that is fed into bash scripts to create the dataset.
4) All rawData*.tar.gz are tarballs of the data of cells in different solutes, stored as .mat files readable by MATLAB, as follows:
- Each experiment included in the publication is represented by two MATLAB files: (1) a calibration jump under amber illumination (_calib.mat suffix) and (2) a full jump under blue illumination (FRET data).
- Each file contains the following fields:
  coordleft - coordinates of the cropped and aligned acceptor channel on the original image
  coordright - coordinates of the cropped and aligned donor channel on the original image
  dataleft - a 3D 12-bit integer matrix containing acceptor channel fluorescence for each pixel and time step. Not available in _calib files
  dataright - a 3D 12-bit integer matrix containing donor channel fluorescence for each pixel and time step. This will be mCherry in _calib files and AcGFP in data files.
  frame1 - original image size
  imgstd - cropped dimensions
  numFrames - number of frames in dataleft and dataright
  videos - a structure containing camera data. Specifically, videos.TimeStamp includes the time of each frame.
keywords: Live cell; FRET microscopy; osmotic challenge; intracellular titrations; protein dynamics
published: 2018-04-23
Self-citation analysis data based on PubMed Central subset (2002-2005)
----------------------------------------------------------------------
Created by Shubhanshu Mishra, Brent D. Fegley, Jana Diesner, and Vetle Torvik on April 5th, 2018

## Introduction
This is a dataset created as part of the publication titled: Mishra S, Fegley BD, Diesner J, Torvik VI (2018) Self-Citation is the Hallmark of Productive Authors, of Any Gender. PLOS ONE. It contains files for running the self-citation analysis on articles published in PubMed Central between 2002 and 2005, collected in 2015. The dataset is distributed in the form of the following tab-separated text files:
* Training_data_2002_2005_pmc_pair_First.txt (1.2G) - Data for first authors
* Training_data_2002_2005_pmc_pair_Last.txt (1.2G) - Data for last authors
* Training_data_2002_2005_pmc_pair_Middle_2nd.txt (964M) - Data for middle 2nd authors
* Training_data_2002_2005_pmc_pair_txt.header.txt - Header for the data
* COLUMNS_DESC.txt file - Descriptions of all columns
* model_text_files.tar.gz - Text files containing model coefficients and scores for model selection.
* results_all_model.tar.gz - Model coefficient and result files in numpy format used for plotting purposes. v4.reviewer contains models for analysis done after reviewer comments.
* README.txt file

## Dataset creation
Our experiments relied on data from multiple sources, including proprietary data from <a href="https://clarivate.com/products/web-of-science/databases/">Thomson Reuters' (now Clarivate Analytics) Web of Science collection of MEDLINE citations</a>. Authors interested in reproducing our experiments should personally request this data from Clarivate Analytics. However, we do make available a similar but open dataset based on citations from PubMed Central, which can be used to get results similar to those reported in our analysis.
Furthermore, we have also freely shared our datasets, which can be used along with the citation datasets from Clarivate Analytics to re-create the dataset used in our experiments. These datasets are listed below. If you wish to use any of those datasets, please make sure you cite both the dataset and the paper introducing the dataset.
* MEDLINE 2015 baseline: <a href="https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html">https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html</a>
* Citation data from PubMed Central (original paper includes additional citations from Web of Science)
* Author-ity 2009 dataset:
  - Dataset citation: <a href="https://doi.org/10.13012/B2IDB-4222651_V1">Torvik, Vetle I.; Smalheiser, Neil R. (2018): Author-ity 2009 - PubMed author name disambiguated dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4222651_V1</a>
  - Paper citation: <a href="https://doi.org/10.1145/1552303.1552304">Torvik, V. I., & Smalheiser, N. R. (2009). Author name disambiguation in MEDLINE. ACM Transactions on Knowledge Discovery from Data, 3(3), 1–29. https://doi.org/10.1145/1552303.1552304</a>
  - Paper citation: <a href="https://doi.org/10.1002/asi.20105">Torvik, V. I., Weeber, M., Swanson, D. R., & Smalheiser, N. R. (2004). A probabilistic similarity metric for Medline records: A model for author name disambiguation. Journal of the American Society for Information Science and Technology, 56(2), 140–158. https://doi.org/10.1002/asi.20105</a>
* Genni 2.0 + Ethnea for identifying author gender and ethnicity:
  - Dataset citation: <a href="https://doi.org/10.13012/B2IDB-9087546_V1">Torvik, Vetle (2018): Genni + Ethnea for the Author-ity 2009 dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9087546_V1</a>
  - Paper citation: <a href="https://doi.org/10.1145/2467696.2467720">Smith, B. N., Singh, M., & Torvik, V. I. (2013). A search engine approach to estimating temporal changes in gender orientation of first names. In Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries - JCDL ’13. ACM Press. https://doi.org/10.1145/2467696.2467720</a>
  - Paper citation: <a href="http://hdl.handle.net/2142/88927">Torvik VI, Agarwal S. Ethnea -- an instance-based ethnicity classifier based on geo-coded author names in a large-scale bibliographic database. International Symposium on Science of Science March 22-23, 2016 - Library of Congress, Washington DC, USA. http://hdl.handle.net/2142/88927</a>
* MapAffil for identifying article country of affiliation:
  - Dataset citation: <a href="https://doi.org/10.13012/B2IDB-4354331_V1">Torvik, Vetle I. (2018): MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4354331_V1</a>
  - Paper citation: <a href="http://doi.org/10.1045/november2015-torvik">Torvik VI. MapAffil: A Bibliographic Tool for Mapping Author Affiliation Strings to Cities and Their Geocodes Worldwide. D-Lib magazine : the magazine of the Digital Library Forum. 2015;21(11-12):10.1045/november2015-torvik</a>
* IMPLICIT journal similarity:
  - Dataset citation: <a href="https://doi.org/10.13012/B2IDB-4742014_V1">Torvik, Vetle (2018): Author-implicit journal, MeSH, title-word, and affiliation-word pairs based on Author-ity 2009. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4742014_V1</a>
* Novelty dataset for identifying article-level novelty:
  - Dataset citation: <a href="https://doi.org/10.13012/B2IDB-5060298_V1">Mishra, Shubhanshu; Torvik, Vetle I. (2018): Conceptual novelty scores for PubMed articles. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5060298_V1</a>
  - Paper citation: <a href="https://doi.org/10.1045/september2016-mishra">Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : The Magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra</a>
  - Code: <a href="https://github.com/napsternxg/Novelty">https://github.com/napsternxg/Novelty</a>
* Expertise dataset for identifying author expertise on articles:
* Source code provided at: <a href="https://github.com/napsternxg/PubMed_SelfCitationAnalysis">https://github.com/napsternxg/PubMed_SelfCitationAnalysis</a>

**Note: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016.**
Check <a href="https://www.nlm.nih.gov/databases/download/pubmed_medline.html">here</a> for information on obtaining PubMed/MEDLINE, and for NLM's data Terms and Conditions.
Additional data-related updates can be found at <a href="http://abel.ischool.illinois.edu">Torvik Research Group</a>.

## Acknowledgments
This work was made possible in part with funding to VIT from <a href="https://projectreporter.nih.gov/project_info_description.cfm?aid=8475017&icde=18058490">NIH grant P01AG039347</a> and <a href="http://www.nsf.gov/awardsearch/showAward?AWD_ID=1348742">NSF grant 1348742</a>. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

## License
Self-citation analysis data based on PubMed Central subset (2002-2005) by Shubhanshu Mishra, Brent D. Fegley, Jana Diesner, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License. Permissions beyond the scope of this license may be available at <a href="https://github.com/napsternxg/PubMed_SelfCitationAnalysis">https://github.com/napsternxg/PubMed_SelfCitationAnalysis</a>.
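
Because the training files ship with their column names in a separate header file, a reader has to join the two before the columns can be used by name. The following is a minimal Python sketch of one way to do that. It assumes the header file holds a single tab-separated line of column names, which the description does not confirm, and the column names and values in the sample are hypothetical stand-ins (the real ones are documented in COLUMNS_DESC.txt).

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def rows_with_external_header(header: TextIO, data: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, keyed by the names from the header file."""
    # Assumption: the header file is one tab-separated line of column names.
    fieldnames = header.readline().rstrip("\r\n").split("\t")
    reader = csv.DictReader(
        data, fieldnames=fieldnames, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    # Stream rows so that gigabyte-sized files never have to fit in memory.
    yield from reader


# Hypothetical stand-ins for the header file and one of the data files.
header_file = io.StringIO("source_pmid\tcited_pmid\tis_self_cite\n")
data_file = io.StringIO("111\t222\t1\n111\t333\t0\n")
rows = list(rows_with_external_header(header_file, data_file))
```

With the real files, the two `io.StringIO` objects would be replaced by file handles opened with `newline=""`, and the rows would be consumed one at a time rather than collected into a list.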
keywords: Self citation; PubMed Central; Data Analysis; Citation Data
published: 2018-04-23
Contains a series of datasets that score pairs of tokens (words, journal names, and controlled vocabulary terms) based on how often they co-occur within versus across authors' collections of papers. The tokens derive from four different fields of PubMed papers: journal, affiliation, title, MeSH (medical subject headings). Thus, there are 10 different datasets, one for each pair of token types: affiliation-word vs affiliation-word, affiliation-word vs journal, affiliation-word vs mesh, affiliation-word vs title-word, mesh vs mesh, mesh vs journal, etc. Using authors to link papers, and in turn pairs of tokens, is an alternative to the usual within-document co-occurrences and to using, e.g., citations to link papers. This is particularly striking for journal pairs because a paper almost always appears in a single journal, so within-document co-occurrences are 0, i.e., useless. The tokens are taken from the Author-ity 2009 dataset, which has a cluster of papers for each inferred author and a summary of each field. For MeSH, title-words, and affiliation-words, that summary includes only the top-20 most frequent tokens after field-specific stoplisting (e.g., university is stoplisted from affiliation and Humans is stoplisted from MeSH).
The score for a pair of tokens A and B is defined as follows. Suppose Ai and Bi are the numbers of occurrences of token A and token B, respectively, across the i-th author's papers. Then:
nA = sum(Ai); nB = sum(Bi)
nAB = sum(Ai*Bi) if A not equal B; nAA = sum(Ai*(Ai-1)/2) otherwise
nAnB = nA*nB if A not equal B; nAnA = nA*(nA-1)/2 otherwise
score = 1000000*nAB/nAnB if A not equal B; 1000000*nAA/nAnA otherwise
Token pairs are excluded when: score < 5, or nA < cut-off, or nB < cut-off, or nAB < cut-offAB. The cut-offs differ for token types and can be inferred from the datasets. For example, cut-off = 200 and cut-offAB = 20 for journal pairs.
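
The score definition above can be illustrated with a short Python sketch. It reads nB as the sum over Bi (the counts of token B), and the per-author counts in the example are hypothetical, not taken from the datasets. The function name and signature are illustrative only.

```python
from typing import Sequence


def pair_score(a_counts: Sequence[int], b_counts: Sequence[int],
               same_token: bool = False) -> float:
    """Co-occurrence score, in parts per million, for tokens A and B.

    a_counts[i] and b_counts[i] are the occurrences of A and B across the
    i-th author's papers. With same_token=True, b_counts is ignored.
    """
    n_a = sum(a_counts)
    if same_token:
        n_co = sum(a * (a - 1) // 2 for a in a_counts)  # nAA
        n_pairs = n_a * (n_a - 1) // 2                  # nAnA
    else:
        n_b = sum(b_counts)
        n_co = sum(a * b for a, b in zip(a_counts, b_counts))  # nAB
        n_pairs = n_a * n_b                                     # nAnB
    if n_pairs == 0:
        return 0.0  # no possible pairs, so no score (a choice made for this sketch)
    return 1_000_000 * n_co / n_pairs


# Hypothetical counts of tokens A and B for three authors.
a = [2, 0, 1]
b = [1, 3, 0]
score_ab = pair_score(a, b)                    # nAB = 2, nAnB = 3 * 4 = 12
score_aa = pair_score(a, a, same_token=True)   # nAA = 1, nAnA = 3
```

In this toy example only the first author uses both tokens, so 2 of the 12 possible A-B pairings fall within a single author's papers.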
Each dataset has the following 7 tab-delimited all-ASCII columns:
1: score: roughly the number of co-occurrences of the two tokens divided by the total number of pairs, in parts per million (ppm), ranging from 5 to 1,000,000
2: nAB: total number of co-occurrences
3: nAnB: total number of pairs
4: nA: number of occurrences of token A
5: nB: number of occurrences of token B
6: A: token A
7: B: token B
We made some of these datasets as early as 2011, when we were working to link PubMed authors with USPTO inventors, where the vocabulary usage is strikingly different, and also more recently to create links from PubMed authors to their dissertations and NIH/NSF investigators, and to help disambiguate PubMed authors. Going beyond explicit matching (exact within-field match) is particularly useful when data is sparse (think old papers lacking controlled vocabulary and affiliations, or papers with metadata written in different languages) and when making links across databases with different kinds of fields and vocabulary (think PubMed vs USPTO records). We never published a paper on this, but our work inspired the more refined measures described in:
<a href="https://doi.org/10.1371/journal.pone.0115681">D′Souza JL, Smalheiser NR (2014) Three Journal Similarity Metrics and Their Application to Biomedical Journals. PLOS ONE 9(12): e115681. https://doi.org/10.1371/journal.pone.0115681</a>
<a href="http://dx.doi.org/10.5210/disco.v7i0.6654">Smalheiser, N., & Bonifield, G. (2016). Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation. DISCO: Journal of Biomedical Discovery and Collaboration, 7. doi:http://dx.doi.org/10.5210/disco.v7i0.6654</a>
keywords: PubMed; MeSH; token; name disambiguation
published: 2018-04-23
Provides links to Author-ity 2009, including records from principal investigators (on NIH and NSF grants), inventors on USPTO patents, and students/advisors on ProQuest dissertations. Note that NIH and NSF differ in the type of fields they record and standards used (e.g., institution names). Typically an NSF grant spanning multiple years is associated with one record, while an NIH grant occurs in multiple records, for each fiscal year, sub-projects/supplements, possibly with different principal investigators. The prior probability of match (i.e., that the author exists in Author-ity 2009) varies dramatically across NIH grants, NSF grants, and USPTO patents. The great majority of NIH principal investigators have one or more papers in PubMed but a minority of NSF principal investigators (except in biology) have papers in PubMed, and even fewer USPTO inventors do. This prior probability has been built into the calculation of match probabilities.
The NIH data were downloaded from NIH exporter and the older NIH CRISP files. The dataset has 2,353,387 records, only includes ones with match probability > 0.5, and has the following 12 fields:
1 app_id
2 nih_full_proj_nbr
3 nih_subproj_nbr
4 fiscal_year
5 pi_position
6 nih_pi_names
7 org_name
8 org_city_name
9 org_bodypolitic_code
10 age: number of years since their first paper
11 prob: the match probability to au_id
12 au_id: Author-ity 2009 author ID
The NSF dataset has 262,452 records, only includes ones with match probability > 0.5, and has the following 10 fields:
1 AwardId
2 fiscal_year
3 pi_position
4 PrincipalInvestigators
5 Institution
6 InstitutionCity
7 InstitutionState
8 age: number of years since their first paper
9 prob: the match probability to au_id
10 au_id: Author-ity 2009 author ID
There are two files for USPTO because here we linked disambiguated authors in PubMed (from Author-ity 2009) with disambiguated inventors.
The USPTO linking dataset has 309,720 records, only includes ones with match probability > 0.5, and has the following 3 fields:
1 au_id: Author-ity 2009 author ID
2 inv_id: USPTO inventor ID
3 prob: the match probability of au_id vs inv_id
The disambiguated inventors file (uiuc_uspto.tsv) has 2,736,306 records and has the following 7 fields:
1 inv_id: USPTO inventor ID
2 is_lower
3 is_upper
4 fullnames
5 patents: patent IDs separated by '|'
6 first_app_yr
7 last_app_yr
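
As an illustration of how the two USPTO files fit together, the Python sketch below reads rows in the 3-field and 7-field layouts listed above, keeps links above a stricter probability threshold, and splits the '|'-separated patent IDs. All IDs and values in the sample rows are invented, the absence of a header row is an assumption, and the helper names are hypothetical.

```python
import csv
import io
from typing import Dict, List

LINK_FIELDS = ["au_id", "inv_id", "prob"]
INVENTOR_FIELDS = ["inv_id", "is_lower", "is_upper", "fullnames",
                   "patents", "first_app_yr", "last_app_yr"]

# Invented sample rows; assumption: the real files have no header row.
LINKS_TSV = "au_1\tinv_9\t0.97\nau_2\tinv_9\t0.61\nau_3\tinv_4\t0.88\n"
INVENTORS_TSV = "inv_9\tx\tX\tJane Doe\tP100|P200|P300\t1999\t2004\n"


def read_tsv(text: str, fieldnames: List[str]) -> List[Dict[str, str]]:
    """Parse tab-delimited text into dicts keyed by the given field names."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=fieldnames,
                            delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def confident_links(rows: List[Dict[str, str]], threshold: float) -> List[Dict[str, str]]:
    """Keep author-inventor links whose match probability meets the threshold."""
    return [row for row in rows if float(row["prob"]) >= threshold]


def patents_by_inventor(rows: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each inventor ID to its list of patent IDs."""
    return {row["inv_id"]: row["patents"].split("|") for row in rows}


links = confident_links(read_tsv(LINKS_TSV, LINK_FIELDS), threshold=0.8)
patents = patents_by_inventor(read_tsv(INVENTORS_TSV, INVENTOR_FIELDS))
```

Since the published files already exclude matches at or below 0.5, a threshold like the 0.8 used here only makes sense for analyses that need higher precision.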
keywords: PubMed; USPTO; Principal investigator; Name disambiguation
published: 2018-04-19
MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. Prepared by Vetle Torvik 2018-04-05
The dataset comes as a single tab-delimited Latin-1 encoded file (only the City column uses non-ASCII characters), and should be about 3.5GB uncompressed.
• How was the dataset created? The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016. For information on obtaining PubMed/MEDLINE, see NLM's data <a href="https://www.nlm.nih.gov/databases/download/pubmed_medline.html">Terms and Conditions</a>.
• Affiliations are linked to a particular author on a particular article. Prior to 2014, NLM recorded the affiliation of the first author only. However, MapAffil 2016 covers some PubMed records lacking affiliations that were harvested elsewhere, from PMC (e.g., PMID 22427989), NIH grants (e.g., 1838378), and Microsoft Academic Graph and ADS (e.g. 5833220).
• Affiliations are pre-processed (e.g., transliterated into ASCII from UTF-8 and html) so they may differ (sometimes a lot; see PMID 27487542) from PubMed records.
• All affiliation strings were processed using the MapAffil procedure, to identify and disambiguate the most specific place-name, as described in: <i>Torvik VI. MapAffil: A bibliographic tool for mapping author affiliation strings to cities and their geocodes worldwide. D-Lib Magazine 2015; 21 (11/12). 10p</i>
• Look for <a href="https://doi.org/10.1186/s41182-017-0073-6">Fig. 4</a> in the following article for coverage statistics over time: <i>Palmblad M, Torvik VI. Spatiotemporal analysis of tropical disease research combining Europe PMC and affiliation mapping web services. Tropical medicine and health. 2017 Dec;45(1):33.</i> Expect to see big upticks in coverage of PMIDs around 1988 and for non-first authors in 2014.
• The code and back-end data are periodically updated and made available for query by PMID at <a href="http://abel.ischool.illinois.edu/">Torvik Research Group</a>
• What is the format of the dataset? The dataset contains 37,406,692 rows. Each row (line) in the file has a unique PMID and author position (e.g., 10786286_3 is the third author name on PMID 10786286), and the following thirteen columns, tab-delimited. All columns are ASCII, except city, which contains Latin-1.
1. PMID: positive non-zero integer; int(10) unsigned
2. au_order: positive non-zero integer; smallint(4)
3. lastname: varchar(80)
4. firstname: varchar(80); NLM started including these in 2002 but many have been harvested from outside PubMed
5. year of publication
6. type: EDU, HOS, EDU-HOS, ORG, COM, GOV, MIL, UNK
7. city: varchar(200); typically 'city, state, country' but could include further subdivisions; unresolved ambiguities are concatenated by '|'
8. state: Australia, Canada and USA (which includes territories like PR, GU, AS, and post-codes like AE and AA)
9. country
10. journal
11. lat: at most 3 decimals (only available when city is not a country or state)
12. lon: at most 3 decimals (only available when city is not a country or state)
13. fips: varchar(5); for USA only, retrieved by lat-lon query to https://geo.fcc.gov/api/census/block/find
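
To make the row format concrete, here is a minimal Python sketch that decodes one Latin-1 encoded line into the thirteen columns and splits unresolved city ambiguities on '|'. The sample row is invented for illustration, the short column keys are this sketch's own labels, and treating an empty field as a missing lat/lon is an assumption about how absent values are stored.

```python
from dataclasses import dataclass
from typing import List, Optional

# Short keys for the thirteen columns, in file order (labels chosen for this sketch).
COLUMNS = ["PMID", "au_order", "lastname", "firstname", "year", "type", "city",
           "state", "country", "journal", "lat", "lon", "fips"]


@dataclass
class Affiliation:
    pmid: int
    au_order: int
    city_candidates: List[str]  # more than one entry when the place is ambiguous
    lat: Optional[float]
    lon: Optional[float]


def parse_row(raw: bytes) -> Affiliation:
    """Decode one Latin-1 line of the file and pick out a few typed fields."""
    fields = raw.decode("latin-1").rstrip("\r\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    return Affiliation(
        pmid=int(row["PMID"]),
        au_order=int(row["au_order"]),
        city_candidates=row["city"].split("|"),
        # Assumption: a missing coordinate is stored as an empty field.
        lat=float(row["lat"]) if row["lat"] else None,
        lon=float(row["lon"]) if row["lon"] else None,
    )


# Invented example row (byte 0xE9 is 'é' in Latin-1); fips is left empty.
SAMPLE = (b"10786286\t3\tDoe\tJane\t2000\tEDU\tQu\xe9bec, QC, Canada\tQC\tCanada"
          b"\tExample J\t46.813\t-71.208\t\n")
record = parse_row(SAMPLE)
```

Reading the real file in binary mode and decoding each line explicitly, as here, avoids depending on the platform's default text encoding.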
keywords: PubMed; MEDLINE; Digital Libraries; Bibliographic Databases; Author Affiliations; Geographic Indexing; Place Name Ambiguity; Geoparsing; Geocoding; Toponym Extraction; Toponym Resolution
published: 2018-03-01
Data were used to analyze patterns in predator-specific nest predation on shrubland birds in Illinois as related to landscape composition at multiple landscape scales. Data were used in a Journal of Applied Ecology research paper of the same name. Data were collected between 2011 and 2014 at sites in east-central and northeastern Illinois, USA as part of a Ph.D. research project on the relationship between avian nest predation and landscape characteristics, and how nest predation affects adult and nestling bird behavior.
keywords: nest predation; avian ecology; land cover; landscape composition; landscape scale; nest camera; nest survival; predator-specific mortality; scale-dependence; scrubland; shrub-nesting bird
published: 2017-12-01
This dataset contains all the numerical results (digital elevation models) that are presented in the paper "Landscape evolution models using the stream power incision model show unrealistic behavior when m/n equals 0.5." The paper can be found at: http://www.earth-surf-dynam-discuss.net/esurf-2017-15/ The paper has been accepted, but the most up to date version may not be available at the link above. If so, please contact Jeffrey Kwang at jeffskwang@gmail.com to obtain the most up to date manuscript.
keywords: landscape evolution models; digital elevation model
published: 2017-12-04
Data used for Zaya et al. (2018), published in Invasive Plant Science and Management DOI 10.1017/inp.2017.37, are made available here. There are three spreadsheet files (CSV) available, as well as a text file that has detailed descriptions for each file ("readme.txt"). One spreadsheet file ("prices.csv") gives pricing information, associated with Figure 3 in Zaya et al. (2018). The other two spreadsheet files are associated with the genetic analysis, where one file contains raw data for biallelic microsatellite loci ("genotypes.csv") and the other ("structureResults.csv") contains the results of Bayesian clustering analysis with the program STRUCTURE. The genetic data may be especially useful for future researchers. The genetic data contain the genotypes of the horticultural samples that were the focus of the published article, and also genotypes of nearly 400 wild plants. More information on the location of the wild plant collections can be found in the Supplemental information for Zaya et al. (2015) Biological Invasions 17:2975–2988 DOI 10.1007/s10530-015-0926-z. See "readme.txt" for more information.
keywords: Horticultural industry; invasive species; microsatellite DNA; mislabeling; molecular testing
published: 2017-12-15
These are the results of an 8-month cohort study in two commercial dairy herds in Northwest Illinois. From each herd, 50 cows were selected at random, stratified over lactations 1 to 3. Serum from these animals was collected every two months and tested for antibodies to Bovine Leukosis Virus, Neospora caninum, and Mycobacterium avium subsp. paratuberculosis. Animals that left the herd during the study were replaced by another animal in the same herd and lactation. At the last sampling, serum neutralization assays were performed for Bovine Herpesvirus type 1 and Bovine Viral Diarrhea virus types 1 and 2. Production data before and after sampling were collected for the entire herd from PCdart.
keywords: serostatus; dairy; production; cohort
published: 2017-12-18
This dataset accompanies a thesis of the same title: Can fair use be adequately taught to Librarians? Assessing Librarians' confidence and comprehension in explaining fair use following an expert workshop.
keywords: fair use; copyright
published: 2017-12-14
Objectives: This study follows up on previous work that began examining data deposited in an institutional repository. The work here extends the earlier study by answering the following lines of research questions: (1) What is the file composition of datasets ingested into the University of Illinois at Urbana-Champaign campus repository? Are datasets more likely to be single-file or multiple-file items? (2) What is the usage data associated with these datasets? Which items are most popular? Methods: The dataset records collected in this study were identified by filtering item types categorized as "data" or "dataset" using the advanced search function in IDEALS. Returned search results were collected in an Excel spreadsheet to include data such as the Handle identifier, date ingested, file formats, composition code, and the download count from the item's statistics report. The Handle identifier represents the dataset record's persistent identifier. Composition represents codes that categorize items as single- or multiple-file deposits. Date available represents the date the dataset record was published in the campus repository. Download statistics were collected via a website link for each dataset record and indicate the number of times the dataset record has been downloaded. Once the data was collected, it was used to evaluate datasets deposited into IDEALS. Results: A total of 522 datasets were identified for analysis covering the period between January 2007 and August 2016. This study revealed two influxes, occurring during 2008-2009 and in 2014. During the first time frame a large number of PDFs were deposited by the Illinois Department of Agriculture, whereas Microsoft Excel files were deposited in 2014 by the Rare Books and Manuscript Library. Single-file datasets clearly dominate the deposits in the campus repository.
The total download count for all datasets was 139,663, and downloads per month per file averaged 3.2 across all datasets. Conclusion: Academic librarians, repository managers, and research data services staff can use the results presented here to anticipate the nature of research data that may be deposited within institutional repositories. With increased awareness, content recruitment, and improvements, IRs can provide a viable cyberinfrastructure for researchers to deposit data, but much can be learned from the data already deposited. Awareness of trends can help librarians facilitate discussions with researchers about research data deposits as well as better tailor their services to address short-term and long-term research needs.
keywords: research data; research statistics; institutional repositories; academic libraries
published: 2017-12-20
The dataset contains processed model fields used to generate data, figures and tables in the Journal of Geophysical Research article "Investigating the linear dependence of direct and indirect radiative forcing on emission of carbonaceous aerosols in a global climate model." The processed data are monthly averaged cloud properties (CCN, CDNC and LWP) and forcing variables (DRF and IRF) at original CAM5 spatial resolution (1.9° by 2.5°). Raw model output fields from CAM5 simulations are available through NERSC upon request. Please find more detailed information in the ReadMe file.
keywords: carbonaceous aerosols; radiative forcing; emission; linearity
published: 2018-01-13
This dataset provides time series data (Aug. - Sep. 2016) of sun-induced chlorophyll fluorescence, photosynthesis, photosynthetically active radiation, and associated vegetation indices that were collected in a soybean field on the farm of the University of Illinois at Urbana-Champaign. The data contain 255 records and 6 variables (PPFD-IN: Photosynthetically active radiation; GPP: Gross Primary Production; SIF: Sun-Induced Fluorescence; NDVI: Normalized Difference Vegetation Index; Rededge: Rededge Index; Redege_NDVI: Rededge Normalized Difference Vegetation Index). Timestamps use standard time. Data are available from 8 am to 4 pm (corresponding to 9 am to 5 pm local time) every day.
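
Because the timestamps are in standard time while the stated local hours are one hour later, users comparing the records with local clock times need to shift them. The Python sketch below shows that shift. It assumes local time during the Aug. - Sep. 2016 window is daylight saving time, exactly one hour ahead of standard time, which matches the 8 am to 9 am correspondence given above. The function name and the example date are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: local (daylight saving) time is one hour ahead of standard time.
DAYLIGHT_SHIFT = timedelta(hours=1)


def standard_to_local(timestamp: datetime) -> datetime:
    """Convert a naive standard-time timestamp to local daylight time."""
    return timestamp + DAYLIGHT_SHIFT


# Hypothetical first and last records of one day, in standard time.
first_local = standard_to_local(datetime(2016, 8, 1, 8, 0))
last_local = standard_to_local(datetime(2016, 8, 1, 16, 0))
```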
keywords: sun-induced chlorophyll fluorescence; photosynthesis; soybean
published: 2018-02-22
Datasets used in the study, "OCTAL: Optimal Completion of Gene Trees in Polynomial Time," under review at Algorithms for Molecular Biology. Note: DS_STORE file in 25gen-10M folder can be disregarded.
keywords: phylogenomics; missing data; coalescent-based species tree estimation; gene trees
published: 2017-06-16
Table S1. Pollen types identified in the BCI and PNSL pollen rain data sets. Pollen types were identified to species when possible and assigned a life form based on descriptions provided in Croat, T.B. (1978). Taxa from BCI and PNSL were assigned a 1 if present in forest census data or a 0 if absent. The relative representation of each taxon has been provided for each extended record and by dry and wet season representation respectively. CA loadings are provided for axes 1 and 2 (Fig. 1).
keywords: pollen; identifications; abundance; data; BCI; PNSL; Panama
published: 2016-06-23
This dataset contains hourly traffic estimates (speeds) for individual links of the New York City road network for the years 2010-2013, estimated from New York City Taxis.
keywords: traffic estimates; traffic conditions; New York City
published: 2017-10-11
The International Registry of Reproductive Pathology Database is part of pioneering work done by Dr. Kenneth McEntee to comprehensively document thousands of disease case studies. His large and comprehensive collection of case reports and physical samples was complemented by development of the International Registry of Reproductive Pathology Database in the 1980s. The original FoxPro Database files and a migrated Access version were completed by the College of Veterinary Medicine in 2016. Access CSV files were completed by the University of Illinois Library in 2017.
keywords: Animal Pathology; Databases; Veterinary Medicine
published: 2017-11-15
Monthly water withdrawal records (total pumpage and per-capita consumption) for the City of Austin, Texas (2000-2014). Data were provided by Austin Water Utility.
keywords: Water use; Water conservation
published: 2017-10-10
This dataset contains ground motion data for Newmark Structural Engineering Laboratory (NSEL) Report Series 048, "Modification of ground motions for use in Central North America: Southern Illinois surface ground motions for structural analysis". The data are 20 individual ground motion time history records developed at each of the 10 sites (for a total of 200 ground motions). These accompanying ground motions are developed following the detailed procedure presented in Kozak et al. [2017].
keywords: earthquake engineering; ground motion records; southern Illinois seismic hazard; dynamic structural analysis; conditional mean spectrum
published: 2017-09-28
This is the dataset used in the Journal of Ecology publication of the same name. The file BH.veg.data.csv contains a site-by-species matrix of species relative abundances (percent cover across all sampling quadrats within a site). Data under the heading Year refer to sampling periods. Year 1 refers to the first set of samples taken between 1997 and 2000, Year 2 refers to the second set taken between 2002 and 2005, Year 3 refers to the third set taken between 2007 and 2010, and Year 4 refers to the fourth set taken between 2012 and 2015. All sites met Critical Trends Assessment Program (CTAP) size criteria of being at least 2 ha in size with a minimum of 500 m2 of suitable sampling area. The file BH.site.location.csv contains the Public Land Survey System ranges and townships in which specific sites were located. All sites were located within the U.S. state of Illinois. More information about this dataset: Interested parties can request data from the Critical Trends Assessment Program, which was the source for the data on the wetlands in this study. More information on the program and data requests can be obtained by visiting the program webpage. Critical Trends Assessment Program, Illinois Natural History Survey. http://wwx.inhs.illinois.edu/research/ctap/
keywords: biodiversity; biotic homogenization; invasive species; Phalaris arundinacea; plant population and community dynamics; similarity index; wetlands
published: 2017-09-26
This file contains the supplemental appendix for the article "Farmer Preferences for Agricultural Soil Carbon Sequestration Schemes" published in Applied Economic Perspectives and Policy (accepted 2017).
keywords: appendix; carbon sequestration; tillage; choice experiment