Subject Area

Life Sciences (365)
Social Sciences (136)
Physical Sciences (101)
Technology and Engineering (64)
Arts and Humanities (1)
Uncategorized (1)

Funder

Other (206)
U.S. National Science Foundation (NSF) (193)
U.S. Department of Energy (DOE) (68)
U.S. National Institutes of Health (NIH) (63)
U.S. Department of Agriculture (USDA) (44)
Illinois Department of Natural Resources (IDNR) (17)
U.S. Geological Survey (USGS) (7)
U.S. National Aeronautics and Space Administration (NASA) (6)
Illinois Department of Transportation (IDOT) (4)
U.S. Army (2)

Publication Year

2021 (108)
2022 (108)
2020 (96)
2023 (78)
2019 (72)
2024 (70)
2018 (61)
2017 (36)
2016 (30)
2025 (4)
2009 (1)
2011 (1)
2012 (1)
2014 (1)
2015 (1)

License

CC0 (367)
CC BY (281)
custom (20)

Datasets

published: 2023-03-16
 
Curated networks and clustering output from the manuscript: Well-Connected Communities in Real-World Networks https://arxiv.org/abs/2303.02813
keywords: Community detection; clustering; open citations; scientometrics; bibliometrics
published: 2024-06-17
 
Data includes carbon mineralization rates, potential denitrification rates, net nitrous oxide fluxes, and soil chemical properties from a laboratory incubation of soil samples collected from 20 locations across an Illinois maize field.
keywords: denitrification; nitrous oxide; dissolved organic carbon; maize
published: 2020-02-23
 
Citation context annotation for papers citing retracted paper Matsuyama 2005 (RETRACTED: Matsuyama W, Mitsuyama H, Watanabe M, Oonakahara KI, Higashimoto I, Osame M, Arimura K. Effects of omega-3 polyunsaturated fatty acids on inflammatory markers in COPD. Chest. 2005 Dec 1;128(6):3817-27.), retracted in 2008 (Retraction in: Chest (2008) 134:4 (893) <a href="https://doi.org/10.1016/S0012-3692(08)60339-6">https://doi.org/10.1016/S0012-3692(08)60339-6</a>). This is part of the supplemental data for Jodi Schneider, Di Ye, Alison Hill, and Ashley Whitehorn. "Continued Citation of a Fraudulent Clinical Trial Report, Eleven Years after it was retracted for Falsifying Data" [R&R under review with Scientometrics]. Overall, we found 148 citations to the retracted paper from 2006 to 2019. However, this dataset does not include the annotations described in the 2015 article: Ashley Fulton, Alison Coates, Marie Williams, Peter Howe, and Alison Hill. "Persistent citation of the only published randomized controlled trial of omega-3 supplementation in chronic obstructive pulmonary disease six years after its retraction." Publications 3, no. 1 (2015): 17-26. In this dataset 70 new and newly found citations are listed: 66 annotated citations and 4 pending citations (not annotated since we don't have the full text). "New citations" refer to articles published from March 25, 2014 to 2019, found in Google Scholar and Web of Science. "Newly found citations" refer to articles published 2006-2013, found in Google Scholar and Web of Science, but not previously covered in Ashley Fulton, Alison Coates, Marie Williams, Peter Howe, and Alison Hill. "Persistent citation of the only published randomised controlled trial of omega-3 supplementation in chronic obstructive pulmonary disease six years after its retraction." Publications 3, no. 1 (2015): 17-26. NOTES: This is Unicode data. Some publication titles & quotes are in non-Latin characters and they may contain commas, quotation marks, etc.
FILES/FILE FORMATS
Same data in two formats:
2006-2019-new-citation-contexts-to-Matsuyama.csv - Unicode CSV (preservation format only)
2006-2019-new-citation-contexts-to-Matsuyama.xlsx - Excel workbook (preferred format)
ROW EXPLANATIONS
70 rows of data - one citing publication per row
COLUMN HEADER EXPLANATIONS
Note - processing notes
Annotation pending - Y or blank
Year Published - publication year
ID - ID corresponding to the network analysis. See Ye, Di; Schneider, Jodi (2019): Network of First and Second-generation citations to Matsuyama 2005 from Google Scholar and Web of Science. University of Illinois at Urbana-Champaign. <a href="https://doi.org/10.13012/B2IDB-1403534_V2">https://doi.org/10.13012/B2IDB-1403534_V2</a>
Title - item title (some have non-Latin characters, commas, etc.)
Official Translated Title - item title in English, as listed in the publication
Machine Translated Title - item title in English, translated by Google Scholar
Language - publication language
Type - publication type (e.g., bachelor's thesis, blog post, book chapter, clinical guidelines, Cochrane Review, consumer-oriented evidence summary, continuing education journal article, journal article, letter to the editor, magazine article, Master's thesis, patent, Ph.D. thesis, textbook chapter, training module)
Book title for book chapters - Only for a book chapter - the book title
University for theses - for bachelor's thesis, Master's thesis, Ph.D. thesis - the associated university
Pre/Post Retraction - "Pre" for 2006-2008 (means published before the October 2008 retraction notice or in the 2 months afterwards); "Post" for 2009-2019 (considered post-retraction for our analysis)
Identifier where relevant - ISBN, Patent ID, PMID (only for items we considered hard to find/identify, e.g. those without a DOI-based URL)
URL where available - URL, ideally a DOI-based URL
Reference number/style - reference
Only in bibliography - Y or blank
Acknowledged - If annotated, Y, Not relevant as retraction not published yet, or N (blank otherwise)
Positive / "Poor Research" (Negative) - P for positive, N for negative if annotated; blank otherwise
Human translated quotations - Y or blank; blank means Google Scholar was used to translate quotations for Translated Quotation X
Specific/in passing (overall) - Specific if any of the 5 quotations are specific [aggregates Specific / In Passing (Quotation X)]
Quotation 1 - First quotation (or blank) (includes non-Latin characters in some cases)
Translated Quotation 1 - English translation of "Quotation 1" (or blank)
Specific / In Passing (Quotation 1) - Specific if "Quotation 1" refers to methods or results of the Matsuyama paper (or blank)
What is referenced from Matsuyama (Quotation 1) - Methods; Results; or Methods and Results - blank if "Quotation 1" not specific, no associated quotation, or not yet annotated
Quotation 2 - Second quotation (includes non-Latin characters in some cases)
Translated Quotation 2 - English translation of "Quotation 2"
Specific / In Passing (Quotation 2) - Specific if "Quotation 2" refers to methods or results of the Matsuyama paper (or blank)
What is referenced from Matsuyama (Quotation 2) - Methods; Results; or Methods and Results - blank if "Quotation 2" not specific, no associated quotation, or not yet annotated
Quotation 3 - Third quotation (includes non-Latin characters in some cases)
Translated Quotation 3 - English translation of "Quotation 3"
Specific / In Passing (Quotation 3) - Specific if "Quotation 3" refers to methods or results of the Matsuyama paper (or blank)
What is referenced from Matsuyama (Quotation 3) - Methods; Results; or Methods and Results - blank if "Quotation 3" not specific, no associated quotation, or not yet annotated
Quotation 4 - Fourth quotation (includes non-Latin characters in some cases)
Translated Quotation 4 - English translation of "Quotation 4"
Specific / In Passing (Quotation 4) - Specific if "Quotation 4" refers to methods or results of the Matsuyama paper (or blank)
What is referenced from Matsuyama (Quotation 4) - Methods; Results; or Methods and Results - blank if "Quotation 4" not specific, no associated quotation, or not yet annotated
Quotation 5 - Fifth quotation (includes non-Latin characters in some cases)
Translated Quotation 5 - English translation of "Quotation 5"
Specific / In Passing (Quotation 5) - Specific if "Quotation 5" refers to methods or results of the Matsuyama paper (or blank)
What is referenced from Matsuyama (Quotation 5) - Methods; Results; or Methods and Results - blank if "Quotation 5" not specific, no associated quotation, or not yet annotated
Further Notes - additional notes
keywords: citation context annotation, retraction, diffusion of retraction
published: 2021-07-22
 
This dataset includes five files. Descriptions of the files are given as follows:
<b>FILENAME: PubMed_retracted_publication_full_v3.tsv</b>
- Bibliographic data of retracted papers indexed in PubMed (retrieved on August 20, 2020, searched with the query "retracted publication" [PT] ).
- Except for the information in the "cited_by" column, all the data is from PubMed.
- PMIDs in the "cited_by" column that meet either of the two conditions below have been excluded from analyses:
[1] PMIDs of the citing papers are from retraction notices (i.e., those in the “retraction_notice_PMID.csv” file).
[2] Citing paper and the cited retracted paper have the same PMID.
ROW EXPLANATIONS
- Each row is a retracted paper. There are 7,813 retracted papers.
COLUMN HEADER EXPLANATIONS
1) PMID - PubMed ID
2) Title - Paper title
3) Authors - Author names
4) Citation - Bibliographic information of the paper
5) First Author - First author's name
6) Journal/Book - Publication name
7) Publication Year
8) Create Date - The date the record was added to the PubMed database
9) PMCID - PubMed Central ID (if applicable, otherwise blank)
10) NIHMS ID - NIH Manuscript Submission ID (if applicable, otherwise blank)
11) DOI - Digital object identifier (if applicable, otherwise blank)
12) retracted_in - Information of retraction notice (given by PubMed)
13) retracted_yr - Retraction year identified from "retracted_in" (if applicable, otherwise blank)
14) cited_by - PMIDs of the citing papers. (if applicable, otherwise blank) Data collected from iCite.
15) retraction_notice_pmid - PMID of the retraction notice (if applicable, otherwise blank)
<b>FILENAME: PubMed_retracted_publication_CitCntxt_withYR_v3.tsv</b>
- This file contains citation contexts (i.e., citing sentences) where the retracted papers were cited. The citation contexts were identified from the XML version of PubMed Central open access (PMCOA) articles.
- This is part of the data from: Hsiao, T.-K., & Torvik, V. I. (manuscript in preparation). Citation contexts identified from PubMed Central open access articles: A resource for text mining and citation analysis.
- Citation contexts that meet either of the two conditions below have been excluded from analyses:
[1] PMIDs of the citing papers are from retraction notices (i.e., those in the “retraction_notice_PMID.csv” file).
[2] Citing paper and the cited retracted paper have the same PMID.
ROW EXPLANATIONS
- Each row is a citation context associated with one retracted paper that's cited.
- In the manuscript, we count each citation context once, even if it cites multiple retracted papers.
COLUMN HEADER EXPLANATIONS
1) pmcid - PubMed Central ID of the citing paper
2) pmid - PubMed ID of the citing paper
3) year - Publication year of the citing paper
4) location - Location of the citation context (abstract = abstract, body = main text, back = supporting material, tbl_fig_caption = tables and table/figure captions)
5) IMRaD - IMRaD section of the citation context (I = Introduction, M = Methods, R = Results, D = Discussions/Conclusion, NoIMRaD = not identified)
6) sentence_id - The ID of the citation context in a given location. For location information, please see column 4. The first sentence in the location gets the ID 1, and subsequent sentences are numbered consecutively.
7) total_sentences - Total number of sentences in a given location
8) intxt_id - Identifier of a cited paper. Here, a cited paper is the retracted paper.
9) intxt_pmid - PubMed ID of a cited paper. Here, a cited paper is the retracted paper.
10) citation - The citation context
11) progression - Position of a citation context by centile within the citing paper.
12) retracted_yr - Retraction year of the retracted paper
13) post_retraction - 0 = not post-retraction citation; 1 = post-retraction citation. A post-retraction citation is a citation made after the calendar year of retraction.
<b>FILENAME: 724_knowingly_post_retraction_cit.csv</b> (updated)
- The 724 post-retraction citation contexts that we determined knowingly cited the 7,813 retracted papers in "PubMed_retracted_publication_full_v3.tsv".
- Two citation contexts from retraction notices have been excluded from analyses.
ROW EXPLANATIONS
- Each row is a citation context.
COLUMN HEADER EXPLANATIONS
1) pmcid - PubMed Central ID of the citing paper
2) pmid - PubMed ID of the citing paper
3) pub_type - Publication type collected from the metadata in the PMCOA XML files.
4) pub_type2 - Specific article types. Please see the manuscript for explanations.
5) year - Publication year of the citing paper
6) location - Location of the citation context (abstract = abstract, body = main text, back = supporting material, table_or_figure_caption = tables and table/figure captions)
7) intxt_id - Identifier of a cited paper. Here, a cited paper is the retracted paper.
8) intxt_pmid - PubMed ID of a cited paper. Here, a cited paper is the retracted paper.
9) citation - The citation context
10) retracted_yr - Retraction year of the retracted paper
11) cit_purpose - Purpose of citing the retracted paper. This is from human annotations. Please see the manuscript for further information about annotation.
12) longer_context - An extended version of the citation context. (if applicable, otherwise blank) Manually pulled from the full-texts in the process of annotation.
<b>FILENAME: Annotation manual.pdf</b>
- The manual for annotating the citation purposes in column 11) of the 724_knowingly_post_retraction_cit.tsv.
<b>FILENAME: retraction_notice_PMID.csv</b> (new file added for this version)
- A list of 8,346 PMIDs of retraction notices indexed in PubMed (retrieved on August 20, 2020, searched with the query "retraction of publication" [PT] ).
keywords: citation context; in-text citation; citation to retracted papers; retraction
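The post_retraction rule and the two exclusion conditions stated in the description above can be sketched in a few lines. This is only an illustration under stated assumptions, not the dataset authors' code: the function names and the in-memory set of retraction-notice PMIDs are hypothetical, and only the rules themselves come from the description.

```python
def is_post_retraction(citing_year: int, retracted_year: int) -> int:
    """Return 1 if the citation was made after the calendar year of retraction, else 0."""
    return 1 if citing_year > retracted_year else 0


def keep_citation(citing_pmid: str, cited_pmid: str, notice_pmids: set) -> bool:
    """Apply the two exclusion conditions from the description.

    notice_pmids is a hypothetical set holding the PMIDs of retraction notices.
    """
    if citing_pmid in notice_pmids:  # [1] the citing paper is a retraction notice
        return False
    if citing_pmid == cited_pmid:    # [2] citing and cited paper share a PMID
        return False
    return True


# Made-up PMIDs, used only to exercise the two functions.
notices = {"900"}
print(is_post_retraction(2009, 2008))       # citation in the year after retraction
print(keep_citation("900", "100", notices))  # citing paper is a retraction notice
```

Under this rule, a citation made in the same calendar year as the retraction is not counted as post-retraction.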
published: 2024-06-04
 
This dataset contains files and relevant metadata for real-world and synthetic LFR networks used in the manuscript "Well-Connectedness and Community Detection" (2024) by Park et al., presently under review at PLOS Complex Systems. The manuscript is an extended version of Park, M. et al. (2024). Identifying Well-Connected Communities in Real-World and Synthetic Networks. In Complex Networks & Their Applications XII. COMPLEX NETWORKS 2023. Studies in Computational Intelligence, vol 1142. Springer, Cham. https://doi.org/10.1007/978-3-031-53499-7_1. The Overview of Real-World Networks image provides high-level information about the seven real-world networks. TSVs of the seven real-world networks are provided as [network-name]_cleaned to indicate that duplicated edges and self-loops were removed, where column 1 is source and column 2 is target. LFR datasets are contained within the zipped file.
# LFR datasets for the Connectivity Modifier (CM) paper
### File organization
Each directory `[network-name]_[resolution-value]_lfr` includes the following files:
* `network.dat`: LFR network edge-list
* `community.dat`: LFR ground-truth communities
* `time_seed.dat`: time seed used in the LFR software
* `statistics.dat`: statistics generated by the LFR software
* `cmd.stat`: command used to run the LFR software as well as time and memory usage information
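The cleaning step mentioned above (removing duplicated edges and self-loops from a two-column TSV of source and target) might look like the following minimal sketch. It is not the authors' pipeline: the function name is hypothetical, and treating an edge and its reverse as duplicates (`directed=False`) is an assumption that the description does not confirm.

```python
def clean_edge_list(lines, directed=False):
    """Drop self-loops and duplicate edges from tab-separated 'source<TAB>target' lines."""
    seen = set()
    cleaned = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        source, target = line.split("\t")[:2]
        if source == target:  # self-loop
            continue
        # Assumption: in an undirected network, (a, b) and (b, a) are the same edge.
        key = (source, target) if directed else tuple(sorted((source, target)))
        if key in seen:       # duplicate edge
            continue
        seen.add(key)
        cleaned.append((source, target))
    return cleaned


raw = ["1\t2", "2\t1", "3\t3", "1\t2", "2\t3"]
print(clean_edge_list(raw))
```

Passing `directed=True` keeps reciprocal edges and removes only exact repeats.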
planned publication date: 2025-06-06
 
The materials used to provide Continuing Medical Education on ticks and tick-borne diseases in Illinois on February 1, 2023 at Carle Hospital, along with the pre- and post-quiz and deidentified data of the quiz takers. Files:
"Ticks and Tick-borne Diseases of Illinois_Final_w_speaker_notes.pptx": Presentation slides used for CME course, with notes to indicate verbal commentary
"CME assessment_final.docx": Pre- and post-CME quiz questions and answers, annotated to indicate correct answers and reasoning for incorrect answers
"CME_prequiz_data_for_sharing.csv": De-identified data from pre-CME quiz
"CME_postquiz_data_for_sharing.csv": De-identified data from post-CME quiz, including demographics
"DataCleaning_forSharing.R": R file used to clean the raw data and calculate the scores
"ReadMe.txt":
keywords: tick-borne disease; CME
planned publication date: 2024-07-31
 
This dataset contains all data and supplementary materials from "Improving precision and accuracy of genetic mapping with genotyping-by-sequencing data in outcrossing species". An Excel file with a list of all QTLs and linkage group lengths (in cM) obtained with two different SNP-calling methods (Tassel-Uneak and Tassel-GBS), two genetic map-construction methods (linkage-only and reference order-corrected), and four depth filters (12x, 20x, 30x and 40x) for genetic mapping of 18 biomass yield traits in a biparental Miscanthus sinensis population using RAD-Seq SNPs is provided as "Supplementary file 1". A Perl script with the code for filtering VCF and HapMap-formatted data files is provided as "Supplementary file 2". Phenotype data used for QTL mapping is provided as "Supplementary file 3". A Perl script with the code for the simulation study is provided as "Supplementary file 4".
keywords: HapMapParser; GenotypingSimulator
published: 2024-05-30
 
This repository contains the data and code to recreate the simulations in "High Costs of GHG Abatement with Electrifying the Light-Duty Vehicle Fleet with Heterogeneous Preferences of Vehicle Consumers." The model can be run by calling the bash file in the SLURM environment with parameters set for different scenarios. BEPAM-E model details: (1) The "Main.gms" file, in GAMS format, contains the initial-stage settings, with inputs, and the main optimization model. (2) The "output.gms" file, in GAMS format, prepares the output file from the BEPAM model. (3) The remaining files are intermediate input files that the model uses to generate its input and output files. (4) Four bash scripts, one for each of the 4 scenarios, call the GAMS model on the HPC; each includes both the HPC environment and the scenario settings.
keywords: BEPAM; Greenhouse Gases; Light-Duty Vehicles; Economics
published: 2024-06-11
 
This dataset contains weather data taken at the University of Illinois Urbana-Champaign Energy Farm using automatic sensors and averaged every 15 minutes. Measurements include average air temperature, average relative humidity, average wind speed, maximum wind speed, average wind direction, average photosynthetically active radiation, total precipitation, and average air pressure.
keywords: air temperature; relative humidity; wind speed; wind direction; photosynthetically active radiation; precipitation; air pressure
published: 2024-05-24
 
This dataset consists of the 286 publications retrieved from Web of Science and Scopus on July 6, 2023 as citations for (Willoughby et al., 2014): Willoughby, Patrick H., Jansma, Matthew J., & Hoye, Thomas R. (2014). A guide to small-molecule structure assignment through computation of (¹H and ¹³C) NMR chemical shifts. Nature Protocols, 9(3), Article 3. https://doi.org/10.1038/nprot.2014.042 We added the DOIs of the citing publications into a Zotero collection, which we exported into a .csv file and an .rtf file. Willoughby2014_286citing_publications.csv is a Zotero data export of the citing publications. Willoughby2014_286citing_publications.rtf is a bibliography of the citing publications, using a variation of American Psychological Association style (7th edition) with full names instead of initials. We developed an automation system to analyze unreliability propagation through the publications citing an unreliable publication: Willoughby et al., 2014 (one of the Python scripts that supported the protocol presented in this publication has a code glitch). We call a publication "unreliable by propagation" when its main findings have become unreliable by citing an unreliable source. The system triaged the citing publications that are in English (284) according to whether they are at risk because of citing Willoughby et al., 2014. We excluded 2 publications that are not in English; their DOIs are 10.13220/j.cnki.jipr.2015.06.004 and 10.19540/j.cnki.cjcmm.20200604.201. We compared the accuracy of the system's triage with a separate manual analysis that the chemistry expert (YF) conducted on the 284 citing publications. 284_merged_decision_and_annotation.csv (new in this V2) shows the system triage results and the results of the manual analysis by a chemistry domain expert (YF) on the 284 citing publications.
keywords: scientific publications; arguments; citation contexts; defeasible reasoning; Zotero; Web of Science; Scopus; unreliable cited sources; automation systems; knowledge maintenance
published: 2023-08-04
 
Data are provided that are relevant to the rare plant Phlox pilosa ssp. sangamonensis, or Sangamon phlox, and other members of the genus that occur in its native range. Sangamon phlox is a state-endangered subspecies that is only known to occur in two Illinois counties. Data provided come from all known Sangamon phlox populations, which we estimate as 10 separate populations. Data include genetic data from DNA microsatellite loci (allele sizes and basic summaries), flowering population size estimates, rates of fruit set, and rates of seed set. Additionally, genetic data (from microsatellites) are provided for Phlox divaricata ssp. laphamii (three populations), Phlox pilosa ssp. pilosa (two populations), and Phlox pilosa ssp. fulgida (two populations).
keywords: Phlox; conservation genetics; microsatellites; endemism; rare plants
published: 2024-05-30
 
This dataset contains all the datasets used in the study conducted for the research publication titled "Mapping dynamic human sentiments of heat exposure with location-based social media data". This paper develops a cyberGIS framework to analyze and visualize human sentiments of heat exposure dynamically based on near real-time location-based social media (LBSM) data. Large-volume, low-cost LBSM data, together with a content analysis algorithm based on natural language processing, are used to generate heat exposure maps from human sentiments on social media.
## What’s inside - A quick explanation of the components of the zip file
* US folder includes the shapefile corresponding to the United States with county as the spatial unit
* Census_tract folder includes the shapefile corresponding to Cook County with census tract as the spatial unit
* data/data.txt includes instructions to retrieve the sample data either from Keeling or figshare
* geo/data20000.txt is the heat dictionary created in this paper; please refer to the corresponding publication to see the data creation process
Jupyter notebook and code attached to this publication can be found at: https://github.com/cybergis/real_time_heat_exposure_with_LBSMD
keywords: CyberGIS; Heat Exposure; Location-based Social Media Data; Urban Heat
published: 2024-05-29
 
Data from the manuscript "Atomic-Scale Visualization of a Cascade of Magnetic Orders in the Layered Antiferromagnet GdTe3", to be published in npj Quantum Materials. The PowerPoint file has details on how the data can be opened and how the data are labeled.
keywords: Scanning Tunneling Microscopy; Physics; GdTe3; Rare-Earth Tritellurides
published: 2024-05-07
 
Optical, AFM, and PFM image of α-In2Se3; Short-circuit current and open circuit voltage maps, I-V curve for different intensities; Dependence of the short-circuit current density, open-circuit voltage, depolarization field, and efficiency on intensity and thickness; Benchmarking the performance.
published: 2024-02-16
 
This dataset contains five files. (i) open_citations_jan2024_pub_ids.csv.gz, open_citations_jan2024_iid_el.csv.gz, open_citations_jan2024_el.csv.gz, and open_citation_jan2024_pubs.csv.gz represent a conversion of Open Citations to an edge list using integer ids assigned by us. The integer ids can be mapped to omids, pmids, and dois using the open_citation_jan2024_pubs.csv and open_citations_jan2024_pub_ids.csv files. The network consists of 121,052,490 nodes and 1,962,840,983 edges. Code for generating these data can be found at https://github.com/chackoge/ERNIE_Plus/tree/master/OpenCitations. (ii) The fifth file, baseline2024.csv.gz, provides information about the metadata of PubMed papers. A 2024 version of PubMed was downloaded using Entrez and parsed into a table restricted to records that contain a pmid, a doi, a title, and an abstract. A value of 1 in a column indicates that the information exists in the metadata and a zero indicates otherwise. Code for generating this data: https://github.com/illinois-or-research-analytics/pubmed_etl. If you use these data or code in your work, please cite https://doi.org/10.13012/B2IDB-5216575_V1.
keywords: PubMed
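A gzipped CSV edge list of integer ids, as described above, can be streamed without decompressing it to disk. The sketch below counts distinct nodes and edges. The column layout (two integer columns, citing id then cited id, no header row) is an assumption and should be checked against the actual files. For a network with about 121 million nodes, an in-memory set may need several gigabytes of RAM.

```python
import csv
import gzip
import io


def count_nodes_and_edges(gz_source, has_header=False):
    """Stream a gzipped two-column CSV edge list; return (node count, edge count).

    gz_source may be a path or a binary file object. The column order
    (citing id, cited id) is assumed, not taken from the dataset documentation.
    """
    nodes = set()
    n_edges = 0
    with gzip.open(gz_source, mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header row if one exists
        for row in reader:
            if len(row) < 2:
                continue        # ignore blank or malformed rows
            citing, cited = int(row[0]), int(row[1])
            nodes.update((citing, cited))
            n_edges += 1
    return len(nodes), n_edges


# Tiny in-memory stand-in for a real .csv.gz file.
sample = io.BytesIO(gzip.compress(b"1,2\n1,3\n2,3\n"))
print(count_nodes_and_edges(sample))  # (3, 3)
```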
published: 2024-05-23
 
This dataset contains the training results (model parameters, outputs), datasets for generalization testing, and 2-D implementation used in the article "Learned 1-D passive scalar advection to accelerate chemical transport modeling: a case study with GEOS-FP horizontal wind fields." The article will be submitted to Artificial Intelligence for Earth Systems. The datasets are saved as CSV for 1-D time-series data and as netCDF for 2-D time-series data. The model parameters are saved for every training epoch tested in the study.
keywords: Air quality modeling; Coarse-graining; GEOS-Chem; Numerical advection; Physics-informed machine learning; Transport operator
published: 2024-05-23
 
This dataset consists of all the figure files that are part of the main text and supplementary of the manuscript titled "Optical manipulation of the charge density wave state in RbV3Sb5". For detailed information on the individual files refer to the readme file.
keywords: kagome superconductor; optics; charge density wave
published: 2024-03-21
 
Impact assessment is an evolving area of research that aims at measuring and predicting the potential effects of projects or programs. Measuring the impact of scientific research is a vibrant subdomain, closely intertwined with impact assessment. A recurring obstacle is the absence of an efficient framework that can facilitate the analysis of lengthy reports and text labeling. To address this issue, we propose a framework for automatically assessing the impact of scientific research projects by identifying pertinent sections in project reports that indicate the potential impacts. We leverage a mixed-method approach, combining manual annotations with supervised machine learning, to extract these passages from project reports. This is a repository to save datasets and code related to this project. Please read and cite the following paper if you would like to use the data: Becker M., Han K., Werthmann A., Rezapour R., Lee H., Diesner J., and Witt A. (2024). Detecting Impact Relevant Sections in Scientific Research. The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING). This folder contains the following files:
evaluation_20220927.ods: Annotated German passages (Artificial Intelligence, Linguistics, and Music) - training data
annotated_data.big_set.corrected.txt: Annotated German passages (Mobility) - training data
incl_translation_all.csv: Annotated English passages (Artificial Intelligence, Linguistics, and Music) - training data
incl_translation_mobility.csv: Annotated German passages (Mobility) - training data
ttparagraph_addmob.txt: German corpus (unannotated passages)
model_result_extraction.csv: Extracted impact-relevant passages from the German corpus based on the model we trained
rf_model.joblib: The random forest model we trained to extract impact-relevant passages
Data processing code can be found at: https://github.com/khan1792/texttransfer
keywords: impact detection; project reports; annotation; mixed-methods; machine learning
published: 2024-04-18
 
Data: Variation in pesticide toxicity in the western honey bee (Apis mellifera) associated with consuming phytochemically different monofloral honeys
Includes:
Identification and quantification of phenolic components of honeys: Raw_data_JOCE.xlsx – sheet: “HoneyPhytochemicals”
Effects of honey phytochemicals on acute pesticide toxicity: Raw_data_JOCE.xlsx – sheet: “raw_LD50”; Raw_data_JOCE.xlsx – sheet: “raw_LD50_hive_based”
keywords: Honey; honey bee; phenolic acid; flavonoids; bifenthrin; LD50
published: 2023-07-14
 
This dataset includes a total of 300 images of 45 extant species of Podocarpus (Podocarpaceae) and nine images of fossil specimens of the morphogenus Podocarpidites. The goal of this dataset is to capture the diversity of morphology within the genus and create an image database for training machine learning models. The images were taken using Airyscan confocal superresolution microscopy at 630x magnification (63x/NA 1.4 oil DIC). The images are in the CZI file format. They can be opened using Zeiss proprietary software (Zen, Zen lite) or open microscopy software, such as ImageJ. More information on how to open CZI files can be found here: [https://www.zeiss.com/microscopy/us/products/microscope-software/zen/czi.html#microscope---image-data]
keywords: superresolution microscopy; Zeiss Airyscan; CZI images; conifer; saccate pollen
published: 2019-11-11
 
This repository includes scripts and datasets for the paper, "FastMulRFS: Fast and accurate species tree estimation under generic gene duplication and loss models." Note: The results from estimating species trees with ASTRID-multi (included in this repository) are *not* included in the FastMulRFS paper. We estimated species trees with ASTRID-multi in the fall of 2019, but ASTRID-multi had an important bug fix in January 2020. Therefore, the ASTRID-multi species trees in this repository should be ignored.
keywords: Species tree estimation; gene duplication and loss; statistical consistency; MulRF, FastRFS
published: 2020-09-07
 
This dataset contains BEPAM model code and input data to replicate the results for "Assessing the Returns to Land and Greenhouse Gas Savings from Producing Energy Crops on Conservation Reserve Program Land." The dataset consists of: (1) The replication code and data for the BEPAM model. The code file is named output_0213-2020_Complete_daycent-agversion-[rental payment level]%_[biomass price].gms. (BEPAM-CRP model-Sep2020.zip) (2) Simulation results from the BEPAM model (BEPAM_Simulation_Results.csv) * Item (1) is in GAMS format. Item (2) is in text format.
keywords: Miscanthus; Switchgrass; soil carbon sequestration; greenhouse gas savings; rental payments; biomass price
published: 2021-03-05
 
Datasets that accompany Beilke, Blakey, and O'Keefe 2021 publication (Title: Bats partition activity in space and time in a large, heterogeneous landscape; Journal: Ecology and Evolution).
keywords: spatiotemporal; chiroptera
published: 2021-04-18
 
This dataset contains all the code, notebooks, and datasets used in the study conducted for the research publication titled "Multi-scale CyberGIS Analytics for Detecting Spatiotemporal Patterns of COVID-19 Data". Specifically, this package includes the artifacts used to conduct spatial-temporal analysis with space-time kernel density estimation (STKDE) using COVID-19 data, which should help readers reproduce some of the analysis and learn about the methods used in the associated book chapter.
## What’s inside - A quick explanation of the components of the zip file
* Multi-scale CyberGIS Analytics for Detecting Spatiotemporal Patterns of COVID-19.ipynb is a Jupyter notebook for this project. It contains code for preprocessing, space-time kernel density estimation, postprocessing, and visualization.
* data is a folder containing all data needed for the notebook
* data/county.txt: US county information and FIPS codes from the Natural Resources Conservation Service.
* data/us-counties.txt: County-level COVID-19 data collected from the New York Times COVID-19 GitHub repository on August 9th, 2020.
* data/covid_death.txt: COVID-19 death information derived from the preprocessing step, which prepares the input data for STKDE. Each record is of the following format: (fips, spatial_x, spatial_y, date, number of deaths).
* data/stkdefinal.txt: result obtained by conducting STKDE.
* wolfram_mathmatica is a folder for 3-D visualization code.
* wolfram_mathmatica/Visualization.nb: code for visualization of the STKDE result via Wolfram Mathematica.
* img is a folder for figures.
* img/above.png: result of 3-D visualization, above view.
* img/side.png: result of 3-D visualization, side view.
keywords: CyberGIS; COVID-19; Space-time kernel density estimation; Spatiotemporal patterns
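For readers unfamiliar with STKDE, a minimal sketch of the estimator is given below. It is not the notebook's implementation: the Epanechnikov kernels, the bandwidth names `hs` (spatial) and `ht` (temporal), and the unweighted treatment of events are all assumptions chosen for illustration. The density at (x, y, t) is the average over events of a spatial kernel times a temporal kernel, scaled by 1 / (hs² · ht).

```python
import math


def epanechnikov_2d(u, v):
    """Spatial Epanechnikov kernel on the unit disk (integrates to 1)."""
    r2 = u * u + v * v
    return (2.0 / math.pi) * (1.0 - r2) if r2 < 1.0 else 0.0


def epanechnikov_1d(w):
    """Temporal Epanechnikov kernel on (-1, 1) (integrates to 1)."""
    return 0.75 * (1.0 - w * w) if abs(w) < 1.0 else 0.0


def stkde(x, y, t, events, hs, ht):
    """Space-time kernel density at (x, y, t).

    events: iterable of (xi, yi, ti) tuples; hs and ht are the assumed
    spatial and temporal bandwidths in the same units as the coordinates.
    """
    events = list(events)
    n = len(events)
    if n == 0:
        return 0.0
    total = 0.0
    for xi, yi, ti in events:
        total += epanechnikov_2d((x - xi) / hs, (y - yi) / hs) * epanechnikov_1d((t - ti) / ht)
    return total / (n * hs ** 2 * ht)


# Hypothetical events: projected coordinates and a day index.
events = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (5.0, 5.0, 10.0)]
print(stkde(0.5, 0.0, 0.5, events, hs=2.0, ht=3.0))
```

To reflect a per-record count such as the number of deaths, each kernel term could be multiplied by that count and `n` replaced by the sum of counts. That weighting is likewise an assumption, not something stated in the description.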
published: 2021-05-13
 
Data files and R code to replicate the econometric analysis in the journal article: B Chen, BM Gramig and SD Yun. “Conservation Tillage Mitigates Drought Induced Soybean Yield Losses in the US Corn Belt.” Q Open. https://doi.org/10.1093/qopen/qoab007
keywords: R, Conservation Tillage, Drought, Yield, Corn, Soybeans, Resilience, Climate Change