Illinois Data Bank Dataset Search Results
Results
published:
2026-05-29
Favela, Alonso; Bohn, Martin; Kent, Angela
(2026)
Nitrogenous fertilizers provide a short-lived benefit to crops in agroecosystems, but stimulate nitrification and denitrification, processes that result in nitrate pollution, N2O production, and reduced soil fertility. Recent advances in plant microbiome science suggest that genetic variation in plants can modulate the composition and activity of rhizosphere N-cycling microorganisms. Here we attempted to determine whether genetic variation exists in Zea mays for the ability to influence the rhizosphere nitrifier and denitrifier microbiome under “real-world” conventional agricultural conditions.
To capture an extensive amount of genetic diversity within maize, we grew and sampled the rhizosphere microbiome of a diversity panel of germplasm that included ex-PVP inbreds (Z. mays ssp. mays), ex-PVP hybrids (Z. mays ssp. mays), and teosinte (Z. mays ssp. mexicana and Z. mays ssp. parviglumis). From these samples, we characterized the microbiome and a suite of microbial genes involved in nitrification and denitrification, and carried out N-cycling potential assays.
Here we show that populations/genotypes of a single species can vary in their ecological interaction with denitrifiers and nitrifiers. Some hybrid and teosinte genotypes supported microbial communities with lower potential nitrification and potential denitrification activity in the rhizosphere, while inbred genotypes stimulated or did not inhibit these N-cycling activities. These potential differences translated to functional differences in N2O fluxes, with teosinte plots producing less GHG than maize plots.
Taken together, these results suggest that Zea genetic variation can lead to changes in N-cycling processes that result in N leaching and N2O production, and thereby are selectable targets for crop improvement. Understanding the genetic variation that shapes belowground microbiome N-cycling in our conventional agricultural systems could be useful for sustainability.
keywords:
Nitrogen; Plant-Soil Microbiome; Soil
published:
2023-08-03
Dalling, James William
(2023)
This file contains the delta 15N values for leaf material collected from Cyathea rojasiana tree ferns before and after fertilization with ammonium-15N chloride solution to determine whether 15N uptake is possible from senescent leaves.
Details of the experiment are provided in the online supplement to the published paper. Briefly, in February 2022 we selected three mature C. rojasiana individuals 1-1.5 m in height that had leaves rooted in the soil and one new developing (but unexpanded) leaf. For each fern, two plastic pots (10 x 10 x 12 cm) were filled with a 50:50 mixture of washed river sand and soil from the Chorro watershed. For each pot, one senescent leaf that was rooted in the soil was carefully excavated and its roots transplanted into the pot. Pots were then fertilized by adding 30 ml of a 0.02 M 15N solution of ammonium-15N chloride (98% 15N; Sigma-Aldrich 299251; St Louis, MO) to yield a target concentration of 2 µg15N cm-3 of soil. After fertilization, pots were carefully enclosed within thick plastic bags and sealed around the senescent leaf rachis to prevent leaching of any 15N from the pot to the surrounding soil.
At the time of N fertilization, pinnae of the youngest fully expanded leaf were collected from each fern. One pinna was collected from the base of the leaf and one from the distal end of the leaf. In March 2022, after 28 days, the roots were removed from the pots and two additional leaf pinnae were sampled from each fern: one from the base and one from the distal end of the youngest (now fully expanded) leaf. Leaf samples were dried for 72 hours at 60 °C and then leaf lamina tissue was finely ground with a bead beater. The delta 15N for each leaf sample was determined at the University of Illinois, Urbana-Champaign using a Thermo Delta V Advantage IRMS run in combination with a Costech 4010 Elemental Analyzer. Samples were run in continuous flow relative to laboratory standards that were calibrated with USGS 40, 41, and NBS 19 reference materials.
keywords:
15N; Cyathea rojasiana; N fertilization; montane forest
published:
2026-05-27
Zhang, Zhengyi; Li, Maolin; Harrison, Wesley; Lu, Jingxia; Zhao, Zhenxiang; Yuan, Yujie; Zhao, Huimin
(2026)
Producing enantioenriched molecules from racemic mixtures is essential for manufacturing. Traditional methods such as resolution, deracemization and enantioconvergent catalysis primarily involve separating or converting enantiomers without altering their structures, or functionalization of stereocentres at or proximal to functional groups. However, there are challenges in enantioselectively forging C–H bonds that are remote from functional groups via hydrogen atom transfer (HAT) with these methods. Here we introduce a strategy for the photoenzymatic stereoablative enantioconvergence of γ-chiral oximes using repurposed flavin-dependent ene-reductases. A photoinduced single-electron reduction of the γ-chiral oxime by an ene-reductase generates an iminyl radical, which then undergoes stereoablative 1,5-HAT at the γ-stereocentre. Subsequent chiral reconstruction through enzymatic HAT and spontaneous imine hydrolysis yields the γ-chiral ketone with high enantioselectivity. This work provides a robust method for remote stereoablative enantioconvergent HAT and broadens the synthetic utility of photobiocatalysis.
keywords:
Bioproducts; Catalysis
published:
2026-05-27
Whitten Harris, Andrya L.; Harris, Brandon S.; Spear, Michael J.; Metzke, Brian A.; Taylor, Christopher A.; Lamer, James T.
(2026)
This dataset contains Northern Sunfish (Lepomis peltastes) catch records from the Multi-Agency Monitoring dataset for the Illinois Waterway (Lockport Pool to Alton Pool), Illinois, USA, from 2019 to 2024. These data are associated with a paper accepted for publication in Northeastern Naturalist in May 2026 entitled "Distribution, abundance, and detection frequency of Lepomis peltastes Cope (Northern Sunfish) in the Illinois Waterway, Illinois USA."
These data are in CSV format. There are seven data columns: year, pool (LP: Lockport Pool, BN: Brandon Road Pool, DR: Dresden Island Pool, MA: Marseilles Pool, ST: Starved Rock Pool, PE: Peoria Pool, LG: La Grange Pool, and AL: Alton Pool), gear (D: daytime boat electrofishing, F: regular fyke net, HS: small hoop net, and M: mini fyke net), coordinate_north_south (latitude), coordinate_east_west (longitude), habitat (MCB: main-channel border, SCB: side-channel border, and BWC: fully connected backwater), and catch (number of Northern Sunfish caught at that location). These data were analyzed using R Statistical Software (version 4.4.2; R Core Team 2024). See the Readme file for a more detailed description of the dataset and its variables.
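As a minimal sketch of how the coded columns described above might be decoded, the following Python snippet sums catch by pool. It assumes the CSV header uses exactly the seven column names listed in the description; the sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Code tables transcribed from the dataset description.
POOLS = {
    "LP": "Lockport Pool", "BN": "Brandon Road Pool",
    "DR": "Dresden Island Pool", "MA": "Marseilles Pool",
    "ST": "Starved Rock Pool", "PE": "Peoria Pool",
    "LG": "La Grange Pool", "AL": "Alton Pool",
}
GEARS = {
    "D": "daytime boat electrofishing", "F": "regular fyke net",
    "HS": "small hoop net", "M": "mini fyke net",
}
HABITATS = {
    "MCB": "main-channel border", "SCB": "side-channel border",
    "BWC": "fully connected backwater",
}

# Hypothetical rows for illustration only; NOT real records.
SAMPLE_CSV = """year,pool,gear,coordinate_north_south,coordinate_east_west,habitat,catch
2019,LP,D,41.57,-88.08,MCB,0
2020,DR,M,41.40,-88.28,BWC,2
2020,DR,D,41.41,-88.27,SCB,1
"""


def total_catch_by_pool(handle):
    """Sum the catch column per pool, using full pool names as keys."""
    totals = defaultdict(int)
    for row in csv.DictReader(handle):
        totals[POOLS[row["pool"]]] += int(row["catch"])
    return dict(totals)


totals = total_catch_by_pool(io.StringIO(SAMPLE_CSV))
print(totals)
```

To run this against the real file, the `io.StringIO(SAMPLE_CSV)` argument would be replaced by an opened file handle; the actual file name is given in the dataset's Readme and is not assumed here.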
keywords:
Northern Sunfish; Lepomis peltastes; Illinois Waterway
published:
2026-05-27
London, Evan; Mateus-Pinilla, Nohra
(2026)
Sequences from the PRNP coding region of wild white-tailed deer along with chronic wasting disease status. Animals were harvested from 22 Northern Illinois counties between 2002 and 2022.
keywords:
Cervid; transmissible spongiform encephalopathy; wildlife epidemiology; deer; CWD management; CWD surveillance; Odocoileus virginianus
published:
2025-09-11
Ng, Yee Man Margaret; Goncalves, Alexandre
(2025)
We present a three-year archival, longitudinal dataset of YouTube Trending videos, collected from July 1, 2022, to June 30, 2025, with four retrievals per day. This collection, a unique historical record of digital culture in transition, includes 446,971 snapshots from 104 countries, encompassing 726,627 unique videos and their associated metadata. Each record includes collection timestamp, geographic region, video ranking, core identifiers (video ID, channel ID, category), content metadata (title, description, tags, localization), language information, live status, and view and comment counts. Full documentation: https://arxiv.org/abs/2510.23645
Unlike previous datasets with limited geographic scope or short timeframes, our data offers exceptional coverage for cross-national and longitudinal analyses of digital culture. This non-personalized data corpus provides an irreplaceable baseline for understanding crisis communication, platform governance or temporal shifts in content popularity.
Publication: Goncalves, A., & Ng, Y. M. M. (2026). Global YouTube Trending Dataset (2022–2025): Three Years of Platform-Curated, Cross-National Trends in Digital Culture. Proceedings of the International AAAI Conference on Web and Social Media, 20(1), 2817–2827. https://doi.org/10.1609/icwsm.v20i1.42784
keywords:
YouTube; Trending Videos; Digital Culture; Global Trend
published:
2026-05-22
Huang, Hsuan-Kai; Kuo, Joseph; Zhou, Bozhen; Park, Seonyeong; Villa, Umberto; Anastasio, Mark
(2026)
This dataset is a companion dataset to the manuscript:
Hsuan-Kai Huang, Joseph Kuo, Seonyeong Park, Umberto Villa, Lihong V. Wang, Mark A. Anastasio, "Stochastic numerical head phantoms to enable virtual imaging studies of transcranial photoacoustic computed tomography," arXiv preprint arXiv:2510.09758 (2025) <a href="https://doi.org/10.48550/arXiv.2510.09758">https://doi.org/10.48550/arXiv.2510.09758</a>
The dataset contains 50 sets of three-dimensional (3D) numerical head phantoms (NHPs) for use in virtual imaging studies of transcranial photoacoustic computed tomography (t-PACT), along with the corresponding simulated optical fluence distributions, induced initial pressure distributions, and t-PACT measurement data.
More detailed information is provided in the accompanying README.txt file.
keywords:
Virtual imaging; In silico imaging; Numerical head phantoms; Optoacoustic tomography; Photoacoustic computed tomography; Transcranial imaging
published:
2026-05-22
Conroy, Nicholas S.; Gammie, Charles F.
(2026)
Visibility Amplitude Pattern Speeds from the v3 Sgr A* Library. Data for "Event Horizon Telescope Pattern Speeds in the Visibility Domain" (Conroy et al.). Data are provided in two file formats: a TXT table in the standard format of the Astrophysical Journal (ApJ), to which the paper has been submitted, and the original NPY format.
published:
2026-05-21
Kim, Sang Yeol; Slattery, Rebecca; Ort, Donald
(2026)
Rubisco activase (Rca) facilitates the release of sugar-phosphate inhibitors at Rubisco catalytic sites during CO2 fixation. Most plant species express two Rca isoforms, the larger Rca-α and the shorter Rca-β, either by alternative splicing from a single gene or expression from separate genes. The mechanism of Rubisco activation by Rca isoforms has been intensively studied in C3 plants. However, the functional role of Rca in C4 plants where Rubisco and Rca are located in a much higher [CO2] compartment is less clear. In this study, we selected four C4 bioenergy grasses and the model C4 grass setaria (Setaria viridis) to investigate the role of Rca in C4 photosynthesis. All five C4 grass species contained two Rca genes, one encoding Rca-α and the other Rca-β, which were positioned closely together in the genomes. A variety of abiotic stress-related motifs were identified in the Rca-α promoter of each grass, and while the Rca-β gene was constantly highly expressed at ambient temperature, Rca-α isoforms were expressed only at high temperature but never surpassed 30% of Rca-β content. The pattern of Rca-α induction on transition to high temperature and reduction on return to ambient temperature was the same in all five C4 grasses. In sorghum (Sorghum bicolor), sugarcane (Saccharum officinarum), and setaria, the induction rate of Rca-α was similar to the recovery rate of photosynthesis and Rubisco activation at high temperature. This association between Rca-α isoform expression and maintenance of Rubisco activation at high temperature suggests that Rca-α has a functional thermo-protective role in carbon fixation in C4 grasses by sustaining Rubisco activation at high temperature.
keywords:
Genomics
published:
2026-05-19
Park, Kiyoul; Quach, Truyen; Clark, Teresa; Kim, Hyojin; Zhang, Tieling; Wang, Mengyuan (Mary); Guo, Ming; Sato, Shirley; Nazarenus, Tara; Blume, Rostislav; Blume, Yaroslav; Zhang, Chi; Moose, Stephen; Swaminathan, Kankshita; Schwender, Jorg; Clemente, Thomas; Cahoon, Edgar
(2026)
Biomass crops engineered to accumulate energy-dense triacylglycerols (TAG or ‘vegetable oils’) in their vegetative tissues have emerged as potential feedstocks to meet the growing demand for renewable diesel and sustainable aviation fuel (SAF). Unlike oil palm and oilseed crops, the current commercial sources of TAG, vegetative tissues, such as leaves and stems, only transiently accumulate TAG. In this report, we used grain (Texas430 or TX430) and sugar-accumulating ‘sweet’ (Ramada) genotypes of sorghum, a high-yielding, environmentally resilient biomass crop, to accumulate TAG in leaves and stems. We initially tested several gene combinations for a ‘push-pull-protect’ strategy. The top TAG-yielding constructs contained five oil transgenes for a sorghum WRINKLED1 transcription factor (‘push’), a Cuphea viscosissima diacylglycerol acyltransferase (DGAT; ‘pull’), a modified sesame oleosin (‘protect’) and two combinations of specialized Cuphea lysophosphatidic acid acyltransferases and medium-chain acyl-acyl carrier protein thioesterases. Though intended to generate oils with medium-chain fatty acids, engineered lines accumulated oleic acid-rich oil to amounts of up to 2.5% DW in leaves and 2.0% DW in stems in the greenhouse, 36-fold and 49-fold increases relative to wild-type (WT) plants, respectively. Under field conditions, the top-performing event accumulated TAG to amounts of 5.5% DW in leaves and 3.5% DW in stems, 78-fold and 58-fold increases, respectively, relative to WT TX430. Transcriptomic and fluxomic analyses revealed potential bottlenecks for increased TAG accumulation. Overall, our studies highlight the utility of a lab-to-field pipeline coupled with systems biology studies to deliver high vegetative oil sorghum for SAF and renewable diesel production.
keywords:
Biofuels; Lipids; Sorghum; Sustainable Aviation Fuel; Vegetative Oils
published:
2026-05-19
Williams, Dajanae A.; Davis, Mark A.; Douglass, Sarah A.; Hartman, Jordan H.; Kath, Joseph A.; Palmer, Savanna; Larson, Eric R.
(2026)
Environmental DNA (eDNA) data for mudpuppy (Necturus maculosus) and salamander mussels (Simpsonaias ambigua) collected at stream sites in the Sangamon River watershed of central Illinois, United States from 2024 to 2025, used to optimize environmental DNA sampling for these species over time and space.
keywords:
Illinois; eDNA; mudpuppy; salamander mussel; Necturus maculosus; Simpsonaias ambigua; Sangamon River
published:
2026-05-19
These are species-specific scavenger data documented at puma kills in the Santa Cruz Mountains, relating to the manuscript:
Allen, M.L., A.T.L. Allan, R.M. King, B.H. Warner, J.J. Morgan, and C.C. Wilmers. 2026. Scavenger assemblage behavior at puma kills in the Santa Cruz Mountains, California. Ecology and Evolution.
keywords:
Santa Cruz Mountains; Scavenger Assemblage; Community Ecology
published:
2020-12-07
Tian, Yuan; Smith-Bolton, Rachel
(2020)
This page contains the data for the publication "Regulation of growth and cell fate during tissue regeneration by the two SWI/SNF chromatin-remodeling complexes of Drosophila," published in Genetics, 2020.
published:
2026-05-18
Thayer, Elizabeth; Brooke, Christopher
(2026)
These are images and associated statistics of A549 cells treated with pIC. The cells are stained with DAPI for nucleus detection and for IRF3.
published:
2026-05-14
95, 150, and 220 GHz light curves and thumbnails of 828 symbiotic star candidates from the New Online Database of Symbiotic Variables that are within the SPT-3G and ACT DR6 footprints.
published:
2026-05-14
Yook, Sangdo; Deewan, Anshu; Ziolkowski, Leah; Lane, Stephan; Tohidifar, Payman; Cheng, Ming-Hsun; Singh, Vijay; Stasiewicz, Matthew Jon; Rao, Christopher; Jin, Yong-Su
(2026)
Yarrowia lipolytica, an oleaginous yeast, shows promise for industrial fermentation due to its robust acetyl-CoA flux and well-developed genetic engineering tools. However, its lack of an active xylose metabolism restricts the conversion of cellulosic sugars to valuable products. To address this, metabolic engineering and adaptive laboratory evolution (ALE) were applied to the Y. lipolytica PO1f strain, resulting in an efficient xylose-assimilating strain (XEV). Whole-genome sequencing (WGS) of the XEV strain followed by reverse engineering revealed that the amplification of the heterologous oxidoreductase pathway and a mutation in the GTPase-activating protein gene (YALI0B12100g) might be the primary reasons for improved xylose assimilation in the XEV strain. When a sorghum hydrolysate was used, the XEV strain showed superior xylose consumption and lipid production compared to its parental strain (X123). This study advances our understanding of xylose metabolism in Y. lipolytica and proposes effective metabolic engineering strategies for optimizing lignocellulosic hydrolysates.
keywords:
Hydrolysate; Lipids; Metabolic Engineering
published:
2026-05-14
Bloomer, Caitlin Claire; Adams, Susan; Barnett, Zanethia; Graham, Zackary; Delekta, Emmy; Hayes, David; Loughman, Zachary; MacIntosh, Hugh; Pugh, M. Worth; Reed, Karen; Shoobs, Nathaniel; Taylor, Christopher; Larson, Eric
(2026)
This dataset compiles records of 60 of the largest documented crayfish specimens from multiple institutional collections across North America. It includes standardized measurements of body size (carapace length in millimeters), collection year, and generalized geographic locality, along with institutional identifiers and catalog numbers that enable traceability to physical specimens. By aggregating extreme size records across taxa and regions, the dataset is designed to support analyses of maximum body size limits, geographic patterns in size distribution, and historical collection trends. It may also serve as a reference for comparative morphological studies, validation of specimen records, and future investigations into ecological or physiological constraints on crayfish growth.
keywords:
Crayfish; body size; carapace length; museum collections; geographic distribution; morphometrics
published:
2026-04-29
Park, Seonyeong; Jeong, Gangwon; Villa, Umberto; Anastasio, Mark
(2026)
This dataset is a subset of a companion dataset to the manuscript:
Seonyeong Park, Gangwon Jeong, Umberto Villa, Mark A. Anastasio, "A Virtual Imaging Framework for Three-Dimensional Quantitative Optoacoustic Tomography Using Stochastic Numerical Breast Phantoms," arXiv preprint arXiv:2510.00189 (2025) <a href="https://doi.org/10.48550/arXiv.2510.00189">https://doi.org/10.48550/arXiv.2510.00189</a>
This subset was specifically used in the following publication:
Refik Mert Cam, Seonyeong Park, Umberto Villa, Mark A. Anastasio, "Application of a Virtual Imaging Framework for Investigating a Deep Learning-based Reconstruction Method for 3D Quantitative Photoacoustic Computed Tomography," Photoacoustics 100792 (2025) <a href="https://doi.org/10.1016/j.pacs.2025.100792">https://doi.org/10.1016/j.pacs.2025.100792</a>
The dataset contains 64 sets of three-dimensional (3D) numerical breast phantoms (NBPs) for use in virtual imaging studies of optoacoustic tomography (OAT), along with the corresponding simulated multi-wavelength optical fluence distributions, induced initial pressure distributions, and OAT measurement data. Each set corresponds to a distinct breast anatomy and includes four anatomy-matched variants: (i) a healthy breast with Fitzpatrick skin tone 1, and (ii-iv) lesion-inserted breasts with Fitzpatrick skin tones 1, 3, and 5.
More detailed information is provided in the accompanying README.txt file.
keywords:
Virtual imaging; In silico imaging; Numerical breast phantoms; Optoacoustic tomography; Photoacoustic computed tomography; Breast imaging
published:
2026-05-08
Stewart, Dalton; Guo, Wenjun; Li, Yalin; Fan, Xinxin; Coppess, Jonathan; Khanna, Madhu; Guest, Jeremy
(2026)
Low carbon fuel policies such as the U.S. Renewable Fuel Standard (RFS), Canada Clean Fuel Regulations (CFR), and California Low Carbon Fuel Standard (LCFS) as well as the 45Z tax credit are intended to reduce greenhouse gas (GHG) emissions from transportation. Cellulosic feedstocks, optimized biorefineries, and favorable farming locations can significantly reduce biofuel carbon intensity (CI). Despite advances in field-to-fuel GHG monitoring and flexibility in resource allocation within biorefineries (e.g., governing net electricity production), rigid CI accounting procedures in current policies may limit CI responsiveness across candidate sites and processing facilities. This work examines a hypothetical biomass-to-sustainable aviation fuel (SAF) pathway using miscanthus and alcohol-to-jet (i) to demonstrate how GHG accounting requirements drive estimates of biofuel CIs and (ii) to explore potential CI and financial implications of scenario-specific life cycle assessment (LCA). Results demonstrate that GHG accounting using the CFR/LCFS can reasonably account for distinct levels of net electricity production by a biorefinery, but only the CFR yields similar CI sensitivity to spatially explicit factors (feedstock CI, grid electricity CI) as scenario-specific LCA: most GHG accounting frameworks do not capture CI variation across candidate sites in the United States. Ultimately, this work demonstrates the importance of LCA methodological specifications in low carbon fuel policies and tax credits.
keywords:
Miscanthus; Policy; Sustainable Aviation Fuel
published:
2026-05-08
Saha, Subhrangsu; Moore, Bruce J.; Manaugh, Ben; Roesler, Jeffery; Lopez-Pamies, Oscar
(2026)
This dataset accompanies the research paper "A Guide to Fully Characterize the Fracture Properties of Cementitious Materials from Simple Experiments" by Subhrangsu Saha, Bruce J. Moore, Ben Manaugh, Jeffery R. Roesler, and Oscar Lopez-Pamies.
The dataset contains experimental data for specimens made of mortar. The files contain experimental results from the following tests:
1. Brazilian fracture test with flat platen (Brazilian.csv): Contains u (Platen displacement) vs P (Load) from 4 experiments.
2. Wedge split test (Wedge-split.csv): Contains uh (applied horizontal displacement read from front and back) vs Pv (Vertical load) from 3 experiments.
3. Four-point unnotched bending test (4-point-unnotched.csv): Contains u (applied displacement) vs S (nominal maximum stress) from 4 experiments.
4. Three-point unnotched bending test (3-point-unnotched.csv): Contains u (applied displacement) vs S (nominal maximum stress) from 4 experiments.
5. Three-point notched bending test (3-point-notched.csv): Contains u (applied displacement) vs P (applied load) from 3 experiments.
keywords:
Fracture nucleation; Strength; Phase-field regularization; Mortar; Concrete
published:
2026-05-07
Park, Minhyuk; Chacko, George
(2026)
This network is a curated version of a network created by harvesting citing and cited articles around Whitman et al. (1988) Nature, 332(6165):644–646. For further details refer to <a href="https://databank.illinois.edu/datasets/IDB-4897629">https://databank.illinois.edu/datasets/IDB-4897629</a>. Curation was performed by removing nodes (articles identified by Dimensions publication ids) whose year or DOI record was missing from the Dimensions database and retaining the largest connected component of the resulting network. Integer ids were generated by the authors to replace the Dimensions ids. Access to the raw data requires a license from Digital Science.
The original pi3k network contains 17,970,340 nodes, of which only 17,508,111 (97.42%) have both year and DOI information. In this curated version, 127,255,020 edges were reduced to 125,118,817 edges (98.32%). The edges are represented with two columns in the file, where the "source_iid" column represents the citing node and the "target_iid" column represents the cited node. Restricting the original pi3k network to only those nodes with both year and DOI information results in a graph that has 21,469 connected components, the largest of which has 17,486,619 nodes (97.31%). Thus, this network represents 97.31% of the nodes and 98.32% of the edges in the original network. The authors thank Digital Science for supporting this project through access to the Dimensions database.
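The curation described above (drop nodes lacking year or DOI, then keep the largest connected component) can be sketched in a few lines of Python. This is an illustrative sketch on a toy edge list, not the authors' pipeline: the function names and the toy data are hypothetical, connectivity is treated as undirected (weak connectivity), and nodes left without any retained edge are ignored.

```python
from collections import defaultdict, deque


def largest_connected_component(edges):
    """Return the node set of the largest weakly connected component."""
    neighbors = defaultdict(set)
    for source_iid, target_iid in edges:
        # Ignore edge direction when testing connectivity.
        neighbors[source_iid].add(target_iid)
        neighbors[target_iid].add(source_iid)

    seen = set()
    best = set()
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search collects one component.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        if len(component) > len(best):
            best = component
    return best


def curate(edges, nodes_with_metadata):
    """Keep edges whose endpoints both have metadata, then keep the LCC."""
    kept = [(s, t) for s, t in edges
            if s in nodes_with_metadata and t in nodes_with_metadata]
    lcc = largest_connected_component(kept)
    # Both endpoints of a kept edge lie in the same component,
    # so checking the source node is sufficient.
    lcc_edges = [(s, t) for s, t in kept if s in lcc]
    return lcc, lcc_edges


# Toy example: node 6 stands in for an article missing year or DOI.
toy_edges = [(1, 2), (2, 3), (4, 5), (6, 1)]
toy_nodes_with_metadata = {1, 2, 3, 4, 5}
lcc_nodes, lcc_edges = curate(toy_edges, toy_nodes_with_metadata)
print(sorted(lcc_nodes), lcc_edges)
```

At the scale reported here (over 100 million edges), a dedicated graph library or a union-find structure would likely be a better fit than this in-memory sketch.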
keywords:
pi3k citation network
published:
2026-05-06
Haas, Benjamin; Saif, Faaiza; Doran, Lynn; Burgess, Steven; Long, Stephen
(2026)
Scripts for the manuscript "A fluorescence-based transient expression assay for the analysis of upstream open reading frames in plant" by Haas et al.
Upstream open reading frames (uORFs) are regulatory elements present in the 5′ leaders of mRNA that can significantly impact downstream gene expression in eukaryotes. In crop engineering, editing of uORFs can provide an avenue to upregulate expression of native genes without the need to add persistent transgenic copies. Even with genome-wide methods to identify translated uORFs such as ribosome profiling, their functional characterization depends on validation through reporter gene assays and mutagenesis studies. Current screening methods for plants use luciferases or protoplasts to measure differential gene expression between wild-type and mutated transcript leaders, which requires tissue processing and/or substrate addition. Here, we present a time- and cost-efficient alternative to investigate transcript leaders by co-expression of two fluorescent proteins in Nicotiana benthamiana leaf tissue and test our assay on genes involved in photoprotection, editing of which could provide a pathway to increase CO2 assimilation during sun–shade transitions.
keywords:
Gene Editing; Photosynthesis; Plant Transformation; Transient Expression
published:
2026-05-06
Park, Minhyuk; Yi, Haotian; Chen, Ian; Warnow, Tandy; Chacko, George
(2026)
The dataset contains a sample of the data generated for the manuscript "Modeling citations and cartels" by Park et al. (2026), which describes the use of the SASCA-ReSA agent-based model to simulate the growth of citation networks and mimic citation cartels. The manuscript is presently under review. SASCA-ReSA is the latest stage in a series of progressively complex models of citation dynamics (Chacko et al. 2026 Applied Network Science, Park et al. 2025 Proceedings of the XIV International Conference on Complex Networks and their Applications, Park et al. 2025 MetaRoR). The model is implemented for high performance computing environments and all the results were generated on the Illinois Campus Cluster. The standard simulation reported in this manuscript results in roughly 1.2M nodes. The input to a simulation is a seed network, a configuration file, and real-world distributions for the number of references per article and the number of authors per article. The output of a simulation is a larger citation network that includes the input network. Details of the model are described in the manuscript and instructions on how to use the software are available on the SASCA-ReSA GitHub site. We have included annotated nodelists from three different simulations.
<b>a) bsl1 (bsl1.csv.tar.xz):</b> has 1,193,102 rows, output of a standard simulation.
<b>b) p5_1 (p5_1.csv.tar.xz):</b> has 1,193,102 rows, output of a standard simulation with 5 agents "planted" in year 1 of the simulation.
<b>c) ps5_1 (ps5_1.csv.tar.xz):</b> has 1,193,102 rows, output of a standard simulation with one agent planted in each of the first five years of a simulation.
<b>d) sample_config.ini:</b> contains configuration parameters for a simulation
<b>e) louvain.parquet.gz:</b> has 160,714,032 rows and two columns (node_id and cluster_id, with a header row), representing a Louvain clustering of the ABM161 network (<a href="https://doi.org/10.13012/B2IDB-9265079_V1">https://doi.org/10.13012/B2IDB-9265079_V1</a>). Generated using the louvain module through kuzu and compressed using the to_parquet method of pandas with gzip internal compression. The largest cluster (cluster id 5) has 81,675,241 nodes. This network was generated under the SASCA-ReS model.
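A statement such as "the largest cluster (cluster id 5) has 81,675,241 nodes" comes down to counting rows per cluster_id. The sketch below shows that count on a handful of made-up (node_id, cluster_id) pairs using only the Python standard library; the function name and toy rows are hypothetical. For the real file, the rows would presumably be loaded first, for example with pandas.read_parquet.

```python
from collections import Counter


def largest_cluster(rows):
    """Return (cluster_id, size) for the cluster with the most nodes.

    `rows` is an iterable of (node_id, cluster_id) pairs.
    """
    sizes = Counter(cluster_id for _node_id, cluster_id in rows)
    cluster_id, size = sizes.most_common(1)[0]
    return cluster_id, size


# Made-up rows for illustration only; NOT taken from louvain.parquet.gz.
toy_rows = [(0, 5), (1, 5), (2, 7), (3, 5), (4, 7)]
top_cluster = largest_cluster(toy_rows)
print(top_cluster)
```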
keywords:
citation dynamics; agent-based models
published:
2026-05-05
Lin, Xiaoying; Kim, Chansong; Vo, Thi; Waltmann, Tommy; Liu, Haihua; Lu, Jun; Li, Jiahui; Liu, Yu-Shen; Kannur, Suraj; Lee, Junseo; Hwang, Chu-Yun; Kalutantirige, Falon C.; Yao, Lehan; Kotov, Nicholas A.; Glotzer, Sharon; Chen, Qian
(2026)
This dataset contains the raw transmission electron microscopy (TEM) and scanning electron microscopy (SEM) images used in the main figures of the paper “The Importance of Nano-edges in Atomic Stencilling and Chiroptically Active Assembly of Patchy Gold Tetrahedra (2026).” All the images were acquired at the Materials Research Laboratory, University of Illinois at Urbana-Champaign, by the Qian Chen group.
1. We provide five subfolders, each named according to the corresponding figure numbers in the paper.
2. All files in the subfolders for Figures 1–3 and 5 are named as "Panel [letter]_*", where [letter] (e.g., a, b, c) represents the raw images used for the corresponding panels.
3. All files in the subfolder for Figure 4 correspond to panel f and show the configurations of patchy tetrahedra synthesized at varying concentrations of iodide and 2-naphthalenethiol. They are named "Experiment_[number]", where [number] represents the corresponding data points in the phase diagram.
4. In TEM images, the bright and dark regions indicate the polymer patches and nanoparticle cores, respectively.
5. In SEM images, the bright and dark regions indicate the nanoparticle cores and polymer patches, respectively.
6. Abbreviations in file names: HAADF-STEM (high-angle annular dark-field scanning transmission electron microscopy), PINEM (photon-induced near-field electron microscopy), and RCP/LCP (right-/left-handed circularly polarized).
keywords:
Patchy nanoparticle; polymer; synthesis; self-assembly; chirality
published:
2026-04-30
Mitchell, Cheyenne; Dhruva, Dhananjay; Burke, Zachary; Durden, David; Dingilian, Armine; Backlund, Mikael
(2026)
Raw and analyzed data and analysis code for "Quantum-inspired super-resolution of fluorescent point-like sources" (Nature Communications, accepted, 2026).