Illinois Data Bank

Data for A Generalized Platform for Artificial Intelligence-powered Autonomous Protein Engineering

Proteins are the molecular machines of life with numerous applications in energy, health, and sustainability. However, engineering proteins with desired functions for practical applications remains slow, expensive, and specialist-dependent. Here we report a generally applicable platform for autonomous enzyme engineering that integrates machine learning and large language models with biofoundry automation to eliminate the need for human intervention, judgement, and domain expertise. Requiring only an input protein sequence and a quantifiable way to measure fitness, this automated platform can be applied to engineer a wide array of proteins. As a proof of concept, we engineer Arabidopsis thaliana halide methyltransferase (AtHMT) for a 90-fold improvement in substrate preference and 16-fold improvement in ethyltransferase activity, along with developing a Yersinia mollaretii phytase (YmPhytase) variant with 26-fold improvement in activity at neutral pH. This is accomplished in four rounds over 4 weeks, while requiring construction and characterization of fewer than 500 variants for each enzyme. This platform for autonomous experimentation paves the way for rapid advancements across diverse industries, from medicine and biotechnology to renewable energy and sustainable chemistry.

Life Sciences
AI/ML; Automation
CC BY
U.S. Department of Energy (DOE), Grant: DE-SC0018420
Huimin Zhao
Version: 1
DOI: 10.13012/B2IDB-2137211_V1
Publication Date: 2026-04-15

37 files, ranging in size from 388 Bytes to 13.1 MB.
