Sequence Display enables large-scale sequence–activity datasets for rapid protein evolution

  • Arnold, F. H. Design by directed evolution. Acc. Chem. Res. 31, 125–131 (1998).

  • Packer, M. S. & Liu, D. R. Methods for the directed evolution of proteins. Nat. Rev. Genet. 16, 379–394 (2015).

  • Arnold, F. H. Directed evolution: bringing new chemistry to life. Angew. Chem. Int. Ed. Engl. 57, 4143–4148 (2018).

  • Wang, Y. et al. Directed evolution: methodologies and applications. Chem. Rev. 121, 12384–12444 (2021).

  • Yuan, T. et al. Biocatalytic synthesis of N-protected α-amino acids through 1,3-nitrogen migration by nonheme iron enzymes. J. Am. Chem. Soc. 147, 44041–44047 (2025).

  • Smith, G. P. Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science 228, 1315–1317 (1985).

  • McCafferty, J., Griffiths, A. D., Winter, G. & Chiswell, D. J. Phage antibodies: filamentous phage displaying antibody variable domains. Nature 348, 552–554 (1990).

  • Hu, J. H. et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57–63 (2018).

  • Richter, M. F. et al. Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity. Nat. Biotechnol. 38, 883–891 (2020).

  • Blum, T. R. et al. Phage-assisted evolution of botulinum neurotoxin proteases with reprogrammed specificity. Science 371, 803–810 (2021).

  • Doman, J. L. et al. Phage-assisted evolution and protein engineering yield compact, efficient prime editors. Cell 186, 3983–4002.e26 (2023).

  • Chin, J. W. et al. An expanded eukaryotic genetic code. Science 301, 964–967 (2003).

  • Chatterjee, A., Xiao, H. & Schultz, P. G. Evolution of multiple, mutually orthogonal prolyl-tRNA synthetase/tRNA pairs for unnatural amino acid mutagenesis in Escherichia coli. Proc. Natl Acad. Sci. USA 109, 14841–14846 (2012).

  • Chatterjee, A., Xiao, H., Yang, P.-Y., Soundararajan, G. & Schultz, P. G. A tryptophanyl-tRNA synthetase/tRNA pair for unnatural amino acid mutagenesis in E. coli. Angew. Chem. Int. Ed. Engl. 52, 5106–5109 (2013).

  • Xiao, H., Xuan, W., Shao, S., Liu, T. & Schultz, P. G. Genetic incorporation of ε-N-2-hydroxyisobutyryl-lysine into recombinant histones. ACS Chem. Biol. 10, 1599–1603 (2015).

  • Xiao, H. et al. Exploring the potential impact of an expanded genetic code on protein function. Proc. Natl Acad. Sci. USA 112, 6961–6966 (2015).

  • Xiao, H. & Schultz, P. G. At the interface of chemical and biological synthesis: an expanded genetic code. Cold Spring Harb. Perspect. Biol. 8, a023945 (2016).

  • Fowler, D. M. & Fields, S. Deep mutational scanning: a new style of protein science. Nat. Methods 11, 801–807 (2014).

  • Fowler, D. M. et al. High-resolution mapping of protein sequence–function relationships. Nat. Methods 7, 741–746 (2010).

  • Meier, G. et al. Deep mutational scan of a drug efflux pump reveals its structure–function landscape. Nat. Chem. Biol. 19, 440–450 (2023).

  • Raguram, A., An, M., Chen, P. Z. & Liu, D. R. Directed evolution of engineered virus-like particles with improved production and transduction efficiencies. Nat. Biotechnol. 43, 1635–1647 (2025).

  • Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

  • Madani, A. et al. Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 41, 1099–1106 (2023).

  • Ruffolo, J. A. & Madani, A. Designing proteins with language models. Nat. Biotechnol. 42, 200–202 (2024).

  • Su, J. et al. SaProt: protein language modeling with structure-aware vocabulary. Preprint at bioRxiv https://doi.org/10.1101/2023.10.01.560349 (2024).

  • Hayes, T. et al. Simulating 500 million years of evolution with a language model. Science 387, 850–858 (2025).

  • Su, J. et al. Democratizing protein language model training, sharing and collaboration. Nat. Biotechnol. https://doi.org/10.1038/s41587-025-02859-7 (2025).

  • Yang, J. et al. Improved protein structure prediction using predicted interresidue orientations. Proc. Natl Acad. Sci. USA 117, 1496–1503 (2020).

  • Zhou, X. et al. I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nat. Protoc. 17, 2326–2353 (2022).

  • Zheng, W. et al. Improving deep learning protein monomer and complex structure prediction using DeepMSA2 with huge metagenomics data. Nat. Methods 21, 279–289 (2024).

  • Rapp, J. T., Bremer, B. J. & Romero, P. A. Self-driving laboratories to autonomously navigate the protein fitness landscape. Nat. Chem. Eng. 1, 97–107 (2024).

  • Jiang, K. et al. Rapid in silico directed evolution by a protein language model with EVOLVEpro. Science 387, eadr6006 (2024).

  • Shanker, V. R., Bruun, T. U. J., Hie, B. L. & Kim, P. S. Unsupervised evolution of protein and antibody complexes with a structure-informed language model. Science 385, 46–53 (2024).

  • He, Y. et al. Protein language models-assisted optimization of a uracil-N-glycosylase variant enables programmable T-to-G and T-to-C base editing. Mol. Cell 84, 1257–1270 (2024).

  • Hollmann, N. et al. Accurate predictions on small data with a tabular foundation model. Nature 637, 319–326 (2025).

  • Yeh, A. H.-W. et al. De novo design of luciferases using deep learning. Nature 614, 774–780 (2023).

  • Kortemme, T. De novo protein design—from new structures to programmable functions. Cell 187, 526–544 (2024).

  • Lu, L. et al. De novo design of drug-binding proteins with predictable binding energy and specificity. Science 384, 106–112 (2024).

  • Vázquez Torres, S. et al. De novo designed proteins neutralize lethal snake venom toxins. Nature 639, 225–231 (2025).

  • Freschlin, C. R., Fahlberg, S. A. & Romero, P. A. Machine learning to navigate fitness landscapes for protein engineering. Curr. Opin. Biotechnol. 75, 102713 (2022).

  • Ye, L. et al. Glycosylase-based base editors for efficient T-to-G and C-to-G editing in mammalian cells. Nat. Biotechnol. 42, 1538–1547 (2024).

  • Tong, H. et al. Development of deaminase-free T-to-S base editor and C-to-G base editor by engineered human uracil DNA glycosylase. Nat. Commun. 15, 4897 (2024).

  • Hie, B. L. et al. Efficient evolution of human antibodies from general protein language models. Nat. Biotechnol. 42, 275–283 (2024).

  • Tang, W. & Liu, D. R. Rewritable multi-event analog recording in bacterial and mammalian cells. Science 360, eaap8992 (2018).

  • Yu, Y. et al. Cytosine base editors with minimized unguided DNA and RNA off-target events and high on-target activity. Nat. Commun. 11, 2052 (2020).

  • Doman, J. L., Raguram, A., Newby, G. A. & Liu, D. R. Evaluation and minimization of Cas9-independent off-target DNA editing by cytosine base editors. Nat. Biotechnol. 38, 620–628 (2020).

  • Mol, C. D. et al. Crystal structure of human uracil-DNA glycosylase in complex with a protein inhibitor: protein mimicry of DNA. Cell 82, 701–708 (1995).

  • Wang, L. et al. Enhanced base editing by co-expression of free uracil DNA glycosylase inhibitor. Cell Res. 27, 1289–1292 (2017).

  • Huang, Y. et al. Genetic code expansion: recent developments and emerging applications. Chem. Rev. 125, 523–598 (2025).

  • Osgood, A. O., Huang, Z., Szalay, K. H. & Chatterjee, A. Strategies to expand the genetic code of mammalian cells. Chem. Rev. 125, 2474–2501 (2025).

  • Hu, Z. et al. Discovery and engineering of small SlugCas9 with broad targeting range and high specificity and activity. Nucleic Acids Res. 49, 4008–4019 (2021).

  • Seo, S.-Y. et al. Massively parallel evaluation and computational prediction of the activities and specificities of 17 small Cas9s. Nat. Methods 20, 999–1009 (2023).

  • Qi, T. et al. Phage-assisted evolution of compact Cas9 variants targeting a simple NNG PAM. Nat. Chem. Biol. 20, 344–352 (2024).

  • Putnam, C. D. et al. Protein mimicry of DNA from crystal structures of the uracil-DNA glycosylase inhibitor protein and its complex with Escherichia coli uracil-DNA glycosylase. J. Mol. Biol. 287, 331–346 (1999).

  • Karzai, A. W., Roche, E. D. & Sauer, R. T. The SsrA–SmpB system for protein tagging, directed degradation and ribosome rescue. Nat. Struct. Mol. Biol. 7, 449–455 (2000).

  • Klimecka, M. M. et al. A uniform benchmark for testing SsrA-derived degrons in the Escherichia coli ClpXP degradation pathway. Molecules 26, 5936 (2021).

  • Thuronyi, B. W. et al. Continuous evolution of base editors with expanded target compatibility and improved activity. Nat. Biotechnol. 37, 1070–1079 (2019).

  • Neugebauer, M. E. et al. Evolution of an adenine base editor into a small, efficient cytosine base editor with low off-target activity. Nat. Biotechnol. 41, 673–685 (2023).

  • Zhang, E., Neugebauer, M. E., Krasnow, N. A. & Liu, D. R. Phage-assisted evolution of highly active cytosine base editors with enhanced selectivity and minimal sequence context preference. Nat. Commun. 15, 1697 (2024).

  • Nishimasu, H. et al. Engineered CRISPR–Cas9 nuclease with expanded targeting space. Science 361, 1259–1262 (2018).

  • Horvath, P. et al. Diversity, activity, and evolution of CRISPR loci in Streptococcus thermophilus. J. Bacteriol. 190, 1401–1412 (2008).

  • Hou, Z. et al. Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proc. Natl Acad. Sci. USA 110, 15644–15649 (2013).

  • Harrington, L. B. et al. A thermostable Cas9 with increased lifetime in human plasma. Nat. Commun. 8, 1424 (2017).

  • Agudelo, D. et al. Versatile and robust genome editing with Streptococcus thermophilus CRISPR1–Cas9. Genome Res. 30, 107–117 (2020).

  • Legut, M. et al. High-throughput screens of PAM-flexible Cas9 variants for gene knockout and transcriptional modulation. Cell Rep. 30, 2859–2868 (2020).

  • Chen, L. et al. Re-engineering the adenine deaminase TadA-8e for efficient and specific CRISPR-based cytosine base editing. Nat. Biotechnol. 41, 663–672 (2023).

  • Yan, H. & Tang, W. Programmed RNA editing with an evolved bacterial adenosine deaminase. Nat. Chem. Biol. 20, 1361–1370 (2024).

  • Xiao, Y.-L., Wu, Y. & Tang, W. An adenine base editor variant expands context compatibility. Nat. Biotechnol. 42, 1442–1453 (2024).

  • Ibba, M. & Söll, D. Aminoacyl-tRNA synthesis. Annu. Rev. Biochem. 69, 617–650 (2000).

  • Liu, C. C. & Schultz, P. G. Adding new chemistries to the genetic code. Annu. Rev. Biochem. 79, 413–444 (2010).

  • Chin, J. W. Expanding and reprogramming the genetic code of cells and animals. Annu. Rev. Biochem. 83, 379–408 (2014).

  • Guo, Y. et al. Biosynthesis of halogenated tryptophans for protein engineering using genetic code expansion. ChemBioChem 25, e202400366 (2024).

  • Hu, Y. et al. Biosynthesis of unnatural cyclodipeptides through genetic code expansion and cyclodipeptide synthase evolution. J. Am. Chem. Soc. 147, 34517–34526 (2025).

  • Hu, Y. et al. Engineering unnatural cells with a 21st amino acid as a living epigenetic sensor. Nat. Commun. 16, 9388 (2025).

  • Cheng, L., Wang, Y., Guo, Y., Zhang, S. S. & Xiao, H. Advancing protein therapeutics through proximity-induced chemistry. Cell Chem. Biol. 31, 428–445 (2024).

  • Chen, Y. et al. Unleashing the potential of noncanonical amino acid biosynthesis to create cells with precision tyrosine sulfation. Nat. Commun. 13, 5434 (2022).

  • Zhang, M. et al. Harnessing nature-inspired catechol amino acid to engineer sticky proteins and bacteria. Small Methods 8, 2400230 (2024).

  • Yang, S. et al. Real-time imaging of protein microenvironment changes in cells with rotor-based fluorescent amino acids. Nat. Chem. Biol. 22, 97–108 (2026).

  • Bryson, D. I. et al. Continuous directed evolution of aminoacyl-tRNA synthetases. Nat. Chem. Biol. 13, 1253–1260 (2017).

  • Wilkins, B. J. et al. Genetically encoding lysine modifications on histone H4. ACS Chem. Biol. 10, 939–944 (2015).

  • Miao, H., Yu, C., Yao, A. & Xuan, W. Rational design of a function-based selection method for genetically encoding acylated lysine derivatives. Org. Biomol. Chem. 17, 6127–6130 (2019).

  • Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345 (2009).

  • Smola, M. J., Rice, G. M., Busan, S., Siegfried, N. A. & Weeks, K. M. Selective 2′-hydroxyl acylation analyzed by primer extension and mutational profiling (SHAPE-MaP) for direct, versatile and accurate RNA structure analysis. Nat. Protoc. 10, 1643–1669 (2015).

  • Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).

  • Chen, S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta 2, e107 (2023).

  • van Kempen, M. et al. Fast and accurate protein structure search with Foldseek. Nat. Biotechnol. 42, 243–246 (2024).

  • Cheng, L., Zheng, C. & Jiang, S. SophieSarceau/SequenceDisplay-ML: SequenceDisplay-ML. Zenodo https://doi.org/10.5281/zenodo.18850384 (2026).

  • Cheng, L., Ding, H., Jiang, S., Zheng, X. & Xiao, H. Raw Illumina sequencing data for the large-scale 5NNK SlugCas9 sequence–activity dataset. Zenodo https://doi.org/10.5281/zenodo.18839434 (2026).
