Manufacturing-aware generative models enable petascale synthesis of designed DNA

Russ, W. P. et al. An evolution-based model for designing chorismate mutase enzymes. Science 369, 440-445 (2020).

Shin, J.-E. et al. Protein design and variant prediction using autoregressive generative models. Nat. Commun. 12, 2403 (2021).

Madani, A. et al. Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 41, 1099-1106 (2023).

Ingraham, J. B. et al. Illuminating protein space with a programmable generative model. Nature 623, 1070-1078 (2023).

Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089-1100 (2023).

Hopf, T. A. et al. Mutation effects predicted from sequence co-variation. Nat. Biotechnol. 35, 128-135 (2017).

Weinstein, E. N., Amin, A. N., Medical, H., Frazer, J. & Marks, D. S. Non-identifiability and the blessings of misspecification in models of molecular fitness. In Proc. 36th International Conference on Neural Information Processing Systems (ed. Koyejo, S. et al.) (ACM, 2022).

Kosuri, S. & Church, G. M. Large-scale de novo DNA synthesis: technologies and applications. Nat. Methods 11, 499-507 (2014).

Weinstein, E. N. et al. Optimal design of stochastic DNA synthesis protocols based on generative sequence models. In Proc. 25th International Conference on Artificial Intelligence and Statistics (ed. Camps-Valls, G. et al.) (PMLR, 2022).

Li, J. Q. & Barron, A. R. Mixture density estimation. In Proc. 12th International Conference on Neural Information Processing Systems (ed. Kearns, M. J. et al.) (ACM, 1999).

Richardson, E. & Weiss, Y. On GANs and GMMs. In Proc. 32nd International Conference on Neural Information Processing Systems (ed. Bengio, S. et al.) (ACM, 2022).

Olsen, T. H., Boyles, F. & Deane, C. M. Observed Antibody Space: a diverse database of cleaned, annotated, and translated unpaired and paired antibody sequences. Protein Sci. 31, 141-146 (2022).

Olsen, T. H., Moal, I. H. & Deane, C. M. Addressing the antibody germline bias and its effect on language models for improved antibody design. Bioinformatics 40, btae618 (2024).

Amin, A. N., Weinstein, E. N. & Marks, D. S. A generative nonparametric Bayesian model for whole genomes. In Proc. 35th International Conference on Neural Information Processing Systems (ed. Ranzato, M. et al.) (ACM, 2021).

Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B. & Smola, A. A kernel two-sample test. J. Mach. Learn. Res. 13, 723-773 (2012).

Amin, A. N., Marks, D. S. & Weinstein, E. N. Biological sequence kernels with guaranteed flexibility. J. Mach. Learn. Res. 26, 1-63 (2025).

Shuai, R. W., Ruffolo, J. A. & Gray, J. J. IgLM: infilling language modeling for antibody sequence design. Cell Syst. 14, 979-989.e4 (2023).

Amin, A. N., Weinstein, E. N. & Marks, D. S. A kernelized Stein discrepancy for biological sequences. In Proc. 40th International Conference on Machine Learning (ed. Krause, A. et al.) (PMLR, 2023).

Lloyd, J. R. & Ghahramani, Z. Statistical model criticism using kernel two sample tests. In Proc. 29th International Conference on Neural Information Processing Systems (ed. Cortes, C. et al.) (ACM, 2015).

Wermke, M. et al. Autologous T cell therapy for PRAME⁺ advanced solid tumors in HLA-A*02⁺ patients: a phase 1 trial. Nat. Med. 31, 2365-2374 (2025).

Reynisson, B., Alvarez, B., Paul, S., Peters, B. & Nielsen, M. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res. 48, W449-W454 (2020).

Nijkamp, E., Ruffolo, J. A., Weinstein, E. N., Naik, N. & Madani, A. ProGen2: exploring the boundaries of protein language models. Cell Syst. 14, 968-978 (2023).

Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343-345 (2009).

Shumailov, I. et al. AI models collapse when trained on recursively generated data. Nature 631, 755-759 (2024).

Framework for Nucleic Acid Synthesis Screening (National Science and Technology Council, 2024); https://aspr.hhs.gov/S3/Documents/OSTP-Nucleic-Acid-Synthesis-Screening-Framework-Sep2024.pdf

Baker, D. & Church, G. Protein design meets biosecurity. Science 383, 349 (2024).

Baum, C. et al. A system capable of verifiably and privately screening global DNA synthesis. Preprint at https://arxiv.org/abs/2403.14023 (2025).

Abdali, S., Anarfi, R., Barberan, C. J., He, J. & Shayegani, E. Securing large language models: threats, vulnerabilities and responsible practices. Preprint at https://arxiv.org/abs/2403.12503 (2024).

Weinstein, E. N., Slabodkin, A., Gollub, M. G. & Wood, E. B. Accelerated learning on large-scale displays using generative library models. Preprint at https://arxiv.org/abs/2510.16612 (2025).

Weinstein, E. N. et al. Acquisition of lifting biomolecular data. Preprint at https://arxiv.org/abs/2512.15984 (2025).

Zhang, J., Kobert, K., Flouri, T. & Stamatakis, A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30, 614-620 (2014).

Daily, J. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics 17, 81 (2016).

Jaravine, V., Mösch, A., Raffegerst, S., Schendel, D. J. & Frishman, D. Expitope 2.0: a tool to assess immunotherapeutic antigens for their potential cross-reactivity against naturally expressed proteins in human tissues. BMC Cancer 17, 892 (2017).

Vita, R. et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47, D339-D343 (2019).

Huszár, F. & Duvenaud, D. Optimally-weighted herding is Bayesian quadrature. In Proc. 28th Annual Conference on Uncertainty in Artificial Intelligence (ed. de Freitas, N. & Murphy, K.) (ACM, 2012).

McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at https://arxiv.org/abs/1802.03426 (2020).

Salimans, T. et al. Improved techniques for training GANs. In Proc. 30th International Conference on Neural Information Processing Systems (ed. Lee, D. D. et al.) (ACM, 2016).

Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B. & Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Proc. 31st International Conference on Neural Information Processing Systems (ed. von Luxburg, U. et al.) (ACM, 2017).

Lefranc, M.-P. et al. IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains. Dev. Comp. Immunol. 27, 55-77 (2003).

Shen, S. et al. Probabilistic analysis of the frequencies of amino acid pairs within characterized protein sequences. Physica A 370, 651-662 (2006).

Rao, X., Fontaine Costa, A. I. C., van Baarle, D. & Kesmir, C. A comparative study of HLA binding affinity and ligand diversity: implications for generating immunodominant CD8⁺ T cell responses. J. Immunol. 182, 1526-1532 (2009).

Trolle, T. et al. The length distribution of class I-restricted T cell epitopes is determined by both peptide supply and MHC allele-specific binding preference. J. Immunol. 196, 1480-1487 (2016).
