High-resolution metagenome assembly for modern long reads with myloasm

Quince, C., Walker, A. W., Simpson, J. T., Loman, N. J. & Segata, N. Shotgun metagenomics, from sampling to analysis. Nat. Biotechnol. 35, 833–844 (2017).
Hug, L. A. et al. A new view of the tree of life. Nat. Microbiol. 1, 1–6 (2016).
Parks, D. H. et al. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. Nat. Microbiol. 2, 1533–1542 (2017).
Spang, A. et al. Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173–179 (2015).
Zhao, S. et al. Adaptive evolution within gut microbiomes of healthy people. Cell Host Microbe 25, 656–667 (2019).
Pérez-Cobas, A. E., Gomez-Valero, L. & Buchrieser, C. Metagenomic approaches in microbial ecology: an update on whole-genome and marker gene sequencing analyses. Microb. Genom. 6, mgen000409 (2020).
Kiefl, E. et al. Structure-informed microbial population genetics elucidate selective pressures that shape protein evolution. Sci. Adv. 9, eabq4632 (2023).
Wallen, Z. D. et al. Metagenomics of Parkinson’s disease implicates the gut microbiome in multiple disease mechanisms. Nat. Commun. 13, 6958 (2022).
Tisza, M. J. & Buck, C. B. A catalog of tens of thousands of viruses from human metagenomes reveals hidden associations with chronic diseases. Proc. Natl Acad. Sci. USA 118, e2023202118 (2021).
Franzosa, E. A. et al. Gut microbiome structure and metabolic activity in inflammatory bowel disease. Nat. Microbiol. 4, 293–305 (2019).
Schmidt, T. S. B. et al. Drivers and determinants of strain dynamics following fecal microbiota transplantation. Nat. Med. 28, 1902–1912 (2022).
Bedarf, J. R. et al. Functional implications of microbial and viral gut metagenome changes in early stage L-DOPA-naïve Parkinson’s disease patients. Genome Med. 9, 39 (2017).
Woodcroft, B. J. et al. Genome-centric view of carbon processing in thawing permafrost. Nature 560, 49–54 (2018).
Ustick, L. J. et al. Metagenomic analysis reveals global-scale patterns of ocean nutrient limitation. Science 372, 287–291 (2021).
Liang, J.-L. et al. Novel phosphate-solubilizing bacteria enhance soil phosphorus cycling following ecological restoration of land degraded by mining. ISME J. 14, 1600–1613 (2020).
Cavicchioli, R. et al. Scientists’ warning to humanity: microorganisms and climate change. Nat. Rev. Microbiol. 17, 569–586 (2019).
Steen, A. D. et al. High proportions of bacteria and archaea across most biomes remain uncultured. ISME J. 13, 3126–3130 (2019).
Bertrand, D. et al. Hybrid metagenomic assembly enables high-resolution analysis of resistance determinants and mobile elements in human microbiomes. Nat. Biotechnol. 37, 937–944 (2019).
Kolmogorov, M. et al. metaFlye: scalable long-read metagenome assembly using repeat graphs. Nat. Methods 17, 1103–1110 (2020).
Benoit, G. et al. High-quality metagenome assembly from long accurate reads with metaMDBG. Nat. Biotechnol. 42, 1378–1383 (2024).
Feng, X., Cheng, H., Portik, D. & Li, H. Metagenome assembly of high-fidelity long reads with hifiasm-meta. Nat. Methods 19, 671–674 (2022).
Agustinho, D. P. et al. Unveiling microbial diversity: harnessing long-read sequencing technology. Nat. Methods 21, 954–966 (2024).
Feng, X. & Li, H. Evaluating and improving the representation of bacterial contents in long-read metagenome assemblies. Genome Biol. 25, 92 (2024).
Crits-Christoph, A., Olm, M. R., Diamond, S., Bouma-Gregson, K. & Banfield, J. F. Soil bacterial populations are shaped by recombination and gene-specific selection across a grassland meadow. ISME J. 14, 1834–1846 (2020).
Liu, Z. & Good, B. H. Dynamics of bacterial recombination in the human gut microbiome. PLOS Biol. 22, e3002472 (2024).
Chen-Liaw, A. et al. Gut microbiota strain richness is species specific and affects engraftment. Nature 637, 422–429 (2025).
Goyal, A., Bittleston, L. S., Leventhal, G. E., Lu, L. & Cordero, O. X. Interactions between strains govern the eco-evolutionary dynamics of microbial communities. eLife 11, e74987 (2022).
Brito, I. L. Examining horizontal gene transfer in microbial communities. Nat. Rev. Microbiol. 19, 442–453 (2021).
Nagarajan, N. & Pop, M. Parametric complexity of sequence assembly: theory and applications to next generation sequencing. J. Comput. Biol. 16, 897–908 (2009).
Bresler, G., Bresler, M. & Tse, D. Optimal assembly for high throughput shotgun sequencing. BMC Bioinformatics 14, S18 (2013).
Kerkvliet, J. J. et al. Metagenomic assembly is the main bottleneck in the identification of mobile genetic elements. PeerJ 12, e16695 (2024).
Nurk, S. et al. HiCanu: Accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads. Genome Res. 30, 1291–1305 (2020).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Minich, J. J. et al. Culture-independent meta-pangenomics enabled by long-read metagenomics reveals associations with pediatric undernutrition. Cell 188, 6666–6686 (2025).
Sereika, M. et al. Oxford Nanopore R10.4 long-read sequencing enables the generation of near-finished bacterial genomes from pure cultures and metagenomes without short-read or reference polishing. Nat. Methods 19, 823–826 (2022).
Cheng, H. et al. Efficient near-telomere-to-telomere assembly of nanopore simplex reads. Nature https://doi.org/10.1038/s41586-026-10105-6 (2026).
Hall, M. B. et al. Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data. eLife 13, RP98300 (2024).
Myers, E. W. The fragment assembly string graph. Bioinformatics 21 Suppl 2, ii79–85 (2005).
Compeau, P. E. C., Pevzner, P. A. & Tesler, G. Why are de Bruijn graphs useful for genome assembly? Nat. Biotechnol. 29, 987–991 (2011).
Ekim, B., Berger, B. & Chikhi, R. Minimizer-space de Bruijn graphs: whole-genome assembly of long reads in minutes on a personal computer. Cell Syst. 12, 958–968 (2021).
Benoit, G. et al. High-quality metagenome assembly from nanopore reads with nanoMDBG. Nat. Commun. https://doi.org/10.1038/s41467-026-69760-y (2026).
Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671–680 (1983).
Trigodet, F., Sachdeva, R., Banfield, J. F. & Eren, A. M. Troubleshooting common errors in assemblies of long-read metagenomes. Nat. Biotechnol. https://doi.org/10.1038/s41587-025-02971-8 (2026).
Derelle, R. et al. Seamless, rapid and accurate analyses of outbreak genomic data using split k-mer analysis. Genome Res. 34, 1661–1673 (2024).
Gardner, S. N. & Hall, B. G. When whole-genome alignments just won’t work: kSNP v2 software for alignment-free SNP discovery and phylogenetics of hundreds of microbial genomes. PLoS ONE 8, e81760 (2013).
Harris, S. R. SKA: split kmer analysis toolkit for bacterial genomic epidemiology. Preprint at bioRxiv https://doi.org/10.1101/453142 (2018).
Edgar, R. Syncmers are more sensitive than minimizers for selecting conserved k-mers in biological sequences. PeerJ 9, e10805 (2021).
Myers, G. & Miller, W. Chaining multiple-alignment fragments in sub-quadratic time. In Proc. Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (ed. Clarkson, K. L.) 38–47 (SIAM, 1995).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Li, H. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32, 2103–2110 (2016).
Bouras, G. et al. Hybracter: enabling scalable, automated, complete and accurate bacterial genome assemblies. Microb. Genom. 10, 001244 (2024).
Vaisbourd, E., Bren, A., Alon, U. & Glass, D. S. Preventing multimer formation in commonly used synthetic biology plasmids. ACS Synth. Biol. 14, 1309–1315 (2025).
Kiguchi, Y. et al. Giant extrachromosomal element ‘Inocle’ potentially expands the adaptive capacity of the human oral microbiome. Nat. Commun. 16, 7397 (2025).
Sereika, M. et al. Genome-resolved long-read sequencing expands known microbial diversity across terrestrial habitats. Nat. Microbiol. 10, 2018–2030 (2025).
Gehrig, J. L. et al. Finding the right fit: evaluation of short-read and long-read sequencing approaches to maximize the utility of clinical microbiome data. Microb. Genom. 8, 000794 (2022).
Sidhu, C. et al. Dissolved storage glycans shaped the community composition of abundant bacterioplankton clades during a North Sea spring phytoplankton bloom. Microbiome 11, 77 (2023).
Priest, T., Orellana, L. H., Huettel, B., Fuchs, B. M. & Amann, R. Microbial metagenome-assembled genomes of the Fram Strait from short and long read sequencing platforms. PeerJ 9, e11721 (2021).
Kato, S., Masuda, S., Shibata, A., Shirasu, K. & Ohkuma, M. Insights into ecological roles of uncultivated bacteria in Katase hot spring sediment from long-read metagenomics. Front. Microbiol. 13, 1045931 (2022).
Zhang, Y. et al. Improved microbial genomes and gene catalog of the chicken gut from metagenomic sequencing of high-fidelity long reads. Gigascience 11, giac116 (2022).
Chklovski, A., Parks, D. H., Woodcroft, B. J. & Tyson, G. W. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nat. Methods 20, 1203–1212 (2023).
Camargo, A. P. et al. Identification of mobile genetic elements with geNomad. Nat. Biotechnol. 42, 1303–1312 (2024).
Jain, C., Rodriguez-R, L. M., Phillippy, A. M., Konstantinidis, K. T. & Aluru, S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat. Commun. 9, 5114 (2018).
Blanco-Míguez, A. et al. Extension of the Segatella copri complex to 13 species with distinct large extrachromosomal elements and associations with host conditions. Cell Host Microbe 31, 1804–1819 (2023).
Chang, H.-W. et al. Prevotella copri and microbiota members mediate the beneficial effects of a therapeutic food for malnutrition. Nat. Microbiol. 9, 922–937 (2024).
Maguire, F. et al. Metagenome-assembled genome binning methods with short reads disproportionately fail for plasmids and genomic Islands. Microb. Genom. 6, mgen000436 (2020).
Abramova, A., Karkman, A. & Bengtsson-Palme, J. Metagenomic assemblies tend to break around antibiotic resistance genes. BMC Genomics 25, 959 (2024).
Xing, L. et al. ErmF and ereD Are Responsible for Erythromycin Resistance in Riemerella anatipestifer. PLoS ONE 10, e0131078 (2015).
Huttenhower, C. et al. Structure, function and diversity of the healthy human microbiome. Nature 486, 207–214 (2012).
He, X. et al. Cultivation of a human-associated TM7 phylotype reveals a reduced genome and epibiotic parasitic lifestyle. Proc. Natl Acad. Sci. USA 112, 244–249 (2015).
Kazantseva, E., Donmez, A., Frolova, M., Pop, M. & Kolmogorov, M. Strainy: phasing and assembly of strain haplotypes from long-read metagenome sequencing. Nat. Methods 21, 2034–2043 (2024).
Shaw, J., Gounot, J.-S., Chen, H., Nagarajan, N. & Yu, Y. W. Floria: fast and accurate strain haplotyping in metagenomes. Bioinformatics 40, i30–i38 (2024).
Jochheim, A. et al. Strain-resolved de-novo metagenomic assembly of viral genomes and microbial 16S rRNAs. Microbiome 12, 187 (2024).
Grigoriev, A. Analyzing genomes with cumulative skew diagrams. Nucleic Acids Res. 26, 2286–2290 (1998).
Schmidt, S., Toivonen, S., Medvedev, P. & Tomescu, A. I. Applying the safe-and-complete framework to practical genome assembly. Leibniz Int. Proc. Inform. 312, 8 (2024).
Dabbaghie, F., Ebler, J. & Marschall, T. BubbleGun: enumerating bubbles and superbubbles in genome graphs. Bioinformatics 38, 4217–4219 (2022).
Lancia, G., Bafna, V., Istrail, S., Lippert, R. & Schwartz, R. SNPs problems, complexity, and algorithms. In Proc. 9th Annual European Symposium on Algorithms (ed. auf der Heide, F. M.) 182–193 (Springer, 2001).
Chaung, K. et al. SPLASH: A statistical, reference-free genomic algorithm unifies biological discovery. Cell 186, 5440–5456 (2023).
Ondov, B. D. et al. Mash: Fast genome and metagenome distance estimation using MinHash. Genome Biol. 17, 132 (2016).
Liu, X. et al. Nanopore strand-specific mismatch enables de novo detection of bacterial DNA modifications. Genome Res. 34, 2025–2038 (2024).
Delahaye, C. & Nicolas, J. Sequencing DNA with nanopores: troubles and biases. PLoS ONE 16, e0257521 (2021).
Roberts, M., Hayes, W., Hunt, B. R., Mount, S. M. & Yorke, J. A. Reducing storage requirements for biological sequence comparison. Bioinformatics 20, 3363–3369 (2004).
Shaw, J. & Yu, Y. W. Theory of local k-mer selection with applications to long-read alignment. Bioinformatics 38, 4659–4669 (2022).
Belbasi, M., Blanca, A., Harris, R. S., Koslicki, D. & Medvedev, P. The minimizer Jaccard estimator is biased and inconsistent. Bioinformatics 38, i169–i176 (2022).
Frith, M. C., Shaw, J. & Spouge, J. L. How to optimally sample a sequence for rapid analysis. Bioinformatics 39, btad057 (2023).
Shaw, J. & Yu, Y. W. Proving sequence aligners can guarantee accuracy in almost O(m log n) time through an average-case analysis of the seed-chain-extend heuristic. Genome Res. 33, 1175–1187 (2023).
Chen, J.-Q. et al. Variation in the ratio of nucleotide substitution and indel rates across genomes in mammals and bacteria. Mol. Biol. Evol. 26, 1523–1531 (2009).
Spouge, J. L., Das, P., Chen, Y. & Frith, M. The statistics of parametrized syncmers in a simple mutation process without spurious matches. J. Comput. Biol. 31, 1195–1210 (2024).
Hoeffding, W. & Robbins, H. The central limit theorem for dependent random variables. Duke Math. J. 15, 773–780 (1948).
Stanojević, D., Lin, D., de Sessions, P. F. & Šikić, M. Telomere-to-telomere phased genome assembly using error-corrected Simplex nanopore reads. Preprint at bioRxiv https://doi.org/10.1101/2024.05.18.594796 (2024).
Nurk, S. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).
Tan, K.-T., Slevin, M. K., Meyerson, M. & Li, H. Identifying and correcting repeat-calling errors in nanopore sequencing of telomeres. Genome Biol. 23, 180 (2022).
Jain, C. Coverage-preserving sparsification of overlap graphs for long-read assembly. Bioinformatics 39, btad124 (2023).
Li, H. & Durbin, R. Genome assembly in the telomere-to-telomere era. Nat. Rev. Genet. 25, 658–670 (2024).
Blanca, A., Harris, R. S., Koslicki, D. & Medvedev, P. The statistics of k-mers from a sequence undergoing a simple mutation process without spurious matches. J. Comput. Biol. 29, 155–168 (2022).
Liu, D. & Steinegger, M. Block Aligner: an adaptive SIMD-accelerated aligner for sequences and position-specific scoring matrices. Bioinformatics 39, btad487 (2023).
Vaser, R., Sović, I., Nagarajan, N. & Šikić, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).
Lee, C., Grasso, C. & Sharlow, M. F. Multiple sequence alignment using partial order graphs. Bioinformatics 18, 452–464 (2002).
Shaw, J. & Yu, Y. W. Fast and robust metagenomic sequence comparison through sparse chaining with skani. Nat. Methods 20, 1661–1665 (2023).
Kruchten, N., Seier, A. & Parmer, C. An interactive, open-source, and browser-based graphing library for Python. Zenodo https://doi.org/10.5281/zenodo.14503524 (2025).
Köster, J. & Rahmann, S. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics 28, 2520–2522 (2012).
O’Leary, N. A. et al. Exploring and retrieving sequence and metadata for species across the tree of life with NCBI Datasets. Sci. Data 11, 732 (2024).
Wick, R. R. Badread: simulation of error-prone long reads. J. Open Source Softw. 4, 1316 (2019).
Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072–1075 (2013).
Mikheenko, A., Saveliev, V. & Gurevich, A. MetaQUAST: evaluation of metagenome assemblies. Bioinformatics 32, 1088–1090 (2016).
Pan, S., Zhao, X.-M. & Coelho, L. P. SemiBin2: self-supervised contrastive learning leads to better MAGs for short- and long-read sequencing. Bioinformatics 39, i21–i29 (2023).
Eren, A. M. et al. Anvi’o: an advanced analysis and visualization platform for ‘omics data. PeerJ 3, e1319 (2015).
Rahman Hera, M., Pierce-Ward, N. T. & Koslicki, D. Deriving confidence intervals for mutation rates across a wide range of evolutionary distances using FracMinHash. Genome Res. 33, 1061–1068 (2023).
Chaumeil, P.-A., Mussig, A. J., Hugenholtz, P. & Parks, D. H. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics 36, 1925–1927 (2020).
Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
Alcock, B. P. et al. CARD 2023: expanded curation, support for machine learning, and resistome prediction at the Comprehensive Antibiotic Resistance Database. Nucleic Acids Res. 51, D690–D699 (2023).
Schwengers, O. et al. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb. Genom. 7, 000685 (2021).
Bouras, G., Grigson, S. R., Papudeshi, B., Mallawaarachchi, V. & Roach, M. J. Dnaapler: a tool to reorient circular microbial genomes. J. Open Source Softw. 9, 5968 (2024).
Gilchrist, C. L. M. & Chooi, Y.-H. Clinker & clustermap.js: automatic generation of gene cluster comparison figures. Bioinformatics 37, 2473–2475 (2021).
Marçais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).
Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).
Kirkegaard, R. & Albertsen, M. MicroBench: nanopore data for microbial genomic benchmarking. Zenodo https://doi.org/10.5281/zenodo.18492140 (2026).




