Exploring high-quality microbial genomes by assembling short-reads with long-range connectivity.
Zhang, Zhenmiao; Xiao, Jin; Wang, Hongbo; Yang, Chao; Huang, Yufen; Yue, Zhen; Chen, Yang; Han, Lijuan; Yin, Kejing; Lyu, Aiping; Fang, Xiaodong; Zhang, Lu.
  • Zhang Z; Department of Computer Science, Hong Kong Baptist University, Hong Kong, China.
  • Xiao J; Department of Computer Science, Hong Kong Baptist University, Hong Kong, China.
  • Wang H; Department of Computer Science, Hong Kong Baptist University, Hong Kong, China.
  • Yang C; Department of Computer Science, Hong Kong Baptist University, Hong Kong, China.
  • Huang Y; BGI Research, Shenzhen, 518083, China.
  • Yue Z; BGI Research, Sanya, 572025, China.
  • Chen Y; State Key Laboratory of Dampness Syndrome of Chinese Medicine, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China.
  • Han L; Department of Scientific Research, Kangmeihuada GeneTech Co., Ltd (KMHD), Shenzhen, China.
  • Yin K; Department of Computer Science, Hong Kong Baptist University, Hong Kong, China.
  • Lyu A; Institute for Research and Continuing Education, Hong Kong Baptist University, Shenzhen, China.
  • Fang X; School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, China.
  • Zhang L; BGI Research, Shenzhen, 518083, China.
Nat Commun ; 15(1): 4631, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38821971
ABSTRACT
Although long-read sequencing enables the generation of complete genomes for unculturable microbes, its high cost limits its widespread adoption in large-scale metagenomic studies. An alternative is to assemble short-reads with long-range connectivity, which can be a cost-effective way to generate high-quality microbial genomes. Here, we develop Pangaea, a bioinformatic approach designed to enhance metagenome assembly using short-reads with long-range connectivity. Pangaea leverages connectivity derived from the physical barcodes of linked-reads or from virtual barcodes obtained by aligning short-reads to long-reads. Pangaea uses a deep learning-based read binning algorithm to assemble co-barcoded reads exhibiting similar sequence contexts and abundances, thereby improving the assembly of high- and medium-abundance microbial genomes. Pangaea also applies a multi-thresholding strategy to refine the assembly of low-abundance microbes. We benchmark Pangaea on linked-reads and on a combination of short- and long-reads from simulation data, mock communities and human gut metagenomes. Pangaea achieves significantly higher contig continuity and more near-complete metagenome-assembled genomes (NCMAGs) than existing assemblers. Pangaea also generates three complete and circular NCMAGs from the human gut microbiomes.
Subject(s)

Full text: 1 Database: MEDLINE Main subject: Algorithms / Metagenome / Metagenomics / Genome, Microbial / Gastrointestinal Microbiome Limits: Humans Language: English Year: 2024 Document type: Article
