Pasa: leveraging population pangenome graph to scaffold prokaryote genome assemblies.
Do, Van Hoan; Nguyen, Son Hoang; Le, Duc Quang; Nguyen, Tam Thi; Nguyen, Canh Hao; Ho, Tho Huu; Vo, Nam S; Nguyen, Trang; Nguyen, Hoang Anh; Cao, Minh Duc.
Affiliation
  • Do VH; Center for Applied Mathematics and Informatics, Le Quy Don Technical University, Hanoi, Vietnam.
  • Nguyen SH; AMROMICS JSC, Nghe An, Vietnam.
  • Le DQ; Faculty of IT, Hanoi University of Civil Engineering, Hanoi, Vietnam.
  • Nguyen TT; Oxford University Clinical Research Unit, Hanoi, Vietnam.
  • Nguyen CH; Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan.
  • Ho TH; Department of Medical Microbiology, The 103 Military Hospital, Vietnam Military Medical University, Hanoi, Vietnam.
  • Vo NS; Department of Genomics & Cytogenetics, Institute of Biomedicine & Pharmacy, Vietnam Military Medical University, Hanoi, Vietnam.
  • Nguyen T; Center for Biomedical Informatics, Vingroup Big Data Institute, Hanoi, Vietnam.
  • Nguyen HA; AMROMICS JSC, Nghe An, Vietnam.
  • Cao MD; AMROMICS JSC, Nghe An, Vietnam.
Nucleic Acids Res ; 52(3): e15, 2024 Feb 09.
Article in En | MEDLINE | ID: mdl-38084888
ABSTRACT
Whole-genome sequencing has increasingly become an essential method for studying the genetic mechanisms of antimicrobial resistance and for surveillance of drug-resistant bacterial pathogens. Most bacterial genomes sequenced to date have been sequenced with Illumina technology, owing to its high throughput, excellent sequence accuracy, and low cost. However, because the technology produces short reads, these assemblies are fragmented into large numbers of contigs, which hinders recovery of the complete genome information. We develop Pasa, a graph-based algorithm that uses the pangenome graph and the assembly graph to improve scaffolding quality. By leveraging population information for the bacterial species, Pasa uses the linkage information of the species' gene families to resolve the contig graph of the assembly. We show that our method outperforms the current state of the art in accuracy and, at the same time, is computationally efficient enough to be applied to a large number of existing draft assemblies.
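To make the idea of gene-family linkage concrete, the following is a minimal illustrative sketch, not Pasa's actual algorithm. It assumes each contig and each reference genome is already annotated as an ordered list of gene-family identifiers, ignores contig orientation and the assembly graph entirely, and uses hypothetical function names (`build_adjacency_counts`, `join_support`, `greedy_scaffold`). It counts how often two gene families are neighbours across a population and uses those counts to chain contigs greedily.

```python
from collections import Counter


def build_adjacency_counts(genomes):
    """Count how often two gene families are adjacent across reference genomes.

    `genomes` is a list of gene-family lists, one per genome (an assumed,
    simplified stand-in for a pangenome graph).
    """
    counts = Counter()
    for genes in genomes:
        for a, b in zip(genes, genes[1:]):
            counts[frozenset((a, b))] += 1  # unordered pair: orientation ignored
    return counts


def join_support(counts, left_contig, right_contig):
    """Population support for placing right_contig directly after left_contig."""
    return counts[frozenset((left_contig[-1], right_contig[0]))]


def greedy_scaffold(contigs, counts, start):
    """Chain contigs greedily, always following the best-supported join.

    Stops when no remaining contig has any support as the next neighbour.
    """
    order = [start]
    remaining = set(contigs) - {start}
    while remaining:
        last = contigs[order[-1]]
        # sorted() makes tie-breaking deterministic
        best = max(sorted(remaining),
                   key=lambda name: join_support(counts, last, contigs[name]))
        if join_support(counts, last, contigs[best]) == 0:
            break
        order.append(best)
        remaining.remove(best)
    return order


# Toy population: three genomes described by their gene-family order.
genomes = [
    ["g1", "g2", "g3", "g4", "g5", "g6"],
    ["g1", "g2", "g3", "g4", "g5", "g6"],
    ["g1", "g2", "g5", "g6"],
]
# Toy draft assembly: three contigs with their annotated gene families.
contigs = {"c1": ["g1", "g2"], "c2": ["g3", "g4"], "c3": ["g5", "g6"]}

counts = build_adjacency_counts(genomes)
scaffold = greedy_scaffold(contigs, counts, start="c1")
```

In this toy input, the join from `c1` to `c2` is supported by two genomes and the join from `c1` to `c3` by one, so the greedy chain prefers `c2` first. A real scaffolder would also need to handle contig orientation, repeats, and conflicts with the assembly graph, none of which this sketch attempts.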
Subjects

Full text: 1 Database: MEDLINE Main subject: Bacteria / Algorithms / Genome, Bacterial Language: En Publication year: 2024 Document type: Article