Improved sequence mapping using a complete reference genome and lift-over.
Chen, Nae-Chyun; Paulin, Luis F; Sedlazeck, Fritz J; Koren, Sergey; Phillippy, Adam M; Langmead, Ben.
  • Chen NC; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA. cnaechy1@jhu.edu.
  • Paulin LF; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
  • Sedlazeck FJ; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA; Department of Computer Science, Rice University, Houston, TX, USA.
  • Koren S; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
  • Phillippy AM; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
  • Langmead B; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
Nat Methods; 21(1): 41-49, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38036856
ABSTRACT
Complete, telomere-to-telomere (T2T) genome assemblies promise improved analyses and the discovery of new variants, but many essential genomic resources remain associated with older reference genomes. Thus, there is a need to translate genomic features and read alignments between references. Here we describe a method called levioSAM2 that performs fast and accurate lift-over between assemblies using a whole-genome map. In addition to enabling the use of several references, we demonstrate that aligning reads to a high-quality reference (for example, T2T-CHM13) and lifting to an older reference (for example, Genome Reference Consortium (GRC)h38) improves the accuracy of the resulting variant calls on the old reference. By leveraging the quality improvements of T2T-CHM13, levioSAM2 reduces small and structural variant calling errors compared with GRC-based mapping using real short- and long-read datasets. Performance is especially improved for a set of complex medically relevant genes, where the GRC references are of lower quality.
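As a rough illustration of what coordinate lift-over through a whole-genome map involves, the following minimal Python sketch translates a position from one assembly to another using a list of aligned blocks. It is not levioSAM2's implementation: the `Block` and `LiftMap` names, the block coordinates, and the contig names are all hypothetical, and the sketch assumes non-overlapping, gap-free blocks with 0-based, half-open source intervals.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Block:
    """One gap-free aligned block between two assemblies (hypothetical type)."""
    src_start: int         # 0-based, inclusive, on the source contig
    src_end: int           # exclusive
    dst_contig: str        # contig name in the destination assembly
    dst_start: int         # 0-based start of the block in the destination
    reverse: bool = False  # True if the block is inverted in the destination


class LiftMap:
    """Toy whole-genome map: per-contig blocks sorted by source start."""

    def __init__(self, blocks_by_contig: Dict[str, List[Block]]) -> None:
        self._blocks = {
            contig: sorted(blocks, key=lambda b: b.src_start)
            for contig, blocks in blocks_by_contig.items()
        }
        self._starts = {
            contig: [b.src_start for b in blocks]
            for contig, blocks in self._blocks.items()
        }

    def lift(self, contig: str, pos: int) -> Optional[Tuple[str, int]]:
        """Return (contig, position) in the destination, or None if unmapped."""
        starts = self._starts.get(contig)
        if not starts:
            return None
        # Find the last block whose start is <= pos.
        i = bisect_right(starts, pos) - 1
        if i < 0:
            return None
        block = self._blocks[contig][i]
        if pos >= block.src_end:
            return None  # pos falls in a gap between blocks
        offset = pos - block.src_start
        if block.reverse:
            length = block.src_end - block.src_start
            return block.dst_contig, block.dst_start + (length - 1 - offset)
        return block.dst_contig, block.dst_start + offset


# Made-up blocks for illustration only; real maps come from assembly alignments.
toy_map = LiftMap({
    "chr1": [
        Block(0, 100, "chr1", 50),
        Block(150, 200, "chr1", 150, reverse=True),
    ]
})
```

With this toy map, `toy_map.lift("chr1", 10)` falls inside the first block and is shifted by its offset, whereas a position in the gap between the two blocks returns `None`. Lifting a full read alignment involves more than a single coordinate (for example, strand and alignment operations), which this sketch does not attempt.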
Subject(s)

Full text: 1 Database: MEDLINE Main subject: Genome / Genomics Language: En Year: 2024 Document type: Article