Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes.
Ebler, Jana; Ebert, Peter; Clarke, Wayne E; Rausch, Tobias; Audano, Peter A; Houwaart, Torsten; Mao, Yafei; Korbel, Jan O; Eichler, Evan E; Zody, Michael C; Dilthey, Alexander T; Marschall, Tobias.
Affiliation
  • Ebler J; Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
  • Ebert P; Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
  • Clarke WE; New York Genome Center, New York, NY, USA.
  • Rausch T; European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany.
  • Audano PA; European Molecular Biology Laboratory, GeneCore, Heidelberg, Germany.
  • Houwaart T; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Mao Y; Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
  • Korbel JO; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Eichler EE; European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany.
  • Zody MC; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Dilthey AT; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
  • Marschall T; New York Genome Center, New York, NY, USA.
Nat Genet; 54(4): 518-525, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35410384
ABSTRACT
Typical genotyping workflows map reads to a reference genome before identifying genetic variants. Generating such alignments introduces reference biases and comes with substantial computational burden. Furthermore, short-read lengths limit the ability to characterize repetitive genomic regions, which are particularly challenging for fast k-mer-based genotypers. In the present study, we propose a new algorithm, PanGenie, that leverages a haplotype-resolved pangenome reference together with k-mer counts from short-read sequencing data to genotype a wide spectrum of genetic variation, a process we refer to as genome inference. Compared with mapping-based approaches, PanGenie is more than 4 times faster at 30-fold coverage and achieves better genotype concordances for almost all variant types and coverages tested. Improvements are especially pronounced for large insertions (≥50 bp) and variants in repetitive regions, enabling the inclusion of these classes of variants in genome-wide association studies. PanGenie efficiently leverages the increasing amount of haplotype-resolved assemblies to unravel the functional impact of previously inaccessible variants while being faster compared with alignment-based workflows.
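The alignment-free idea in the abstract, counting k-mers in short reads and checking which alleles they support, can be illustrated with a toy sketch. This is not PanGenie's algorithm: the abstract does not describe its inference model, and the sketch below ignores the haplotype panel, sequencing errors, coverage and reverse complements. The function names (`count_kmers`, `unique_kmers`, `toy_genotype`) and the presence/absence decision rule are assumptions made purely for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def unique_kmers(allele_seq, other_seq, k):
    """Return k-mers found in allele_seq but not in other_seq."""
    own = {allele_seq[i:i + k] for i in range(len(allele_seq) - k + 1)}
    other = {other_seq[i:i + k] for i in range(len(other_seq) - k + 1)}
    return own - other


def toy_genotype(reads, ref_allele, alt_allele, k):
    """Hypothetical rule: call a genotype from allele-specific k-mer support.

    Any read support for an allele's unique k-mers counts as evidence for
    that allele; a real genotyper would model counts probabilistically.
    """
    counts = count_kmers(reads, k)
    ref_support = sum(counts[km] for km in unique_kmers(ref_allele, alt_allele, k))
    alt_support = sum(counts[km] for km in unique_kmers(alt_allele, ref_allele, k))
    if ref_support and alt_support:
        return "0/1"  # evidence for both alleles: heterozygous
    if alt_support:
        return "1/1"  # only the alternative allele is supported
    if ref_support:
        return "0/0"  # only the reference allele is supported
    return "./."      # no informative k-mers observed


if __name__ == "__main__":
    ref = "ACGTACGTAC"  # made-up reference allele with flanking sequence
    alt = "ACGTTCGTAC"  # made-up alternative allele (one substitution)
    print(toy_genotype([ref, alt], ref, alt, k=5))
```

Because no read is aligned to a reference, the cost of this step is driven by k-mer counting, which fits the speed argument made in the abstract; the accuracy claims, however, depend on the pangenome-based model that this sketch leaves out.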
Full text: 1 Collections: 01-international Database: MEDLINE Limit: Humans Language: English Publication year: 2022 Document type: Article
