Pan-African genome demonstrates how population-specific genome graphs improve high-throughput sequencing data analysis.
Tetikol, H Serhat; Turgut, Deniz; Narci, Kubra; Budak, Gungor; Kalay, Ozem; Arslan, Elif; Demirkaya-Budak, Sinem; Dolgoborodov, Alexey; Kabakci-Zorlu, Duygu; Semenyuk, Vladimir; Jain, Amit; Davis-Dusenbery, Brandi N.
Affiliation
  • Tetikol HS; Seven Bridges Genomics, Charlestown, MA, USA. serhat.tetikol@sevenbridges.com.
  • Turgut D; Seven Bridges Genomics, Charlestown, MA, USA.
  • Narci K; Seven Bridges Genomics, Charlestown, MA, USA.
  • Budak G; Seven Bridges Genomics, Charlestown, MA, USA.
  • Kalay O; Seven Bridges Genomics, Charlestown, MA, USA.
  • Arslan E; Seven Bridges Genomics, Charlestown, MA, USA.
  • Demirkaya-Budak S; Seven Bridges Genomics, Charlestown, MA, USA.
  • Dolgoborodov A; Seven Bridges Genomics, Charlestown, MA, USA.
  • Kabakci-Zorlu D; Seven Bridges Genomics, Charlestown, MA, USA.
  • Semenyuk V; Seven Bridges Genomics, Charlestown, MA, USA.
  • Jain A; Seven Bridges Genomics, Charlestown, MA, USA.
  • Davis-Dusenbery BN; Seven Bridges Genomics, Charlestown, MA, USA.
Nat Commun; 13(1): 4384, 2022-08-04.
Article in English | MEDLINE | ID: mdl-35927245
Graph-based genome reference representations have seen significant development, motivated by the inadequacy of the current human genome reference to represent the diverse genetic information from different human populations and its inability to maintain the same level of accuracy for non-European ancestries. While there have been many efforts to develop computationally efficient graph-based toolkits for NGS read alignment and variant calling, methods to curate genomic variants and subsequently construct genome graphs remain an understudied problem that inevitably determines the effectiveness of the overall bioinformatics pipeline. In this study, we discuss obstacles encountered during graph construction and propose methods for sample selection based on population diversity, graph augmentation with structural variants, and resolution of graph reference ambiguity caused by information overload. Moreover, we present the case for iteratively augmenting tailored genome graphs for targeted populations and demonstrate this approach on whole-genome samples of African ancestry. Our results show that population-specific graphs, as more representative alternatives to linear or generic graph references, can achieve significantly lower read mapping errors and enhanced variant calling sensitivity, in addition to providing the improvements of joint variant calling without the need for computationally intensive post-processing steps.

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: High-Throughput Nucleotide Sequencing / Data Analysis Limits: Humans Language: En Journal: Nat Commun Journal subject: BIOLOGY / SCIENCE Year: 2022 Document type: Article Affiliation country: United States Country of publication: United Kingdom
