HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies.
Edge, Peter; Bafna, Vineet; Bansal, Vikas.
Affiliation
  • Edge P; Department of Computer Science & Engineering, University of California, San Diego, La Jolla, California 92053, USA.
  • Bafna V; Department of Computer Science & Engineering, University of California, San Diego, La Jolla, California 92053, USA.
  • Bansal V; Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, California 92053, USA.
Genome Res; 27(5): 801-812, 2017 May.
Article in English | MEDLINE | ID: mdl-27940952
Many tools have been developed for haplotype assembly, the reconstruction of individual haplotypes using reads mapped to a reference genome sequence. Due to increasing interest in obtaining haplotype-resolved human genomes, a range of new sequencing protocols and technologies have been developed to enable the reconstruction of whole-genome haplotypes. However, existing computational methods designed to handle specific technologies do not scale well on data from different protocols. We describe a new algorithm, HapCUT2, that extends our previous method (HapCUT) to handle multiple sequencing technologies. Using simulations and whole-genome sequencing (WGS) data from multiple different data types (dilution pool sequencing, linked-read sequencing, single-molecule real-time (SMRT) sequencing, and proximity ligation (Hi-C) sequencing), we show that HapCUT2 rapidly assembles haplotypes with best-in-class accuracy for all data types. In particular, HapCUT2 scales well for high sequencing coverage and rapidly assembled haplotypes for two long-read WGS data sets on which other methods struggled. Further, HapCUT2 directly models Hi-C-specific error modalities, resulting in significant improvements in error rates compared to HapCUT, the only other method that could assemble haplotypes from Hi-C data. Using HapCUT2, haplotype assembly from a 90× coverage whole-genome Hi-C data set yielded high-resolution haplotypes (78.6% of variants phased in a single block) with high pairwise phasing accuracy (∼98% across chromosomes). Our results demonstrate that HapCUT2 is a robust tool for haplotype assembly applicable to data from diverse sequencing technologies.

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Haplotypes / Software / Sequence Analysis, DNA / Contig Mapping / Genomics | Type of study: Prognostic_studies | Limits: Humans | Language: En | Journal: Genome Res | Journal subject: MOLECULAR BIOLOGY / GENETICS | Year: 2017 | Document type: Article | Country of affiliation: United States
