Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C.
Kronenberg, Zev N; Rhie, Arang; Koren, Sergey; Concepcion, Gregory T; Peluso, Paul; Munson, Katherine M; Porubsky, David; Kuhn, Kristen; Mueller, Kathryn A; Low, Wai Yee; Hiendleder, Stefan; Fedrigo, Olivier; Liachko, Ivan; Hall, Richard J; Phillippy, Adam M; Eichler, Evan E; Williams, John L; Smith, Timothy P L; Jarvis, Erich D; Sullivan, Shawn T; Kingan, Sarah B.
Affiliation
  • Kronenberg ZN; Phase Genomics, Seattle, WA, USA. zkronenberg@pacificbiosciences.com.
  • Rhie A; Pacific Biosciences, Menlo Park, CA, USA. zkronenberg@pacificbiosciences.com.
  • Koren S; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA.
  • Concepcion GT; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA.
  • Peluso P; Pacific Biosciences, Menlo Park, CA, USA.
  • Munson KM; Pacific Biosciences, Menlo Park, CA, USA.
  • Porubsky D; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Kuhn K; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Mueller KA; US Meat Animal Research Center, ARS USDA, Clay Center, NE, USA.
  • Low WY; Phase Genomics, Seattle, WA, USA.
  • Hiendleder S; Davies Research Centre, School of Animal and Veterinary Sciences, The University of Adelaide, Roseworthy, SA, Australia.
  • Fedrigo O; Davies Research Centre, School of Animal and Veterinary Sciences, The University of Adelaide, Roseworthy, SA, Australia.
  • Liachko I; Vertebrate Genomes Laboratory, The Rockefeller University, New York, NY, USA.
  • Hall RJ; Phase Genomics, Seattle, WA, USA.
  • Phillippy AM; Pacific Biosciences, Menlo Park, CA, USA.
  • Eichler EE; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA.
  • Williams JL; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Smith TPL; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
  • Jarvis ED; Davies Research Centre, School of Animal and Veterinary Sciences, The University of Adelaide, Roseworthy, SA, Australia.
  • Sullivan ST; Dipartimento di Scienze Animali, della Nutrizione e degli Alimenti, Università Cattolica del Sacro Cuore, 29122, Piacenza, Italy.
  • Kingan SB; US Meat Animal Research Center, ARS USDA, Clay Center, NE, USA.
Nat Commun; 12(1): 1935, 2021 04 28.
Article in English | MEDLINE | ID: mdl-33911078
Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without parental data, and performance is better in samples with higher heterozygosity. For cow and zebra finch the accuracy is 97%, compared to 80-91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs.
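The idea that Hi-C read pairs carry phasing information can be illustrated with a toy sketch. It is not the published FALCON-Phase algorithm; it only assumes that read pairs link same-haplotype sequence (cis) more often than opposite-haplotype sequence (trans). The function name `extend_phase`, the `A`/`B` haplotig labels, and the link counts are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input: (block_i, hap_i, block_j, hap_j) -> number of Hi-C
# read pairs joining haplotig hap_i of block i to haplotig hap_j of block j.
LinkCounts = Dict[Tuple[int, str, int, str], int]


def extend_phase(n_blocks: int, links: LinkCounts) -> List[bool]:
    """Greedy toy phasing along a chain of phase blocks.

    Returns flips[i], which is True when the A/B labels of block i should be
    swapped so that its "A" haplotig lies on the same haplotype as block 0's "A".
    """
    flips = [False]  # block 0 defines the reference orientation
    for j in range(1, n_blocks):
        i = j - 1
        # Links supporting the current labelling (A with A, B with B).
        cis = links.get((i, "A", j, "A"), 0) + links.get((i, "B", j, "B"), 0)
        # Links supporting the swapped labelling (A with B, B with A).
        trans = links.get((i, "A", j, "B"), 0) + links.get((i, "B", j, "A"), 0)
        swap_relative = trans > cis
        # Compose the relative swap with the orientation already chosen for block i.
        flips.append(flips[i] != swap_relative)
    return flips


# Made-up counts for three adjacent phase blocks.
example_links: LinkCounts = {
    (0, "A", 1, "A"): 2, (0, "A", 1, "B"): 9,
    (0, "B", 1, "A"): 8, (0, "B", 1, "B"): 1,
    (1, "A", 2, "A"): 1, (1, "A", 2, "B"): 7,
    (1, "B", 2, "A"): 6,
}
example_flips = extend_phase(3, example_links)
```

In this example block 1 is swapped relative to block 0, and block 2 is swapped relative to block 1, so block 2 ends up in the same orientation as block 0. A real tool would also have to handle noisy counts, non-adjacent links, and ties, which this sketch ignores.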
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Genome, Human / Sequence Analysis, DNA / Contig Mapping / High-Throughput Nucleotide Sequencing Limits: Animals / Humans Language: En Publication year: 2021 Document type: Article
