Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly.
Schneider, Valerie A; Graves-Lindsay, Tina; Howe, Kerstin; Bouk, Nathan; Chen, Hsiu-Chuan; Kitts, Paul A; Murphy, Terence D; Pruitt, Kim D; Thibaud-Nissen, Françoise; Albracht, Derek; Fulton, Robert S; Kremitzki, Milinn; Magrini, Vincent; Markovic, Chris; McGrath, Sean; Steinberg, Karyn Meltz; Auger, Kate; Chow, William; Collins, Joanna; Harden, Glenn; Hubbard, Timothy; Pelan, Sarah; Simpson, Jared T; Threadgold, Glen; Torrance, James; Wood, Jonathan M; Clarke, Laura; Koren, Sergey; Boitano, Matthew; Peluso, Paul; Li, Heng; Chin, Chen-Shan; Phillippy, Adam M; Durbin, Richard; Wilson, Richard K; Flicek, Paul; Eichler, Evan E; Church, Deanna M.
Affiliation
  • Schneider VA; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Graves-Lindsay T; McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA.
  • Howe K; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Bouk N; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Chen HC; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Kitts PA; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Murphy TD; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Pruitt KD; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Thibaud-Nissen F; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Albracht D; McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA.
  • Fulton RS; McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA.
  • Kremitzki M; McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA.
  • Magrini V; McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA.
  • Markovic C; McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA.
  • McGrath S; McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA.
  • Steinberg KM; McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA.
  • Auger K; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Chow W; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Collins J; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Harden G; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Hubbard T; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Pelan S; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Simpson JT; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Threadgold G; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Torrance J; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Wood JM; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Clarke L; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.
  • Koren S; National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.
  • Boitano M; Pacific Biosciences, Menlo Park, California 94025, USA.
  • Peluso P; Pacific Biosciences, Menlo Park, California 94025, USA.
  • Li H; Broad Institute, Cambridge, Massachusetts 02142, USA.
  • Chin CS; Pacific Biosciences, Menlo Park, California 94025, USA.
  • Phillippy AM; National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.
  • Durbin R; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
  • Wilson RK; McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA.
  • Flicek P; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.
  • Eichler EE; Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA.
  • Church DM; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA.
Genome Res; 27(5): 849-864, May 2017.
Article in English | MEDLINE | ID: mdl-28396521
ABSTRACT
The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.
Subject(s)

Full text: 1 | Databases: MEDLINE | Main subject: Software / Genome, Human / Sequence Analysis, DNA / Contig Mapping / Genomics | Type of study: Prognostic_studies | Limit: Humans | Language: En | Journal: Genome Res | Journal subject: MOLECULAR BIOLOGY / GENETICS | Year: 2017 | Document type: Article | Country of affiliation: United States
