Structural polymorphism and diversity of human segmental duplications.
Jeong, Hyeonsoo; Dishuck, Philip C; Yoo, DongAhn; Harvey, William T; Munson, Katherine M; Lewis, Alexandra P; Kordosky, Jennifer; Garcia, Gage H; Yilmaz, Feyza; Hallast, Pille; Lee, Charles; Pastinen, Tomi; Eichler, Evan E.
Affiliation
  • Jeong H; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Dishuck PC; Altos Labs, San Diego, CA, USA.
  • Yoo D; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Harvey WT; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Munson KM; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Lewis AP; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Kordosky J; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Garcia GH; Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
  • Hallast P; The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
  • Lee C; The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
  • Pastinen T; The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
  • Eichler EE; Children's Mercy Hospital and University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA.
bioRxiv; 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38895457
ABSTRACT
Segmental duplications (SDs) contribute significantly to human disease, evolution, and diversity, yet have been difficult to resolve at the sequence level. We present a population genetics survey of SDs by analyzing 170 human genome assemblies in which the majority of SDs are fully resolved using long-read sequence assembly. Excluding the acrocentric short arms, we identify 173.2 Mbp of duplicated sequence (47.4 Mbp not present in the telomere-to-telomere reference), distinguishing fixed from structurally polymorphic events. We find that intrachromosomal SDs are among the most variable, with rare events mapping near their progenitor sequences. African genomes harbor significantly more intrachromosomal SDs and are more likely to have recently duplicated gene families with higher copy number when compared to non-African samples. A comparison to a resource of 563 million full-length Iso-Seq reads identifies 201 novel, potentially protein-coding genes corresponding to these copy number polymorphic SDs.