ABSTRACT
De novo copy number variants (dnCNVs) arising at multiple loci in a personal genome have usually been considered to reflect cancer somatic genomic instabilities. We describe a multiple dnCNV (MdnCNV) phenomenon in which individuals with genomic disorders carry five to ten constitutional dnCNVs. These CNVs originate from independent formation incidences, are predominantly tandem duplications or complex gains, exhibit breakpoint junction features reminiscent of replicative repair, and show increased de novo point mutations flanking the rearrangement junctions. The active CNV mutation shower appears to be restricted to a transient perizygotic period. We propose that a defect in the CNV formation process is responsible for the "CNV-mutator state," and this state is dampened after early embryogenesis. The constitutional MdnCNV phenomenon resembles chromosomal instability in various cancers. Investigations of this phenomenon may provide unique access to understanding genomic disorders, structural variant mutagenesis, human evolution, and cancer biology.
Subjects
Chromosome Aberrations; DNA Copy Number Variations; Genetic Diseases, Inborn/embryology; Genetic Diseases, Inborn/genetics; Genomic Instability; Mutation; Chromosome Breakpoints; Chromosome Duplication; DNA Replication; Embryonic Development; Female; Gametogenesis; Humans; Male
ABSTRACT
Detection of structural variants (SVs) is currently biased toward those that alter copy number. The relative contribution of inversions toward genetic disease is unclear. In this study, we analyzed genome sequencing data for 33,924 families with rare disease from the 100,000 Genomes Project. From a database hosting >500 million SVs, we focused on 351 genes where haploinsufficiency is a confirmed disease mechanism and identified 47 ultra-rare rearrangements that included an inversion (24 bp to 36.4 Mb, 20/47 de novo). Validation utilized a number of orthogonal approaches, including retrospective exome analysis. RNA-seq data supported the respective diagnoses for six participants. Phenotypic blending was apparent in four probands. Diagnostic odysseys were a common theme (>50 years for one individual), and targeted analysis for the specific gene had already been performed for 30% of these individuals but with no findings. We provide formal confirmation of a European founder origin for an intragenic MSH2 inversion. For two individuals with complex SVs involving the MECP2 mutational hotspot, ambiguous SV structures were resolved using long-read sequencing, influencing clinical interpretation. A de novo inversion of HOXD11-13 was uncovered in a family with Kantaputra-type mesomelic dysplasia. Lastly, a complex translocation disrupting APC and involving nine rearranged segments confirmed a clinical diagnosis for three family members and resolved a conundrum for a sibling with a single polyp. Overall, inversions play a small but notable role in rare disease, likely explaining the etiology in around 1/750 families across heterogeneous clinical cohorts.
Subjects
Chromosome Inversion; Rare Diseases; Humans; Rare Diseases/genetics; Male; Female; Chromosome Inversion/genetics; Pedigree; Genome, Human; Whole Genome Sequencing; Methyl-CpG-Binding Protein 2/genetics; Mutation; Homeodomain Proteins/genetics; Middle Aged
ABSTRACT
BACKGROUND: Pediatric disorders include a range of highly penetrant, genetically heterogeneous conditions amenable to genomewide diagnostic approaches. Finding a molecular diagnosis is challenging but can have profound lifelong benefits. METHODS: We conducted a large-scale sequencing study involving more than 13,500 families with probands with severe, probably monogenic, difficult-to-diagnose developmental disorders from 24 regional genetics services in the United Kingdom and Ireland. Standardized phenotypic data were collected, and exome sequencing and microarray analyses were performed to investigate novel genetic causes. We developed an iterative variant analysis pipeline and reported candidate variants to clinical teams for validation and diagnostic interpretation to inform communication with families. Multiple regression analyses were performed to evaluate factors affecting the probability of diagnosis. RESULTS: A total of 13,449 probands were included in the analyses. On average, we reported 1.0 candidate variant per parent-offspring trio and 2.5 variants per singleton proband. Using clinical and computational approaches to variant classification, we made a diagnosis in approximately 41% of probands (5502 of 13,449). Of 3599 probands in trios who received a diagnosis by clinical assertion, approximately 76% had a pathogenic de novo variant. Another 22% of probands (2997 of 13,449) had variants of uncertain significance in genes that were strongly linked to monogenic developmental disorders. Recruitment in a parent-offspring trio had the largest effect on the probability of diagnosis (odds ratio, 4.70; 95% confidence interval [CI], 4.16 to 5.31). Probands were less likely to receive a diagnosis if they were born extremely prematurely (i.e., 22 to 27 weeks' gestation; odds ratio, 0.39; 95% CI, 0.22 to 0.68), had in utero exposure to antiepileptic medications (odds ratio, 0.44; 95% CI, 0.29 to 0.67), had mothers with diabetes (odds ratio, 0.52; 95% CI, 0.41 to 0.67), or were of African ancestry (odds ratio, 0.51; 95% CI, 0.31 to 0.78). CONCLUSIONS: Among probands with severe, probably monogenic, difficult-to-diagnose developmental disorders, multimodal analysis of genomewide data had good diagnostic power, even after previous attempts at diagnosis. (Funded by the Health Innovation Challenge Fund and Wellcome Sanger Institute.)
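For readers who want to check the arithmetic behind the reported figures, the following is a minimal Python sketch. The two proportions use the counts given in the abstract. The odds-ratio helper is only an illustration of an unadjusted 2×2 calculation with a Wald interval; the study itself reports odds ratios from multiple regression, and the function name `odds_ratio_ci` and the example counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed and diagnosed      b: exposed and undiagnosed
    c: unexposed and diagnosed    d: unexposed and undiagnosed
    All four counts must be positive. z=1.96 gives a 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Proportions recomputed from the counts reported in the abstract.
diagnosed_fraction = 5502 / 13449   # reported as approximately 41%
vus_fraction = 2997 / 13449         # reported as 22%

# Hypothetical 2x2 counts, used only to exercise the helper.
example_or, example_lo, example_hi = odds_ratio_ci(20, 10, 10, 20)
```

An adjusted odds ratio from a regression model will generally differ from this unadjusted value, so the helper should not be expected to reproduce the published estimates.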
Subjects
Genomics; Rare Diseases; Child; Humans; Exome; Ireland/epidemiology; United Kingdom/epidemiology; Rare Diseases/diagnosis; Rare Diseases/epidemiology; Rare Diseases/genetics; Oligonucleotide Array Sequence Analysis; Genetic Association Studies; Neurodevelopmental Disorders/diagnosis; Neurodevelopmental Disorders/genetics; Congenital Abnormalities/diagnosis; Congenital Abnormalities/genetics; Growth Disorders/diagnosis; Growth Disorders/genetics; Facies; Child Behavior Disorders/diagnosis; Child Behavior Disorders/genetics; Genetic Diseases, Inborn/diagnosis; Genetic Diseases, Inborn/genetics
ABSTRACT
Structural variation (SV) describes a broad class of genetic variation greater than 50 bp in size. SVs can cause a wide range of genetic diseases and are prevalent in rare developmental disorders (DDs). Individuals presenting with DDs are often referred for diagnostic testing with chromosomal microarrays (CMAs) to identify large copy-number variants (CNVs) and/or with single-gene, gene-panel, or exome sequencing (ES) to identify single-nucleotide variants, small insertions/deletions, and CNVs. However, individuals with pathogenic SVs undetectable by conventional analysis often remain undiagnosed. Consequently, we have developed the tool InDelible, which interrogates short-read sequencing data for split-read clusters characteristic of SV breakpoints. We applied InDelible to 13,438 probands with severe DDs recruited as part of the Deciphering Developmental Disorders (DDD) study and discovered 63 rare, damaging variants in genes previously associated with DDs missed by standard SNV, indel, or CNV discovery approaches. Clinical review of these 63 variants determined that about half (30/63) were plausibly pathogenic. InDelible was particularly effective at ascertaining variants between 21 and 500 bp in size and increased the total number of potentially pathogenic variants identified by DDD in this size range by 42.9%. Of particular interest were seven confirmed de novo variants in MECP2, which represent 35.0% of all de novo protein-truncating variants in MECP2 among DDD study participants. InDelible provides a framework for the discovery of pathogenic SVs that are most likely missed by standard analytical workflows and has the potential to improve the diagnostic yield of ES across a broad range of genetic diseases.
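The idea of looking for split-read clusters at SV breakpoints can be sketched in a few lines of Python. This is not InDelible's implementation: it is a simplified illustration that assumes reads are given as (SAM POS, CIGAR) pairs, ignores hard clips, and uses hypothetical function names and thresholds (`min_clip`, `min_reads`).

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def clip_breakpoints(pos, cigar, min_clip=5):
    """Return reference coordinates at which a read is soft-clipped.

    pos is the 1-based leftmost aligned position (SAM POS).
    Soft clips shorter than min_clip bases are ignored.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    points = []
    # Leading soft clip: the breakpoint sits at the first aligned base.
    if ops and ops[0][1] == "S" and ops[0][0] >= min_clip:
        points.append(pos)
    # Trailing soft clip: the breakpoint sits just past the last aligned base.
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        points.append(pos + ref_len)
    return points


def candidate_breakpoints(reads, min_reads=3, min_clip=5):
    """Positions supported by at least min_reads soft-clipped reads."""
    counts = Counter()
    for pos, cigar in reads:
        counts.update(clip_breakpoints(pos, cigar, min_clip))
    return sorted(p for p, n in counts.items() if n >= min_reads)


# Hypothetical reads: four share a clip boundary at reference position 160.
reads = [
    (100, "60M40S"),
    (120, "40M60S"),
    (140, "20M80S"),
    (160, "30S70M"),
    (300, "100M"),    # fully aligned, no clip
    (310, "3S97M"),   # clip too short to count
]
clusters = candidate_breakpoints(reads)
```

A production tool would also need to handle mapping quality, strand, nearby but non-identical clip positions, and realignment of the clipped sequence, none of which is attempted here.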
Subjects
Developmental Disabilities/diagnosis; Developmental Disabilities/genetics; Exome Sequencing/methods; Child; Female; Humans; Male; Methyl-CpG-Binding Protein 2/genetics
ABSTRACT
We investigated complex genomic rearrangements (CGRs) consisting of triplication copy-number variants (CNVs) that were accompanied by extended regions of copy-number-neutral absence of heterozygosity (AOH) in subjects with multiple congenital abnormalities. Molecular analyses provided observational evidence that, in humans, post-zygotically generated CGRs can lead to regional uniparental disomy (UPD) through template switches between homologs rather than sister chromatids, with microhomology used to prime DNA replication. This is a prediction of the replicative repair model MMBIR (microhomology-mediated break-induced replication). Our findings suggest that replication-based mechanisms might underlie the formation of diverse types of genomic alterations (CGRs and AOH) implicated in constitutional disorders.
Subjects
DNA Copy Number Variations/genetics; DNA Repair/genetics; DNA Replication/genetics; Gene Rearrangement/genetics; Loss of Heterozygosity/genetics; Models, Genetic; Uniparental Disomy/genetics; Base Sequence; Humans; In Situ Hybridization, Fluorescence; Molecular Sequence Data; Netherlands; Polymerase Chain Reaction; Polymorphism, Single Nucleotide/genetics; Sequence Analysis, DNA
ABSTRACT
Nonallelic homologous recombination (NAHR) between highly similar duplicated sequences generates chromosomal deletions, duplications and inversions, which can cause diverse genetic disorders. Little is known about interindividual variation in NAHR rates and the factors that influence it. We estimated the rate of deletion at the CMT1A-REP NAHR hotspot in sperm DNA from 34 male donors, including 16 monozygotic (MZ) co-twins (8 twin pairs), aged 24 to 67 years. The average NAHR rate was 3.5 × 10⁻⁵, with a seven-fold variation across individuals. Despite good statistical power to detect even a subtle correlation, we observed no relationship between age of unrelated individuals and the rate of NAHR in their sperm, likely reflecting the meiotic-specific origin of these events. We then estimated the heritability of deletion rate by calculating the intraclass correlation (ICC) within MZ co-twins, revealing a significant correlation between MZ co-twins (ICC = 0.784, p = 0.0039), with MZ co-twins being significantly more correlated than unrelated pairs. We showed that this heritability cannot be explained by variation in PRDM9, a known regulator of NAHR, or variation within the NAHR hotspot itself. We also did not detect any correlation between body mass index (BMI), smoking status or alcohol intake and rate of NAHR. Our results suggest that other, as yet unidentified, genetic or environmental factors play a significant role in the regulation of NAHR and are responsible for the extensive variation in the population for the probability of fathering a child with a genomic disorder resulting from a pathogenic deletion.
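As an illustration of the intraclass correlation used above, here is a minimal Python sketch of a one-way random-effects ICC for paired measurements. The abstract does not state which ICC estimator the authors used, so this choice is an assumption, and the function name `icc_oneway` and the example rates are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for a list of (twin_a, twin_b) values.

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), with k = 2 measurements per pair.
    Requires at least two pairs and some variation in the data.
    """
    n = len(pairs)
    k = 2
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    pair_means = [(a + b) / 2 for a, b in pairs]
    # Between-pair mean square: how much pair means differ from the grand mean.
    msb = k * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    # Within-pair mean square: how much co-twins differ from their pair mean.
    msw = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, pair_means)
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical deletion rates (in units of 1e-5) for four twin pairs.
example_pairs = [(1.2, 1.4), (3.0, 3.3), (5.1, 4.6), (7.9, 8.4)]
example_icc = icc_oneway(example_pairs)
```

Values near 1 indicate that co-twins resemble each other far more than members of different pairs do; values near 0 or below indicate no such resemblance.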
Subjects
Homologous Recombination/genetics; Neurofibromatosis 1/genetics; Twins, Monozygotic/genetics; Adult; Aged; Alleles; Chromosome Deletion; Gene Duplication; Humans; INDEL Mutation/genetics; Male; Middle Aged; Sequence Deletion/genetics; Spermatozoa
ABSTRACT
Autosomal recessive coding variants are well-known causes of rare disorders. We quantified the contribution of these variants to developmental disorders in a large, ancestrally diverse cohort comprising 29,745 trios, of whom 20.4% had genetically inferred non-European ancestries. The estimated fraction of patients attributable to exome-wide autosomal recessive coding variants ranged from approximately 2% to 19% across genetically inferred ancestry groups and was significantly correlated with average autozygosity. Established autosomal recessive developmental disorder-associated (ARDD) genes explained 84.0% of the total autosomal recessive coding burden, and 34.4% of the burden in these established genes was explained by variants not already reported as pathogenic in ClinVar. Statistical analyses identified two novel ARDD genes: KBTBD2 and ZDHHC16. This study expands our understanding of the genetic architecture of developmental disorders across diverse genetically inferred ancestry groups and suggests that improving strategies for interpreting missense variants in known ARDD genes may help diagnose more patients than discovering the remaining genes.
ABSTRACT
Whole genome sequencing (WGS) studies have estimated the human germline mutation rate per base pair per generation (~1.2 × 10⁻⁸) to be higher than in mice (3.5–5.4 × 10⁻⁹). In humans, most germline mutations are paternal in origin and numbers of mutations per offspring increase with paternal and maternal age. Here we estimate germline mutation rates and spectra in six multi-sibling mouse pedigrees and compare to three multi-sibling human pedigrees. In both species we observe a paternal mutation bias, a parental age effect, and a highly mutagenic first cell division contributing to the embryo. We also observe differences between species in mutation spectra, in mutation rates per cell division, and in the parental bias of mutations in early embryogenesis. These differences between species likely result from both species-specific differences in cellular genealogies of the germline, as well as biological differences within the same stage of embryogenesis or gametogenesis.
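A per-base-pair, per-generation rate of this kind is commonly obtained by dividing the number of de novo mutations in an offspring by twice the callable haploid genome size. The Python sketch below shows that arithmetic only; the mutation count and callable genome size are hypothetical round numbers, not values from the study, and the function name is illustrative.

```python
def per_bp_mutation_rate(n_dnms, callable_haploid_bp):
    """De novo mutations per base pair per generation.

    Each offspring inherits two haploid genomes (one per parent),
    hence the factor of 2 in the denominator.
    """
    return n_dnms / (2 * callable_haploid_bp)


# Hypothetical inputs: 70 de novo mutations over 2.9 Gb of callable sequence.
rate = per_bp_mutation_rate(n_dnms=70, callable_haploid_bp=2.9e9)
formatted = f"{rate:.2e}"
```

In practice the denominator is restricted to the portion of the genome where mutations could have been detected, and the count is corrected for false negatives, so real estimates involve more than this single division.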
Subjects
Germ Cells/metabolism; Germ-Line Mutation; Mutation Rate; Whole Genome Sequencing/methods; Animals; Cell Division/genetics; Embryonic Development/genetics; Female; Gametogenesis/genetics; Germ Cells/cytology; Humans; Male; Maternal Age; Mice; Paternal Age; Pedigree; Species Specificity
ABSTRACT
Understanding the effects of environmental exposures on germline mutation rates has been a decades-long pursuit in genetics. We used next-generation sequencing and comparative genomic hybridization arrays to investigate genome-wide mutations in the offspring of male mice exposed to benzo(a)pyrene (BaP), a common environmental pollutant. We demonstrate that offspring developing from sperm exposed during the mitotic or post-mitotic phases of spermatogenesis have significantly more de novo single nucleotide variants (1.8-fold; P < 0.01) than controls. Both phases of spermatogenesis are susceptible to the induction of heritable mutations, although mutations arising from post-fertilization events are more common after post-mitotic exposure. In addition, the mutation spectra in sperm and offspring of BaP-exposed males are consistent. Finally, we report a significant increase in transmitted copy number duplications (P = 0.001) in BaP-exposed sires. Our study demonstrates that germ cell mutagen exposures induce genome-wide mutations in the offspring that may be associated with adverse health outcomes.
Subjects
Benzo(a)pyrene/adverse effects; Environmental Pollutants/adverse effects; Mutagens/adverse effects; Mutation; Paternal Exposure; Spermatozoa/drug effects; Animals; DNA Copy Number Variations; Environmental Exposure; Female; Male; Mice, Inbred C57BL; Mitosis/drug effects; Mitosis/genetics; Spermatogenesis/drug effects; Spermatogenesis/genetics
ABSTRACT
Haplotypic sequences contain significantly more information than genotypes of genetic markers and are critical for studying disease association and genome evolution. Current methods for obtaining haplotypic sequences require the physical separation of alleles before sequencing, are time-consuming and are not scalable for large surveys of genetic variation. We have developed a novel method for acquiring haplotypic sequences from long PCR products using simple, high-throughput techniques. This method applies modified shotgun sequencing protocols to sequence both alleles concurrently, with read-pair information allowing the two alleles to be separated during sequence assembly. Although the haplotypic sequences can be assembled manually from the resultant data using pre-existing sequence assembly software, we have devised a novel heuristic algorithm to automate assembly and remove human error. We validated the approach on two long PCR products amplified from the human genome and confirmed the accuracy of our sequences against full-length clones of the same alleles. This method presents a simple high-throughput means to obtain full haplotypic sequences potentially up to 20 kb in length and is suitable for surveying genetic variation even in poorly characterized genomes, as it requires no prior information on sequence variation.
Subjects
Alleles; Genetic Carrier Screening/methods; Genetic Variation; Sequence Analysis, DNA/methods; Algorithms; Base Sequence; Haplotypes; Humans; Male; Molecular Sequence Data; Polymerase Chain Reaction
ABSTRACT
Germline mutations are a driving force behind genome evolution and genetic disease. We investigated genome-wide mutation rates and spectra in multi-sibling families. The mutation rate increased with paternal age in all families, but the number of additional mutations per year differed by more than twofold between families. Meta-analysis of 6,570 mutations showed that germline methylation influences mutation rates. In contrast to somatic mutations, we found remarkable consistency in germline mutation spectra between the sexes and at different paternal ages. In the parental germline, 3.8% of mutations were mosaic, resulting in 1.3% of mutations being shared by siblings. The number of these shared mutations varied significantly between families. Our data suggest that the mutation rate per cell division is higher during both early embryogenesis and differentiation of primordial germ cells but is reduced substantially during post-pubertal spermatogenesis. These findings have important consequences for the recurrence risks of disorders caused by de novo mutations.
Subjects
Germ-Line Mutation; CpG Islands; Female; Humans; Male; Mosaicism; Paternal Age; Pedigree
ABSTRACT
The ability to predict the genetic consequences of human exposure to ionizing radiation has been a long-standing goal of human genetics over the past 50 years. Here we present the results of an unbiased, comprehensive genome-wide survey of the range of germline mutations induced in laboratory mice after parental exposure to ionizing radiation and show that irradiation markedly alters the frequency and spectrum of de novo mutations. The frequency of de novo copy number variants (CNVs) and insertion/deletion events (indels) is significantly elevated in offspring of exposed fathers. We also show that the spectrum of induced de novo single-nucleotide variants (SNVs) is strikingly different, with clustered mutations being significantly over-represented in the offspring of irradiated males. Our study highlights the specific classes of radiation-induced DNA lesions that evade repair and result in germline mutation and paves the way for similarly comprehensive characterizations of other germline mutagens.
Subjects
DNA Copy Number Variations/radiation effects; DNA/radiation effects; Genome/radiation effects; Germ Cells/radiation effects; Germ-Line Mutation/radiation effects; Radiation, Ionizing; Animals; Female; Genome/genetics; Germ-Line Mutation/genetics; Male; Mice; Oligonucleotide Array Sequence Analysis; Sequence Analysis, DNA; Spermatogenesis
ABSTRACT
J.B.S. Haldane proposed in 1947 that the male germline may be more mutagenic than the female germline. Diverse studies have supported Haldane's contention of a higher average mutation rate in the male germline in a variety of mammals, including humans. Here we present, to our knowledge, the first direct comparative analysis of male and female germline mutation rates from the complete genome sequences of two parent-offspring trios. Through extensive validation, we identified 49 and 35 germline de novo mutations (DNMs) in two trio offspring, as well as 1,586 non-germline DNMs arising either somatically or in the cell lines from which the DNA was derived. Most strikingly, in one family, we observed that 92% of germline DNMs were from the paternal germline, whereas, in contrast, in the other family, 64% of DNMs were from the maternal germline. These observations suggest considerable variation in mutation rates within and between families.
Subjects
Family; Genetic Variation; Genome, Human; Germ-Line Mutation/genetics; Chromosome Mapping; DNA Mutational Analysis; Female; Humans; Male; Polymerase Chain Reaction
ABSTRACT
Insights into the origins of structural variation and the mutational mechanisms underlying genomic disorders would be greatly improved by a genomewide map of hotspots of nonallelic homologous recombination (NAHR). Moreover, our understanding of sequence variation within the duplicated sequences that are substrates for NAHR lags far behind that of sequence variation within the single-copy portion of the genome. Perhaps the best-characterized NAHR hotspot lies within the 24-kb-long Charcot-Marie-Tooth disease type 1A (CMT1A)-repeats (REPs) that sponsor deletions and duplications that cause peripheral neuropathies. We investigated structural and sequence diversity within the CMT1A-REPs, both within and between species. We discovered a high frequency of retroelement insertions, accelerated sequence evolution after duplication, extensive paralogous gene conversion, and a greater than twofold enrichment of SNPs in humans relative to the genome average. We identified an allelic recombination hotspot underlying the known NAHR hotspot, which suggests that the two processes are intimately related. Finally, we used our data to develop a novel method for inferring the location of an NAHR hotspot from sequence variation within segmental duplications and applied it to identify a putative NAHR hotspot within the LCR22 repeats that sponsor velocardiofacial syndrome deletions. We propose that a large-scale project to map sequence variation within segmental duplications would reveal a wealth of novel chromosomal-rearrangement hotspots.