ABSTRACT
Human limbs emerge during the fourth post-conception week as mesenchymal buds, which develop into fully formed limbs over the subsequent months1. This process is orchestrated by numerous temporally and spatially restricted gene expression programmes, making congenital alterations in phenotype common2. Decades of work with model organisms have defined the fundamental mechanisms underlying vertebrate limb development, but an in-depth characterization of this process in humans has yet to be performed. Here we detail human embryonic limb development across space and time using single-cell and spatial transcriptomics. We demonstrate extensive diversification of cells from a few multipotent progenitors to myriad differentiated cell states, including several novel cell populations. We uncover two waves of human muscle development, each characterized by different cell states regulated by separate gene expression programmes, and identify musculin (MSC) as a key transcriptional repressor maintaining muscle stem cell identity. Through assembly of multiple anatomically continuous spatial transcriptomic samples using VisiumStitcher, we map cells across a sagittal section of a whole fetal hindlimb. We reveal a clear anatomical segregation between genes linked to brachydactyly and polysyndactyly, and uncover transcriptionally and spatially distinct populations of the mesenchyme in the autopod. Finally, we perform single-cell RNA sequencing on mouse embryonic limbs to facilitate cross-species developmental comparison, finding substantial homology between the two species.
ABSTRACT
BACKGROUND: Pediatric disorders include a range of highly penetrant, genetically heterogeneous conditions amenable to genomewide diagnostic approaches. Finding a molecular diagnosis is challenging but can have profound lifelong benefits. METHODS: We conducted a large-scale sequencing study involving more than 13,500 families with probands with severe, probably monogenic, difficult-to-diagnose developmental disorders from 24 regional genetics services in the United Kingdom and Ireland. Standardized phenotypic data were collected, and exome sequencing and microarray analyses were performed to investigate novel genetic causes. We developed an iterative variant analysis pipeline and reported candidate variants to clinical teams for validation and diagnostic interpretation to inform communication with families. Multiple regression analyses were performed to evaluate factors affecting the probability of diagnosis. RESULTS: A total of 13,449 probands were included in the analyses. On average, we reported 1.0 candidate variant per parent-offspring trio and 2.5 variants per singleton proband. Using clinical and computational approaches to variant classification, we made a diagnosis in approximately 41% of probands (5502 of 13,449). Of 3599 probands in trios who received a diagnosis by clinical assertion, approximately 76% had a pathogenic de novo variant. Another 22% of probands (2997 of 13,449) had variants of uncertain significance in genes that were strongly linked to monogenic developmental disorders. Recruitment in a parent-offspring trio had the largest effect on the probability of diagnosis (odds ratio, 4.70; 95% confidence interval [CI], 4.16 to 5.31). 
Probands were less likely to receive a diagnosis if they were born extremely prematurely (i.e., 22 to 27 weeks' gestation; odds ratio, 0.39; 95% CI, 0.22 to 0.68), had in utero exposure to antiepileptic medications (odds ratio, 0.44; 95% CI, 0.29 to 0.67), had mothers with diabetes (odds ratio, 0.52; 95% CI, 0.41 to 0.67), or were of African ancestry (odds ratio, 0.51; 95% CI, 0.31 to 0.78). CONCLUSIONS: Among probands with severe, probably monogenic, difficult-to-diagnose developmental disorders, multimodal analysis of genomewide data had good diagnostic power, even after previous attempts at diagnosis. (Funded by the Health Innovation Challenge Fund and Wellcome Sanger Institute.).
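The headline proportions in the RESULTS section can be rechecked from the counts quoted in the abstract. The following is a minimal Python sketch of that arithmetic only; the helper name `percentage` is illustrative and does not come from the study.

```python
def percentage(numerator: int, denominator: int) -> float:
    """Return numerator/denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# Counts as quoted in the abstract.
TOTAL_PROBANDS = 13449
diagnosed = percentage(5502, TOTAL_PROBANDS)       # probands with a diagnosis
uncertain = percentage(2997, TOTAL_PROBANDS)       # probands with variants of uncertain significance

print(f"diagnosed: {diagnosed:.1f}%")
print(f"variants of uncertain significance: {uncertain:.1f}%")
```

Both values agree with the rounded figures reported in the abstract (approximately 41% and 22%).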
Subject(s)
Genomics , Rare Diseases , Child , Humans , Exome , Ireland/epidemiology , United Kingdom/epidemiology , Rare Diseases/diagnosis , Rare Diseases/epidemiology , Rare Diseases/genetics , Oligonucleotide Array Sequence Analysis , Genetic Association Studies , Neurodevelopmental Disorders/diagnosis , Neurodevelopmental Disorders/genetics , Congenital Abnormalities/diagnosis , Congenital Abnormalities/genetics , Growth Disorders/diagnosis , Growth Disorders/genetics , Facies , Child Behavior Disorders/diagnosis , Child Behavior Disorders/genetics , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/genetics
ABSTRACT
Specification of the eye field (EF) within the neural plate marks the earliest detectable stage of eye development. Experimental evidence, primarily from non-mammalian model systems, indicates that the stable formation of this group of cells requires the activation of a set of key transcription factors. This crucial event is challenging to probe in mammals and, quantitatively, little is known regarding the regulation of the transition of cells to this ocular fate. Using optic vesicle organoids to model the onset of the EF, we generate time-course transcriptomic data allowing us to identify dynamic gene expression programmes that characterize this cellular-state transition. Integrating this with chromatin accessibility data suggests a direct role of canonical EF transcription factors in regulating these gene expression changes, and highlights candidate cis-regulatory elements through which these transcription factors act. Finally, we begin to test a subset of these candidate enhancer elements, within the organoid system, by perturbing the underlying DNA sequence and measuring transcriptomic changes during EF activation.
Subject(s)
Eye , Transcription Factors , Animals , Eye/metabolism , Transcription Factors/metabolism , Regulatory Sequences, Nucleic Acid , Base Sequence , Organoids/metabolism , Gene Expression Regulation, Developmental , Mammals/genetics
ABSTRACT
De novo mutations in protein-coding genes are a well-established cause of developmental disorders1. However, genes known to be associated with developmental disorders account for only a minority of the observed excess of such de novo mutations1,2. Here, to identify previously undescribed genes associated with developmental disorders, we integrate healthcare and research exome-sequence data from 31,058 parent-offspring trios of individuals with developmental disorders, and develop a simulation-based statistical test to identify gene-specific enrichment of de novo mutations. We identified 285 genes that were significantly associated with developmental disorders, including 28 that had not previously been robustly associated with developmental disorders. Although we detected more genes associated with developmental disorders, much of the excess of de novo mutations in protein-coding genes remains unaccounted for. Modelling suggests that more than 1,000 genes associated with developmental disorders have not yet been described, many of which are likely to be less penetrant than the currently known genes. Research access to clinical diagnostic datasets will be critical for completing the map of genes associated with developmental disorders.
Subject(s)
DNA Mutational Analysis , Data Analysis , Databases, Genetic , Datasets as Topic , Delivery of Health Care/statistics & numerical data , Developmental Disabilities/genetics , Genetic Diseases, Inborn/genetics , Cohort Studies , DNA Copy Number Variations/genetics , Developmental Disabilities/diagnosis , Europe , Female , Genetic Diseases, Inborn/diagnosis , Germ-Line Mutation/genetics , Haploinsufficiency/genetics , Humans , Male , Mutation, Missense/genetics , Penetrance , Perinatal Death , Sample Size
ABSTRACT
Nonsense and missense mutations in the transcription factor PAX6 cause a wide range of eye development defects, including aniridia, microphthalmia and coloboma. To understand how changes of PAX6:DNA binding cause these phenotypes, we combined saturation mutagenesis of the paired domain of PAX6 with a yeast one-hybrid (Y1H) assay in which expression of a PAX6-GAL4 fusion gene drives antibiotic resistance. We quantified binding of more than 2700 single amino-acid variants to two DNA sequence elements. Mutations in DNA-facing residues of the N-terminal subdomain and linker region were most detrimental, as were mutations to prolines and to negatively charged residues. Many variants caused sequence-specific molecular gain-of-function effects, including variants in position 71 that increased binding to the LE9 enhancer but decreased binding to a SELEX-derived binding site. In the absence of antibiotic selection, variants that retained DNA binding slowed yeast growth, likely because such variants perturbed the yeast transcriptome. Benchmarking against known patient variants and applying ACMG/AMP guidelines to variant classification, we obtained supporting-to-moderate evidence that 977 variants are likely pathogenic and 1306 are likely benign. Our analysis shows that most pathogenic mutations in the paired domain of PAX6 can be explained simply by the effects of these mutations on PAX6:DNA association, and establishes Y1H as a generalisable assay for the interpretation of variant effects in transcription factors.
Subject(s)
DNA , PAX6 Transcription Factor , PAX6 Transcription Factor/genetics , PAX6 Transcription Factor/metabolism , Humans , DNA/genetics , DNA/metabolism , Binding Sites , Protein Binding , Mutation , Two-Hybrid System Techniques , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Mutation, Missense , Transcription Factors/genetics , Transcription Factors/metabolism , DNA Mutational Analysis
ABSTRACT
BACKGROUND: Classic aniridia is a highly penetrant autosomal dominant disorder characterised by congenital absence of the iris, foveal hypoplasia, optic disc anomalies and progressive opacification of the cornea. More than 90% of cases of classic aniridia are caused by heterozygous, loss-of-function variants affecting the PAX6 locus. METHODS: Short-read whole genome sequencing was performed on 51 (39 affected) individuals from 37 different families who had screened negative for mutations in the PAX6 coding region. RESULTS: Likely causative mutations were identified in 22 out of 37 (59%) families. In 19 out of 22 families, the causative genomic changes have an interpretable deleterious impact on the PAX6 locus. Of these 19 families, 1 has a novel heterozygous PAX6 frameshift variant missed on previous screens, 4 have single nucleotide variants (SNVs) (one novel) affecting essential splice sites of PAX6 5' non-coding exons and 2 have deep intronic SNVs (one novel) resulting in gain of a donor splice site. In 12 out of 19, the causative variants are large-scale structural variants; 5 have partial or whole gene deletions of PAX6, 3 have deletions encompassing critical PAX6 cis-regulatory elements, 2 have balanced inversions with disruptive breakpoints within the PAX6 locus and 2 have complex rearrangements disrupting PAX6. The remaining 3 of 22 families have deletions encompassing FOXC1 (a known cause of atypical aniridia). Seven of the causative variants occurred de novo and one cosegregated with familial aniridia. We were unable to establish inheritance status in the remaining probands. No plausibly causative SNVs were identified in PAX6 cis-regulatory elements. CONCLUSION: Whole genome sequencing proves to be an effective diagnostic test in most individuals with previously unexplained aniridia.
Subject(s)
Aniridia , Eye Abnormalities , Humans , PAX6 Transcription Factor/genetics , Aniridia/genetics , Mutation/genetics , Eye Abnormalities/genetics , Exons , Homeodomain Proteins/genetics , Eye Proteins/genetics , Pedigree
ABSTRACT
Structural variation (SV) describes a broad class of genetic variation greater than 50 bp in size. SVs can cause a wide range of genetic diseases and are prevalent in rare developmental disorders (DDs). Individuals presenting with DDs are often referred for diagnostic testing with chromosomal microarrays (CMAs) to identify large copy-number variants (CNVs) and/or with single-gene, gene-panel, or exome sequencing (ES) to identify single-nucleotide variants, small insertions/deletions, and CNVs. However, individuals with pathogenic SVs undetectable by conventional analysis often remain undiagnosed. Consequently, we have developed the tool InDelible, which interrogates short-read sequencing data for split-read clusters characteristic of SV breakpoints. We applied InDelible to 13,438 probands with severe DDs recruited as part of the Deciphering Developmental Disorders (DDD) study and discovered 63 rare, damaging variants in genes previously associated with DDs missed by standard SNV, indel, or CNV discovery approaches. Clinical review of these 63 variants determined that about half (30/63) were plausibly pathogenic. InDelible was particularly effective at ascertaining variants between 21 and 500 bp in size and increased the total number of potentially pathogenic variants identified by DDD in this size range by 42.9%. Of particular interest were seven confirmed de novo variants in MECP2, which represent 35.0% of all de novo protein-truncating variants in MECP2 among DDD study participants. InDelible provides a framework for the discovery of pathogenic SVs that are most likely missed by standard analytical workflows and has the potential to improve the diagnostic yield of ES across a broad range of genetic diseases.
Subject(s)
Developmental Disabilities/diagnosis , Developmental Disabilities/genetics , Exome Sequencing/methods , Child , Female , Humans , Male , Methyl-CpG-Binding Protein 2/genetics
ABSTRACT
Mutation in the germline is the ultimate source of genetic variation, but little is known about the influence of germline chromatin structure on mutational processes. Using ATAC-seq, we profile the open chromatin landscape of human spermatogonia, the most proliferative cell type of the germline, identifying transcription factor binding sites (TFBSs) and PRDM9 binding sites, a subset of which will initiate meiotic recombination. We observe an increase in rare structural variant (SV) breakpoints at PRDM9-bound sites, implicating meiotic recombination in the generation of structural variation. Many germline TFBSs, such as NRF1, are also associated with increased rates of SV breakpoints, apparently independent of recombination. Singleton short insertions (≥5 bp) are highly enriched at TFBSs, particularly at sites bound by testis active TFs, and their rates correlate with those of structural variant breakpoints. Short insertions often duplicate the TFBS motif, leading to clustering of motif sites near regulatory regions in this male-driven evolutionary process. Increased mutation loads at germline TFBSs disproportionately affect neural enhancers with activity in spermatogonia, potentially altering neurodevelopmental regulatory architecture. Local chromatin structure in spermatogonia is thus pervasive in shaping both evolution and disease.
Subject(s)
Genome, Human , Spermatogonia , Binding Sites , Chromatin Immunoprecipitation Sequencing , Histone-Lysine N-Methyltransferase/genetics , Humans , Male , Mutation , Spermatogonia/metabolism
ABSTRACT
The majority of rare diseases affect children, most of whom have an underlying genetic cause for their condition. However, making a molecular diagnosis with current technologies and knowledge is often still a challenge. Paediatric genomics is an immature but rapidly evolving field that tackles this issue by incorporating next-generation sequencing technologies, especially whole-exome sequencing and whole-genome sequencing, into research and clinical workflows. This complex multidisciplinary approach, coupled with the increasing availability of population genetic variation data, has already resulted in an increased discovery rate of causative genes and in improved diagnosis of rare paediatric disease. Importantly, for affected families, a better understanding of the genetic basis of rare disease translates to more accurate prognosis, management, surveillance and genetic advice; stimulates research into new therapies; and enables provision of better support.
Subject(s)
Genetic Predisposition to Disease , Genetic Variation , Genome, Human , Genome-Wide Association Study/methods , Genomics/methods , Rare Diseases , Adolescent , Child , Child, Preschool , Female , Humans , Infant , Infant, Newborn , Male , Rare Diseases/diagnosis , Rare Diseases/genetics
ABSTRACT
This corrects the article DOI: 10.1038/nrg.2017.116.
ABSTRACT
We previously estimated that 42% of patients with severe developmental disorders carry pathogenic de novo mutations in coding sequences. The role of de novo mutations in regulatory elements affecting genes associated with developmental disorders, or other genes, has been essentially unexplored. We identified de novo mutations in three classes of putative regulatory elements in almost 8,000 patients with developmental disorders. Here we show that de novo mutations in highly evolutionarily conserved fetal brain-active elements are significantly and specifically enriched in neurodevelopmental disorders. We identified a significant twofold enrichment of recurrently mutated elements. We estimate that, genome-wide, 1-3% of patients without a diagnostic coding variant carry pathogenic de novo mutations in fetal brain-active regulatory elements and that only 0.15% of all possible mutations within highly conserved fetal brain-active elements cause neurodevelopmental disorders with a dominant mechanism. Our findings represent a robust estimate of the contribution of de novo mutations in regulatory elements to this genetically heterogeneous set of disorders, and emphasize the importance of combining functional and evolutionary evidence to identify regulatory causes of genetic disorders.
Subject(s)
Mutation , Neurodevelopmental Disorders/genetics , Regulatory Sequences, Nucleic Acid/genetics , Brain/metabolism , Conserved Sequence , Developmental Disabilities/genetics , Evolution, Molecular , Exome , Female , Fetus/metabolism , Humans , Male
ABSTRACT
There are thousands of rare human disorders that are caused by single deleterious, protein-coding genetic variants1. However, patients with the same genetic defect can have different clinical presentations2-4, and some individuals who carry known disease-causing variants can appear unaffected5. Here, to understand what explains these differences, we study a cohort of 6,987 children assessed by clinical geneticists to have severe neurodevelopmental disorders such as global developmental delay and autism, often in combination with abnormalities of other organ systems. Although the genetic causes of these neurodevelopmental disorders are expected to be almost entirely monogenic, we show that 7.7% of variance in risk is attributable to inherited common genetic variation. We replicated this genome-wide common variant burden by showing, in an independent sample of 728 trios (comprising a child plus both parents) from the same cohort, that this burden is over-transmitted from parents to children with neurodevelopmental disorders. Our common-variant signal is significantly positively correlated with genetic predisposition to lower educational attainment, decreased intelligence and risk of schizophrenia. We found that common-variant risk was not significantly different between individuals with and without a known protein-coding diagnostic variant, which suggests that common-variant risk affects patients both with and without a monogenic diagnosis. In addition, previously published common-variant scores for autism, height, birth weight and intracranial volume were all correlated with these traits within our cohort, which suggests that phenotypic expression in individuals with monogenic disorders is affected by the same variants as in the general population. Our results demonstrate that common genetic variation affects both overall risk and clinical presentation in neurodevelopmental disorders that are typically considered to be monogenic.
Subject(s)
Genetic Predisposition to Disease , Genetic Variation , Neurodevelopmental Disorders/genetics , Rare Diseases/genetics , Autistic Disorder/genetics , Birth Weight/genetics , Body Height/genetics , Case-Control Studies , Cohort Studies , Developmental Disabilities/genetics , Female , Genome-Wide Association Study , Humans , Intelligence/genetics , Linkage Disequilibrium , Male , Multifactorial Inheritance/genetics , Phenotype , Schizophrenia/genetics
ABSTRACT
BACKGROUND: Genomic variant prioritisation is one of the most significant bottlenecks to mainstream genomic testing in healthcare. Tools to improve precision while ensuring high recall are critical to successful mainstream clinical genomic testing, in particular for whole genome sequencing where millions of variants must be considered for each patient. METHODS: We developed EyeG2P, a publicly available database and web application using the Ensembl Variant Effect Predictor. EyeG2P is tailored for efficient variant prioritisation for individuals with inherited ophthalmic conditions. We assessed the sensitivity of EyeG2P in 1234 individuals with a broad range of eye conditions who had previously received a confirmed molecular diagnosis through routine genomic diagnostic approaches. For a prospective cohort of 83 individuals, we assessed the precision of EyeG2P in comparison with routine diagnostic approaches. For 10 additional individuals, we assessed the utility of EyeG2P for whole genome analysis. RESULTS: EyeG2P had 99.5% sensitivity for genomic variants previously identified as clinically relevant through routine diagnostic analysis (n=1234 individuals). Prospectively, EyeG2P enabled a significant increase in precision (35% on average) in comparison with routine testing strategies (p<0.001). We demonstrate that incorporation of EyeG2P into whole genome sequencing analysis strategies can reduce the number of variants for analysis to six variants, on average, while maintaining high diagnostic yield. CONCLUSION: Automated filtering of genomic variants through EyeG2P can increase the efficiency of diagnostic testing for individuals with a broad range of inherited ophthalmic disorders.
Subject(s)
Databases, Genetic , Eye Diseases , Genetic Testing , Genome, Human , Genomics , Eye Diseases/genetics , Humans , Genetic Variation
ABSTRACT
Our ability to make accurate and specific genetic diagnoses in individuals with severe developmental disorders has been transformed by data derived from genomic sequencing technologies. These data reveal both the patterns and rates of different mutational mechanisms and identify regions of the human genome with fewer mutations than would be expected. In outbred populations, the most common identifiable cause of severe developmental disorders is de novo mutation affecting the coding region in one of approximately 500 different genes, almost universally showing constraint. Simply combining the location of a de novo genomic event with its predicted consequence on the gene product gives significant diagnostic power. Our knowledge of the diversity of phenotypic consequences associated with comparable diagnostic genotypes at each locus is improving. Computationally useful phenotype data will improve diagnostic interpretation of ultrarare genetic variants and, in the long run, indicate which specific embryonic processes have been perturbed.
Subject(s)
Developmental Disabilities/diagnosis , Genetic Markers , Genome, Human , Genomics/methods , Mutation , Child , Developmental Disabilities/genetics , Humans
ABSTRACT
Congenital anomalies of the kidney and urinary tract (CAKUT) constitute one of the most frequent birth defects and represent the most common cause of chronic kidney disease in the first three decades of life. Despite the discovery of dozens of monogenic causes of CAKUT, most pathogenic pathways remain elusive. We performed whole-exome sequencing (WES) in 551 individuals with CAKUT and identified a heterozygous de novo stop-gain variant in ZMYM2 in two different families with CAKUT. Through collaboration, we identified in total 14 different heterozygous loss-of-function mutations in ZMYM2 in 15 unrelated families. Most mutations occurred de novo, indicating possible interference with reproductive function. Human disease features are replicated in X. tropicalis larvae with morpholino knockdowns, in which expression of truncated ZMYM2 proteins, based on individual mutations, failed to rescue renal and craniofacial defects. Moreover, heterozygous Zmym2-deficient mice recapitulated features of CAKUT with high penetrance. The ZMYM2 protein is a component of a transcriptional corepressor complex recently linked to the silencing of developmentally regulated endogenous retrovirus elements. Using protein-protein interaction assays, we show that ZMYM2 interacts with additional epigenetic silencing complexes, and confirm that it binds to FOXP1, a transcription factor that has also been linked to CAKUT. In summary, our findings establish loss-of-function mutations of ZMYM2, and potentially those of other proteins in its interactome, as causes of human CAKUT, offering new routes for studying the pathogenesis of the disorder.
Subject(s)
DNA-Binding Proteins/genetics , Epigenesis, Genetic , Forkhead Transcription Factors/genetics , Mutation , Repressor Proteins/genetics , Transcription Factors/genetics , Urinary Tract/metabolism , Urogenital Abnormalities/genetics , Amphibian Proteins/antagonists & inhibitors , Amphibian Proteins/genetics , Amphibian Proteins/metabolism , Animals , Case-Control Studies , Child , Child, Preschool , DNA-Binding Proteins/metabolism , Family , Female , Forkhead Transcription Factors/metabolism , Heterozygote , Humans , Infant , Larva/genetics , Larva/growth & development , Larva/metabolism , Male , Mice , Mice, Knockout , Morpholinos/genetics , Morpholinos/metabolism , Pedigree , Protein Binding , Repressor Proteins/metabolism , Transcription Factors/metabolism , Urinary Tract/abnormalities , Urogenital Abnormalities/metabolism , Urogenital Abnormalities/pathology , Exome Sequencing , Xenopus
ABSTRACT
Trio-based whole-exome sequence (WES) data have established confident genetic diagnoses in ~40% of previously undiagnosed individuals recruited to the Deciphering Developmental Disorders (DDD) study. Here we aim to use the breadth of phenotypic information recorded in DDD to augment diagnosis and disease variant discovery in probands. Median Euclidean distances (mEuD) were employed as a simple measure of similarity of quantitative phenotypic data within sets of ≥10 individuals with plausibly causative de novo mutations (DNM) in 28 different developmental disorder genes. 13/28 (46.4%) showed significant similarity for growth or developmental milestone metrics, 10/28 (35.7%) showed similarity in HPO term usage, and 12/28 (43%) showed no phenotypic similarity. Pairwise comparisons of individuals with high-impact inherited variants to the 32 individuals with causative DNM in ANKRD11 using only growth z-scores highlighted 5 likely causative inherited variants and two unrecognized DNM resulting in an 18% diagnostic uplift for this gene. Using an independent approach, naive Bayes classification of growth and developmental data produced reasonably discriminative models for the 24 DNM genes with sufficiently complete data. An unsupervised naive Bayes classification of 6,993 probands with WES data and sufficient phenotypic information defined 23 in silico syndromes (ISSs) and was used to test a "phenotype first" approach to the discovery of causative genotypes using WES variants strictly filtered on allele frequency, mutation consequence, and evidence of constraint in humans. This highlighted heterozygous de novo nonsynonymous variants in SPTBN2 as causative in three DDD probands.
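The median Euclidean distance (mEuD) measure described in this abstract can be illustrated with a short Python sketch. The abstract does not say how significance was assessed, so the permutation test below (comparing a gene group against random same-size subsets of a background cohort) is an assumption made for illustration, as are the function names and the toy data.

```python
import itertools
import math
import random
import statistics


def median_euclidean_distance(profiles):
    """Median of all pairwise Euclidean distances between phenotype vectors."""
    distances = [math.dist(a, b) for a, b in itertools.combinations(profiles, 2)]
    return statistics.median(distances)


def similarity_p_value(group, cohort, n_permutations=200, seed=0):
    """Assumed test: share of random same-size cohort subsets at least as tight as `group`."""
    rng = random.Random(seed)
    observed = median_euclidean_distance(group)
    hits = sum(
        median_euclidean_distance(rng.sample(cohort, len(group))) <= observed
        for _ in range(n_permutations)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data (hypothetical z-scores): a tight group of 10 and a widely spread background.
group = [(0.0, 0.1 * k) for k in range(10)]
cohort = [(float(i), float(j)) for i in range(-5, 6) for j in range(-5, 6)]

group_meud = median_euclidean_distance(group)
p_value = similarity_p_value(group, cohort)
print(group_meud, p_value)
```

A small mEuD relative to random subsets indicates that individuals sharing a gene are phenotypically closer to one another than to the cohort at large.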
Subject(s)
Developmental Disabilities/genetics , Bayes Theorem , Child , Dwarfism/genetics , Exome/genetics , Female , Gene Frequency/genetics , Genetic Predisposition to Disease/genetics , Heterozygote , Humans , Male , Mutation/genetics , Phenotype , Repressor Proteins/genetics , Spectrin/genetics , Exome Sequencing
ABSTRACT
Approximately 2% of de novo single-nucleotide variants (SNVs) appear as part of clustered mutations that create multinucleotide variants (MNVs). MNVs are an important source of genomic variability as they are more likely to alter an encoded protein than a SNV, which has important implications in disease as well as evolution. Previous studies of MNVs have focused on their mutational origins and have not systematically evaluated their functional impact and contribution to disease. We identified 69,940 MNVs and 91 de novo MNVs in 6688 exome-sequenced parent-offspring trios from the Deciphering Developmental Disorders Study comprising families with severe developmental disorders. We replicated the previously described MNV mutational signatures associated with DNA polymerase zeta, an error-prone translesion polymerase, and the APOBEC family of DNA deaminases. We estimate the simultaneous MNV germline mutation rate to be 1.78 × 10⁻¹⁰ mutations per base pair per generation. We found that most MNVs within a single codon create a missense change that could not have been created by a SNV. MNV-induced missense changes were, on average, more physicochemically divergent, were more depleted in highly constrained genes (pLI ≥ 0.9), and were under stronger purifying selection compared with SNV-induced missense changes. We found that de novo MNVs were significantly enriched in genes previously associated with developmental disorders in affected children. This shows that MNVs can be more damaging than SNVs even when both induce missense changes, and are an important variant type to consider in relation to human disease.
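The idea that an MNV within one codon can produce an amino acid that no single-base change could reach can be checked against the standard genetic code. The Python sketch below is illustrative only: the function names are hypothetical, and it enumerates every possible codon pair uniformly rather than the MNVs actually observed in the study, so its fraction is not comparable to the study's result.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def snv_reachable(codon):
    """Amino acids (or '*') encoded by any codon exactly one base away from `codon`."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                reachable.add(CODON_TABLE[codon[:pos] + base + codon[pos + 1:]])
    return reachable


def is_mnv_only_missense(ref, alt):
    """True if ref -> alt changes >= 2 bases and gives an amino acid no SNV of ref can."""
    n_changed = sum(r != a for r, a in zip(ref, alt))
    ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
    is_missense = ref_aa != alt_aa and "*" not in (ref_aa, alt_aa)
    return n_changed >= 2 and is_missense and alt_aa not in snv_reachable(ref)


def fraction_mnv_only():
    """Share of all possible within-codon multi-base missense changes that are MNV-only."""
    missense = mnv_only = 0
    for ref, alt in product(CODON_TABLE, repeat=2):
        if sum(r != a for r, a in zip(ref, alt)) < 2:
            continue
        ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
        if ref_aa == alt_aa or "*" in (ref_aa, alt_aa):
            continue
        missense += 1
        mnv_only += alt_aa not in snv_reachable(ref)
    return mnv_only / missense


print(is_mnv_only_missense("GGG", "AAG"))  # Gly -> Lys: lysine is not one base away from GGG
print(is_mnv_only_missense("GGG", "GAA"))  # Gly -> Glu: also reachable by the SNV GGG -> GAG
print(fraction_mnv_only())
```

For example, GGG (glycine) to AAG (lysine) requires two base changes and lysine cannot be reached from GGG by any single substitution, whereas GGG to GAA yields glutamate, which the SNV GGG to GAG already produces.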
Subject(s)
Developmental Disabilities/genetics , Exome , Mutation , Child , DNA Mutational Analysis , Humans , Mutation Rate , Mutation, Missense , Nucleotides , Polymorphism, Single Nucleotide
ABSTRACT
Mutations that perturb normal pre-mRNA splicing are significant contributors to human disease. We used exome sequencing data from 7833 probands with developmental disorders (DDs) and their unaffected parents, as well as more than 60,000 aggregated exomes from the Exome Aggregation Consortium, to investigate selection around the splice sites and quantify the contribution of splicing mutations to DDs. Patterns of purifying selection, a deficit of variants in highly constrained genes in healthy subjects, and excess de novo mutations in patients highlighted particular positions within and around the consensus splice site of greater functional relevance. By using mutational burden analyses in this large cohort of proband-parent trios, we could estimate in an unbiased manner the relative contributions of mutations at canonical dinucleotides (73%) and flanking noncanonical positions (27%), and calculate the positive predictive value of pathogenicity for different classes of mutations. We identified 18 patients with likely diagnostic de novo mutations in dominant DD-associated genes at noncanonical positions in splice sites. We estimate 35%-40% of pathogenic variants in noncanonical splice site positions are missing from public databases.
Subject(s)
Developmental Disabilities/genetics , Mutation , RNA Splice Sites , Exome , Humans , Exome Sequencing
ABSTRACT
PURPOSE: Several groups and resources provide information that pertains to the validity of gene-disease relationships used in genomic medicine and research; however, universal standards and terminologies to define the evidence base for the role of a gene in disease and a single harmonized resource were lacking. To tackle this issue, the Gene Curation Coalition (GenCC) was formed. METHODS: The GenCC drafted harmonized definitions for differing levels of gene-disease validity on the basis of existing resources, and performed a modified Delphi survey with 3 rounds to narrow the list of terms. The GenCC also developed a unified database to display curated gene-disease validity assertions from its members. RESULTS: On the basis of 241 survey responses from the genetics community, a consensus term set was chosen for grading gene-disease validity and database submissions. As of December 2021, the database contained 15,241 gene-disease assertions on 4569 unique genes from 12 submitters. When comparing submissions to the database from distinct sources, conflicts in assertions of gene-disease validity ranged from 5.3% to 13.4%. CONCLUSION: Terminology standardization, sharing of gene-disease validity classifications, and resolution of curation conflicts will facilitate collaborations across international curation efforts and in turn, improve consistency in genetic testing and variant interpretation.
Subject(s)
Databases, Genetic , Genomics , Genetic Testing , Genetic Variation , Humans
ABSTRACT
Typical Martsolf syndrome is characterized by congenital cataracts, postnatal microcephaly, developmental delay, hypotonia, short stature and biallelic hypomorphic mutations in either RAB3GAP1 or RAB3GAP2. Genetic analysis of 85 unrelated "mutation negative" probands with Martsolf or Martsolf-like syndromes identified two individuals with different homozygous null mutations in ITPA, the gene encoding inosine triphosphate pyrophosphatase (ITPase). Both probands were from multiplex families with a consistent, lethal and highly distinctive disorder: a Martsolf-like syndrome with infantile-onset dilated cardiomyopathy. Severe ITPase deficiency has been previously reported with infantile epileptic encephalopathy (MIM 616647). ITPase acts to prevent incorporation of inosine bases (rI/dI) into RNA and DNA. In Itpa-null cells dI was undetectable in genomic DNA. dI could be identified at a low level in mtDNA without detectable mitochondrial genome instability, mtDNA depletion or biochemical dysfunction of the mitochondria. rI accumulation was detectable in proband-derived lymphoblastoid RNA. In Itpa-null mouse embryos rI was detectable in the brain and kidney with the highest level seen in the embryonic heart (rI at 1 in 385 bases). Transcriptome and proteome analysis in mutant cells revealed no major differences with controls. The rate of transcription and the total amount of cellular RNA also appeared normal. rI accumulation in RNA, and by implication rI production, correlates with the severity of organ dysfunction in ITPase deficiency, but the basis of the cellulopathy remains cryptic. While we cannot exclude cumulative minor effects, there are no major anomalies in the production, processing, stability and/or translation of mRNA.