ABSTRACT
Variant interpretation remains a major challenge in medical genetics. We developed Meta-Domain HotSpot (MDHS) to identify mutational hotspots across homologous protein domains. We applied MDHS to a dataset of 45,221 de novo mutations (DNMs) from 31,058 individuals with neurodevelopmental disorders (NDDs) and identified three significantly enriched missense DNM hotspots in the ion transport protein domain family (PF00520). The 37 unique missense DNMs that drive enrichment affect 25 genes, 19 of which were previously associated with NDDs. 3D protein structure modeling supports the hypothesis of function-altering effects of these mutations. Hotspot genes have a unique expression pattern in tissue, and we used this pattern alongside in silico predictors and population constraint information to identify candidate NDD-associated genes. We also propose a lenient version of our method, which identifies 32 hotspot positions across 16 different protein domains. These positions are enriched for likely pathogenic variation in clinical databases and DNMs in other genetic disorders.
Subjects
Neurodevelopmental Disorders, Humans, Protein Domains/genetics, Mutation/genetics, Neurodevelopmental Disorders/genetics
ABSTRACT
De novo mutations in protein-coding genes are a well-established cause of developmental disorders1. However, genes known to be associated with developmental disorders account for only a minority of the observed excess of such de novo mutations1,2. Here, to identify previously undescribed genes associated with developmental disorders, we integrate healthcare and research exome-sequence data from 31,058 parent-offspring trios of individuals with developmental disorders, and develop a simulation-based statistical test to identify gene-specific enrichment of de novo mutations. We identified 285 genes that were significantly associated with developmental disorders, including 28 that had not previously been robustly associated with developmental disorders. Although we detected more genes associated with developmental disorders, much of the excess of de novo mutations in protein-coding genes remains unaccounted for. Modelling suggests that more than 1,000 genes associated with developmental disorders have not yet been described, many of which are likely to be less penetrant than the currently known genes. Research access to clinical diagnostic datasets will be critical for completing the map of genes associated with developmental disorders.
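The idea of a simulation-based test for gene-specific enrichment of de novo mutations can be illustrated with a minimal sketch. This is not the study's actual test: the Poisson null model, the function names, and the per-gene mutation rate passed in are all assumptions made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def enrichment_p_value(observed, mutation_rate, n_trios, n_sim=10000, seed=0):
    """Empirical p-value for seeing at least `observed` de novo mutations in a gene.

    Assumption: under the null, the count is Poisson with mean
    2 * n_trios * mutation_rate (two haploid genomes per child), where
    mutation_rate is a hypothetical per-gene, per-chromosome rate.
    """
    rng = random.Random(seed)
    expected = 2 * n_trios * mutation_rate
    hits = sum(sample_poisson(expected, rng) >= observed for _ in range(n_sim))
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)
```

Calling `enrichment_p_value(observed=8, mutation_rate=1e-5, n_trios=31058)` would compare eight observed mutations against roughly 0.62 expected under this assumed null. A real analysis would also weight mutations by their predicted consequence, which this sketch leaves out.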
Subjects
DNA Mutational Analysis, Data Analysis, Genetic Databases, Datasets as Topic, Delivery of Health Care/statistics & numerical data, Developmental Disabilities/genetics, Inborn Genetic Diseases/genetics, Cohort Studies, DNA Copy Number Variations/genetics, Developmental Disabilities/diagnosis, Europe, Female, Inborn Genetic Diseases/diagnosis, Germ-Line Mutation/genetics, Haploinsufficiency/genetics, Humans, Male, Missense Mutation/genetics, Penetrance, Perinatal Death, Sample Size
ABSTRACT
Whereas large-scale statistical analyses can robustly identify disease-gene relationships, they do not accurately capture genotype-phenotype correlations or disease mechanisms. We use multiple lines of independent evidence to show that different variant types in a single gene, SATB1, cause clinically overlapping but distinct neurodevelopmental disorders. Clinical evaluation of 42 individuals carrying SATB1 variants identified overt genotype-phenotype relationships, associated with different pathophysiological mechanisms, established by functional assays. Missense variants in the CUT1 and CUT2 DNA-binding domains result in stronger chromatin binding, increased transcriptional repression, and a severe phenotype. In contrast, variants predicted to result in haploinsufficiency are associated with a milder clinical presentation. A similarly mild phenotype is observed for individuals with premature protein truncating variants that escape nonsense-mediated decay, which are transcriptionally active but mislocalized in the cell. Our results suggest that in-depth mutation-specific genotype-phenotype studies are essential to capture full disease complexity and to explain phenotypic variability.
Subjects
Matrix Attachment Region Binding Proteins/genetics, Mutation, Neurodevelopmental Disorders/genetics, Chromatin/metabolism, Female, Genetic Association Studies, Haploinsufficiency, Humans, Male, Matrix Attachment Region Binding Proteins/chemistry, Matrix Attachment Region Binding Proteins/metabolism, Molecular Models, Missense Mutation, Protein Binding, Protein Domains, Genetic Transcription
ABSTRACT
Recurrent somatic variants in SPOP are cancer specific; endometrial and prostate cancers result from gain-of-function and dominant-negative effects toward BET proteins, respectively. By using clinical exome sequencing, we identified six de novo pathogenic missense variants in SPOP in seven individuals with developmental delay and/or intellectual disability, facial dysmorphisms, and congenital anomalies. Two individuals shared craniofacial dysmorphisms, including congenital microcephaly, that were strikingly different from those of the other five individuals, who had (relative) macrocephaly and hypertelorism. We measured the effect of SPOP variants on BET protein amounts in human Ishikawa endometrial cancer cells and patient-derived cell lines because we hypothesized that variants would lead to functionally divergent effects on BET proteins. The de novo variants c.362G>A (p.Arg121Gln) and c.430G>A (p.Asp144Asn), identified in the first two individuals, resulted in a gain of function, and conversely, the c.73A>G (p.Thr25Ala), c.248A>G (p.Tyr83Cys), c.395G>T (p.Gly132Val), and c.412C>T (p.Arg138Cys) variants resulted in a dominant-negative effect. Our findings suggest that these opposite functional effects caused by the variants in SPOP result in two distinct and clinically recognizable syndromic forms of intellectual disability with contrasting craniofacial dysmorphisms.
Subjects
Missense Mutation, Neurodevelopmental Disorders/genetics, Nuclear Proteins/genetics, Repressor Proteins/genetics, Adolescent, Child, Preschool Child, Facies, Female, Humans, Infant, Intellectual Disability/genetics, Male, Skull/abnormalities, Young Adult
ABSTRACT
POU3F3, also referred to as Brain-1, is a well-known transcription factor involved in the development of the central nervous system, but it has not previously been associated with a neurodevelopmental disorder. Here, we report the identification of 19 individuals with heterozygous POU3F3 disruptions, most of which are de novo variants. All individuals had developmental delays and/or intellectual disability and impairments in speech and language skills. Thirteen individuals had characteristic low-set, prominent, and/or cupped ears. Brain abnormalities were observed in seven of eleven MRI reports. POU3F3 is an intronless gene, insensitive to nonsense-mediated decay, and 13 individuals carried protein-truncating variants. All truncating variants that we tested in cellular models led to aberrant subcellular localization of the encoded protein. Luciferase assays demonstrated negative effects of these alleles on transcriptional activation of a reporter with a FOXP2-derived binding motif. In addition to the loss-of-function variants, five individuals had missense variants that clustered at specific positions within the functional domains, and one small in-frame deletion was identified. Two missense variants showed reduced transactivation capacity in our assays, whereas one variant displayed gain-of-function effects, suggesting a distinct pathophysiological mechanism. In bioluminescence resonance energy transfer (BRET) interaction assays, all the truncated POU3F3 versions that we tested had significantly impaired dimerization capacities, whereas all missense variants showed unaffected dimerization with wild-type POU3F3. Taken together, our identification and functional cell-based analyses of pathogenic variants in POU3F3, coupled with a clinical characterization, implicate disruptions of this gene in a characteristic neurodevelopmental disorder.
Subjects
Gene Expression Regulation, Mutation, Neurodevelopmental Disorders/etiology, POU Domain Factors/genetics, Transcriptional Activation, Amino Acid Sequence, Child, Female, Genetic Association Studies, Genotype, Humans, Male, Neurodevelopmental Disorders/pathology, POU Domain Factors/chemistry, Protein Conformation, Sequence Homology
ABSTRACT
By using exome sequencing and a gene matching approach, we identified de novo and inherited pathogenic variants in KDM3B in 14 unrelated individuals and three affected parents with varying degrees of intellectual disability (ID) or developmental delay (DD) and short stature. The individuals share additional phenotypic features that include feeding difficulties in infancy, joint hypermobility, and characteristic facial features such as a wide mouth, a pointed chin, long ears, and a low columella. Notably, two individuals developed cancer, acute myeloid leukemia and Hodgkin lymphoma, in childhood. KDM3B encodes for a histone demethylase and is involved in H3K9 demethylation, a crucial part of chromatin modification required for transcriptional regulation. We identified missense and truncating variants, suggesting that KDM3B haploinsufficiency is the underlying mechanism for this syndrome. By using a hybrid facial-recognition model, we show that individuals with a pathogenic variant in KDM3B have a facial gestalt, and that they show significant facial similarity compared to control individuals with ID. In conclusion, pathogenic variants in KDM3B cause a syndrome characterized by ID, short stature, and facial dysmorphism.
Subjects
Craniofacial Abnormalities/genetics, Developmental Disabilities/genetics, Dwarfism/genetics, Genetic Variation, Intellectual Disability/genetics, Jumonji Domain-Containing Histone Demethylases/genetics, Musculoskeletal Abnormalities/genetics, Body Height, Child, Exome, Face, Female, Genetic Association Studies, Germ-Line Mutation, Haploinsufficiency, Histones/chemistry, Humans, Male, Missense Mutation, Phenotype
ABSTRACT
Haploinsufficiency (HI) is the best characterized mechanism through which dominant mutations exert their effect and cause disease. Non-haploinsufficiency (NHI) mechanisms, such as gain-of-function and dominant-negative mechanisms, are often characterized by the spatial clustering of mutations, thereby affecting only particular regions or base pairs of a gene. Variants leading to haploinsufficiency might occasionally cluster as well, for example in critical domains, but such clustering is on the whole less pronounced, with mutations often spread throughout the gene. Here we exploit this property and develop a method to specifically identify genes with significant spatial clustering patterns of de novo mutations in large cohorts. We apply our method to a dataset of 4,061 de novo missense mutations from published exome studies of trios with intellectual disability and developmental disorders (ID/DD) and successfully identify 15 genes with clustering mutations, including 12 genes for which mutations are known to cause neurodevelopmental disorders. For 11 out of these 12, NHI mutation mechanisms have been reported. Additionally, we identify three candidate ID/DD-associated genes of which two have an established role in neuronal processes. We further observe a higher intolerance to normal genetic variation of the identified genes compared to known genes for which mutations lead to HI. Finally, 3D modeling of these mutations on their protein structures shows that 81% of the observed mutations are unlikely to affect the overall structural integrity and that they therefore most likely act through a mechanism other than HI.
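Testing whether the mutations in one gene cluster more tightly than chance would predict can be sketched as a permutation test. The abstract does not give the statistic or the null model, so the mean pairwise distance and the uniform random placement used here are assumptions, as are the function names.

```python
import itertools
import random


def mean_pairwise_distance(positions):
    """Average absolute distance between all pairs of mutated residue positions."""
    pairs = list(itertools.combinations(positions, 2))
    if not pairs:
        raise ValueError("need at least two mutations to measure clustering")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def clustering_p_value(positions, protein_length, n_perm=10000, seed=0):
    """Empirical p-value that mutations are at least as clustered as observed.

    Assumed null: each mutation lands uniformly at random on residues
    1..protein_length, ignoring sequence-context mutability.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(positions)
    hits = 0
    for _ in range(n_perm):
        simulated = [rng.randint(1, protein_length) for _ in positions]
        if mean_pairwise_distance(simulated) <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A low p-value for a gene would be in line with the clustered, non-haploinsufficiency pattern the abstract describes. A uniform null is a simplification, since real mutation rates vary along a coding sequence.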
Subjects
Exome/genetics, Genetic Markers, Haploinsufficiency, Missense Mutation, Neurodevelopmental Disorders/genetics, Humans, Neurodevelopmental Disorders/pathology, Protein Conformation
ABSTRACT
PURPOSE: To delineate the genotype-phenotype correlation in individuals with likely pathogenic variants in the CLTC gene. METHODS: We describe 13 individuals with de novo CLTC variants. Causality of variants was determined by using the tolerance landscape of CLTC and computer-assisted molecular modeling where applicable. Phenotypic abnormalities observed in the individuals identified with missense and in-frame variants were compared with those with nonsense or frameshift variants in CLTC. RESULTS: All de novo variants were judged to be causal. Combining our data with that of 14 previously reported affected individuals (n = 27), all had intellectual disability (ID), ranging from mild to moderate/severe, with or without additional neurologic, behavioral, craniofacial, ophthalmologic, and gastrointestinal features. Microcephaly, hypoplasia of the corpus callosum, and epilepsy were more frequently observed in individuals with missense and in-frame variants than in those with nonsense and frameshift variants. However, this difference was not significant. CONCLUSIONS: The wide phenotypic variability associated with likely pathogenic CLTC variants seems to be associated with allelic heterogeneity. The detailed clinical characterization of a larger cohort of individuals with pathogenic CLTC variants is warranted to support the hypothesis that missense and in-frame variants exert a dominant-negative effect, whereas the nonsense and frameshift variants would result in haploinsufficiency.
Subjects
Epilepsy, Intellectual Disability, Microcephaly, Population Biological Variation, Corpus Callosum, Epilepsy/genetics, Humans, Intellectual Disability/genetics, Microcephaly/genetics, Phenotype
ABSTRACT
The growing availability of human genetic variation has given rise to novel methods of measuring genetic tolerance that better interpret variants of unknown significance. We recently developed a concept based on protein domain homology in the human genome to improve variant interpretation. For this purpose, we mapped population variation from the Exome Aggregation Consortium (ExAC) and pathogenic mutations from the Human Gene Mutation Database (HGMD) onto Pfam protein domains. The aggregation of these variation data across homologous domains into meta-domains allowed us to generate amino acid resolution of genetic intolerance profiles for human protein domains. Here, we developed MetaDome, a fast and easy-to-use web server that visualizes meta-domain information and gene-wide profiles of genetic tolerance. We updated the underlying data of MetaDome to contain information from 56,319 human transcripts, 71,419 protein domains, 12,164,292 genetic variants from gnomAD, and 34,076 pathogenic mutations from ClinVar. MetaDome allows researchers to easily investigate their variants of interest for the presence or absence of variation at corresponding positions within homologous domains. We illustrate the added value of MetaDome by an example that highlights how it may help in the interpretation of variants of unknown significance. The MetaDome web server is freely accessible at https://stuart.radboudumc.nl/metadome.
Subjects
Computational Biology/methods, Genetic Variation, Proteins/chemistry, Proteins/genetics, Genetic Databases, Genetic Predisposition to Disease, Human Genome, Humans, Internet, Protein Domains, Software, Protein Structural Homology
ABSTRACT
Hotspots of rapid genome evolution hold clues about human adaptation. We present a comparative analysis of nine whole-genome sequenced primates to identify high-confidence targets of positive selection. We find strong statistical evidence for positive selection in 331 protein-coding genes (3%), pinpointing 934 adaptively evolving codons (0.014%). Our new procedure is stringent and reveals substantial artefacts (20% of initial predictions) that have inflated previous estimates. The final 331 positively selected genes (PSG) are strongly enriched for innate and adaptive immunity, secreted and cell membrane proteins (e.g. pattern recognition, complement, cytokines, immune receptors, MHC, Siglecs). We also find evidence for positive selection in reproduction and chromosome segregation (e.g. centromere-associated CENPO, CENPT), apolipoproteins, smell/taste receptors and mitochondrial proteins. Focusing on the virus-host interaction, we retrieve most evolutionary conflicts known to influence antiviral activity (e.g. TRIM5, MAVS, SAMHD1, tetherin) and predict 70 novel cases through integration with virus-human interaction data. Protein structure analysis further identifies positive selection in the interaction interfaces between viruses and their cellular receptors (CD4-HIV; CD46-measles, adenoviruses; CD55-picornaviruses). Finally, primate PSG consistently show high sequence variation in human exomes, suggesting ongoing evolution. Our curated dataset of positive selection is a rich source for studying the genetics underlying human (antiviral) phenotypes. Procedures and data are available at https://github.com/robinvanderlee/positive-selection.
Subjects
Molecular Evolution, Genetic Selection, Animals, Artifacts, Gene Conversion, Genetic Variation, Genomics, Host-Pathogen Interactions/genetics, Humans, Immunity/genetics, Multigene Family, Primates/genetics, Proteins/genetics, Virus Receptors/chemistry, Viral Proteins/chemistry, Virus Diseases/genetics
ABSTRACT
Unraveling the causes and pathomechanisms of progressive disorders is essential for the development of therapeutic strategies. Here, we identified heterozygous pathogenic missense variants of LMX1A in two families of Dutch origin with progressive nonsyndromic hearing impairment (HI), using whole exome sequencing. One variant, c.721G>C (p.Val241Leu), occurred de novo and is predicted to affect the homeodomain of LMX1A, which is essential for DNA binding. The second variant, c.290G>C (p.Cys97Ser), is predicted to affect a zinc-binding residue of the second LIM domain that is involved in protein-protein interactions. Bi-allelic deleterious variants of Lmx1a are associated with a complex phenotype in mice, including deafness and vestibular defects, due to arrest of inner ear development. Although Lmx1a mouse mutants demonstrate neurological, skeletal, pigmentation and reproductive system abnormalities, no syndromic features were present in the participating subjects of either family. LMX1A has previously been suggested as a candidate gene for intellectual disability, but our data do not support this, as affected subjects displayed normal cognition. Large variability was observed in the age of onset, (a)symmetry, severity and progression rate of HI. About half of the affected individuals displayed vestibular dysfunction and experienced symptoms thereof. The late-onset progressive phenotype and the absence of cochleovestibular malformations on computed tomography scans indicate that heterozygous defects of LMX1A do not result in severe developmental abnormalities in humans. We propose that a single LMX1A wild-type copy is sufficient for normal development but insufficient for maintenance of cochleovestibular function. Alternatively, minor cochleovestibular developmental abnormalities could eventually lead to the progressive phenotype seen in the families.
Subjects
Hearing Loss/genetics, Heterozygote, LIM-Homeodomain Proteins/genetics, Missense Mutation, Transcription Factors/genetics, Vestibular Diseases/genetics, Adult, Aged, Aged 80 and over, Amino Acid Substitution, Preschool Child, Female, Humans, Male, Middle Aged
Subjects
Human Genome, DNA Sequence Analysis, Humans, Genomics, Reference Standards, Telomere
ABSTRACT
Whole exomes of patients with a genetic disorder are nowadays routinely sequenced but interpretation of the identified genetic variants remains a major challenge. The increased availability of population-based human genetic variation has given rise to measures of genetic tolerance that have been used, for example, to predict disease-causing genes in neurodevelopmental disorders. Here, we investigated whether combining variant information from homologous protein domains can improve variant interpretation. For this purpose, we developed a framework that maps population variation and known pathogenic mutations onto 2,750 "meta-domains." These meta-domains consist of 30,853 homologous Pfam protein domain instances that cover 36% of all human protein coding sequences. We find that genetic tolerance is consistent across protein domain homologues, and that patterns of genetic tolerance faithfully mimic patterns of evolutionary conservation. Furthermore, for a significant fraction (68%) of the meta-domains high-frequency population variation re-occurs at the same positions across domain homologues more often than expected. In addition, we observe that the presence of pathogenic missense variants at an aligned homologous domain position is often paired with the absence of population variation and vice versa. The use of these meta-domains can improve the interpretation of genetic variation.
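Aggregating variants across homologous domain instances into one meta-domain can be illustrated with a small sketch. It assumes each domain instance already comes with a mapping from protein position to aligned consensus position. The data layout, the function name, and the gene labels are hypothetical and are not the framework's actual interface.

```python
from collections import Counter


def aggregate_variants(instances, variants):
    """Count variants per aligned consensus position of one meta-domain.

    instances: dict mapping a gene label to {protein_position: consensus_position}
               for that gene's copy of the domain (hypothetical layout).
    variants:  iterable of (gene label, protein_position) pairs.
    Variants outside any listed domain instance are ignored.
    """
    counts = Counter()
    for gene, protein_pos in variants:
        mapping = instances.get(gene)
        if mapping is None:
            continue  # gene has no instance of this domain
        consensus_pos = mapping.get(protein_pos)
        if consensus_pos is not None:
            counts[consensus_pos] += 1
    return counts


# Toy example: two genes carrying the same three-residue domain.
toy_instances = {
    "GENE_A": {10: 1, 11: 2, 12: 3},
    "GENE_B": {50: 1, 51: 2, 52: 3},
}
toy_variants = [("GENE_A", 11), ("GENE_B", 51), ("GENE_B", 52), ("GENE_C", 5)]
toy_counts = aggregate_variants(toy_instances, toy_variants)
```

Running the same aggregation separately on population variants and on pathogenic variants yields two per-position profiles. Comparing them position by position is the kind of analysis the abstract describes.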
Subjects
Genetic Testing, Genetic Variation, Population Genetics, Protein Domains/genetics, Biological Adaptation/genetics, Chromosome Mapping, Computational Biology/methods, Conserved Sequence, Molecular Evolution, Exome, Gene Ontology, Population Genetics/methods, Genomics/methods, Genotype, Humans, Exome Sequencing
ABSTRACT
Regular exercise has many physical and brain health benefits, yet the molecular mechanisms mediating exercise effects across tissues remain poorly understood. Here we analyzed 400 high-quality DNA methylation, ATAC-seq, and RNA-seq datasets from eight tissues from control and endurance exercise-trained (EET) rats. Integration of baseline datasets mapped the gene location dependence of epigenetic control features and identified differing regulatory landscapes in each tissue. The transcriptional responses to 8 weeks of EET showed little overlap across tissues and predominantly comprised tissue-type enriched genes. We identified sex differences in the transcriptomic and epigenomic changes induced by EET. However, the sex-biased gene responses were linked to shared signaling pathways. We found that many G protein-coupled receptor-encoding genes are regulated by EET, suggesting a role for these receptors in mediating the molecular adaptations to training across tissues. Our findings provide new insights into the mechanisms underlying EET-induced health benefits across organs.
Subjects
Animal Physical Conditioning, Transcriptome, Animals, Animal Physical Conditioning/physiology, Male, Rats, Female, DNA Methylation, Genetic Epigenesis, Epigenomics, Physiological Adaptation/genetics, Organ Specificity, Sprague-Dawley Rats
ABSTRACT
Each human genome has tens of thousands of rare genetic variants; however, identifying impactful rare variants remains a major challenge. We demonstrate how use of personal multi-omics can enable identification of impactful rare variants by using the Multi-Ethnic Study of Atherosclerosis, which included several hundred individuals, with whole-genome sequencing, transcriptomes, methylomes, and proteomes collected across two time points, 10 years apart. We evaluated each multi-omics phenotype's ability to separately and jointly inform functional rare variation. By combining expression and protein data, we observed rare stop variants 62 times and rare frameshift variants 216 times as frequently as controls, compared to 13-27 times as frequently for expression or protein effects alone. We extended a Bayesian hierarchical model, "Watershed," to prioritize specific rare variants underlying multi-omics signals across the regulatory cascade. With this approach, we identified rare variants that exhibited large effect sizes on multiple complex traits including height, schizophrenia, and Alzheimer's disease.