Results 1 - 20 of 113
1.
Nature ; 625(7993): 92-100, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057664

ABSTRACT

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders [1-4], but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.


Subject(s)
Genome, Human , Genomics , Models, Genetic , Mutation , Humans , Access to Information , Databases, Genetic , Datasets as Topic , Gene Frequency , Genome, Human/genetics , Mutation/genetics , Selection, Genetic
2.
Am J Hum Genet ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-39013459

ABSTRACT

Trithorax-related H3K4 methyltransferases, KMT2C and KMT2D, are critical epigenetic modifiers. Haploinsufficiency of KMT2C was only recently recognized as a cause of neurodevelopmental disorder (NDD), so the clinical and molecular spectrums of the KMT2C-related NDD (now designated as Kleefstra syndrome 2) are largely unknown. We ascertained 98 individuals with rare KMT2C variants, including 75 with protein-truncating variants (PTVs). Notably, ∼15% of KMT2C PTVs were inherited. Although the most highly expressed KMT2C transcript consists of only the last four exons, pathogenic PTVs were found in almost all the exons of this large gene. KMT2C variant interpretation can be challenging due to segmental duplications and clonal hematopoiesis-induced artifacts. Using samples from 27 affected individuals, divided into discovery and validation cohorts, we generated a moderate-strength disorder-specific KMT2C DNA methylation (DNAm) signature and demonstrate its utility in classifying non-truncating variants. Based on 81 individuals with pathogenic/likely pathogenic variants, we demonstrate that the KMT2C-related NDD is characterized by developmental delay, intellectual disability, behavioral and psychiatric problems, hypotonia, seizures, short stature, and other comorbidities. The facial module of PhenoScore, applied to photographs of 34 affected individuals, reveals that the KMT2C-related facial gestalt is significantly different from the general NDD population. Finally, using PhenoScore and DNAm signatures, we demonstrate that the KMT2C-related NDD is clinically and epigenetically distinct from Kleefstra and Kabuki syndromes. Overall, we define the clinical features, molecular spectrum, and DNAm signature of the KMT2C-related NDD and demonstrate they are distinct from Kleefstra and Kabuki syndromes, highlighting the need to rename this condition.

3.
Am J Hum Genet ; 110(9): 1496-1508, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37633279

ABSTRACT

Predicted loss of function (pLoF) variants are often highly deleterious and play an important role in disease biology, but many pLoF variants may not result in loss of function (LoF). Here we present a framework that advances interpretation of pLoF variants in research and clinical settings by considering three categories of LoF evasion: (1) predicted rescue by secondary sequence properties, (2) uncertain biological relevance, and (3) potential technical artifacts. We also provide recommendations on adjustments to ACMG/AMP guidelines' PVS1 criterion. Applying this framework to all high-confidence pLoF variants in 22 genes associated with autosomal-recessive disease from the Genome Aggregation Database (gnomAD v.2.1.1) revealed predicted LoF evasion or potential artifacts in 27.3% (304/1,113) of variants. The major reasons were location in the last exon, in a homopolymer repeat, in a low proportion expressed across transcripts (pext) scored region, or the presence of cryptic in-frame splice rescues. Variants predicted to evade LoF or to be potential artifacts were enriched for ClinVar benign variants. PVS1 was downgraded in 99.4% (162/163) of pLoF variants predicted as likely not LoF/not LoF, with 17.2% (28/163) downgraded as a result of our framework, adding to previous guidelines. Variant pathogenicity was affected (mostly from likely pathogenic to VUS) in 20 (71.4%) of these 28 variants. This framework guides assessment of pLoF variants beyond standard annotation pipelines and substantially reduces false positive rates, which is key to ensure accurate LoF variant prediction in both a research and clinical setting.
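The percentages quoted in this abstract can be re-derived from the counts given alongside them. The following is a minimal sketch of that arithmetic check; the dictionary labels and the helper names `percent_matches` and `check_all` are illustrative assumptions, not part of the published framework.

```python
# Counts and percentages as quoted in the abstract: (numerator, denominator, reported %).
# The labels are paraphrases chosen for this sketch.
REPORTED = {
    "pLoF variants with predicted LoF evasion or potential artifacts": (304, 1113, 27.3),
    "PVS1 downgraded among likely not LoF/not LoF variants": (162, 163, 99.4),
    "PVS1 downgraded as a result of the framework": (28, 163, 17.2),
    "variant pathogenicity affected among those 28 variants": (20, 28, 71.4),
}


def percent_matches(numerator: int, denominator: int, reported_pct: float) -> bool:
    """True if numerator/denominator agrees with the reported percentage
    once rounded to one decimal place (tolerance of half a rounding step)."""
    return abs(100.0 * numerator / denominator - reported_pct) < 0.05


def check_all(rows: dict) -> dict:
    """Map each label to whether its reported percentage is consistent with its counts."""
    return {label: percent_matches(n, d, pct) for label, (n, d, pct) in rows.items()}


if __name__ == "__main__":
    for label, ok in check_all(REPORTED).items():
        print(f"{'consistent' if ok else 'MISMATCH':>10}  {label}")
```

All four reported percentages agree with their counts under this check.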


Subject(s)
Inheritance Patterns , Humans , Exons , Uncertainty
4.
Am J Hum Genet ; 110(8): 1229-1248, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37541186

ABSTRACT

Despite advances in clinical genetic testing, including the introduction of exome sequencing (ES), more than 50% of individuals with a suspected Mendelian condition lack a precise molecular diagnosis. Clinical evaluation is increasingly undertaken by specialists outside of clinical genetics, often occurring in a tiered fashion and typically ending after ES. The current diagnostic rate reflects multiple factors, including technical limitations, incomplete understanding of variant pathogenicity, missing genotype-phenotype associations, complex gene-environment interactions, and reporting differences between clinical labs. Maintaining a clear understanding of the rapidly evolving landscape of diagnostic tests beyond ES, and their limitations, presents a challenge for non-genetics professionals. Newer tests, such as short-read genome or RNA sequencing, can be challenging to order, and emerging technologies, such as optical genome mapping and long-read DNA sequencing, are not available clinically. Furthermore, there is no clear guidance on the next best steps after inconclusive evaluation. Here, we review why a clinical genetic evaluation may be negative, discuss questions to be asked in this setting, and provide a framework for further investigation, including the advantages and disadvantages of new approaches that are nascent in the clinical sphere. We present a guide for the next best steps after inconclusive molecular testing based upon phenotype and prior evaluation, including when to consider referral to research consortia focused on elucidating the underlying cause of rare unsolved genetic disorders.


Subject(s)
Exome , Genetic Testing , Humans , Exome/genetics , Sequence Analysis, DNA , Phenotype , Exome Sequencing , Rare Diseases
5.
Am J Hum Genet ; 110(3): 499-515, 2023 03 02.
Article in English | MEDLINE | ID: mdl-36724785

ABSTRACT

Telomere maintenance 2 (TELO2), Tel2 interacting protein 2 (TTI2), and Tel2 interacting protein 1 (TTI1) are the three components of the conserved Triple T (TTT) complex that modulates activity of phosphatidylinositol 3-kinase-related protein kinases (PIKKs), including mTOR, ATM, and ATR, by regulating the assembly of mTOR complex 1 (mTORC1). The TTT complex is essential for the expression, maturation, and stability of ATM and ATR in response to DNA damage. TELO2- and TTI2-related bi-allelic autosomal-recessive (AR) encephalopathies have been described in individuals with moderate to severe intellectual disability (ID), short stature, postnatal microcephaly, and a movement disorder (in the case of variants within TELO2). We present clinical, genomic, and functional data from 11 individuals in 9 unrelated families with bi-allelic variants in TTI1. All present with ID, and most with microcephaly, short stature, and a movement disorder. Functional studies performed in HEK293T cell lines and in fibroblasts and lymphoblastoid cells derived from 4 unrelated individuals showed impairment of the TTT complex and of mTOR pathway activity, which is improved by treatment with rapamycin. Our data delineate a TTI1-related neurodevelopmental disorder and expand the group of disorders related to the TTT complex.


Subject(s)
Microcephaly , Movement Disorders , Neurodevelopmental Disorders , Humans , Intracellular Signaling Peptides and Proteins , HEK293 Cells , TOR Serine-Threonine Kinases
6.
Am J Hum Genet ; 110(9): 1454-1469, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37595579

ABSTRACT

Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.


Subject(s)
Autism Spectrum Disorder , Female , Pregnancy , Humans , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Pregnancy Trimester, First , Ultrasonography, Prenatal , Chromosome Mapping , Exome
7.
Nature ; 581(7809): 452-458, 2020 05.
Article in English | MEDLINE | ID: mdl-32461655

ABSTRACT

The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD) [1], we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project [2] and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.
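The 'proportion expressed across transcripts' idea in this abstract can be illustrated with a small calculation: for one variant, take the expression of the transcripts on which the variant carries a given annotation and divide by the expression of all transcripts of the gene. The sketch below is a simplification written for illustration only. The function names, the input layout, the handling of tissues with zero expression, and the plain averaging across tissues are assumptions, not the authors' implementation.

```python
from typing import Dict, Set


def pext_single_tissue(expression: Dict[str, float], annotated: Set[str]) -> float:
    """Fraction of a gene's total transcript expression in one tissue that comes
    from transcripts on which the variant carries the annotation of interest.

    expression: transcript ID -> expression level (e.g. TPM) for every transcript of the gene.
    annotated:  transcript IDs on which the variant has the annotation (e.g. pLoF).
    """
    total = sum(expression.values())
    if total == 0:
        # Assumption for this sketch: a gene with no expression in the tissue scores 0.
        return 0.0
    carried = sum(level for tx, level in expression.items() if tx in annotated)
    return carried / total


def mean_pext(per_tissue: Dict[str, Dict[str, float]], annotated: Set[str]) -> float:
    """Average the single-tissue values across tissues (assumed summary statistic)."""
    values = [pext_single_tissue(expr, annotated) for expr in per_tissue.values()]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical gene with two transcripts; the variant is pLoF only on TX2.
    toy = {
        "brain": {"TX1": 8.0, "TX2": 2.0},
        "muscle": {"TX1": 5.0, "TX2": 5.0},
    }
    print(mean_pext(toy, {"TX2"}))
```

A variant that is disruptive only on weakly expressed transcripts gets a value near 0, which is the situation the abstract describes for falsely annotated pLoF variants.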


Subject(s)
Disease/genetics , Haploinsufficiency/genetics , Loss of Function Mutation/genetics , Molecular Sequence Annotation , Transcription, Genetic , Transcriptome/genetics , Autism Spectrum Disorder/genetics , Datasets as Topic , Developmental Disabilities/genetics , Exons/genetics , Female , Genotype , Humans , Intellectual Disability/genetics , Male , Molecular Sequence Annotation/standards , Poisson Distribution , RNA, Messenger/analysis , RNA, Messenger/genetics , Rare Diseases/diagnosis , Rare Diseases/genetics , Reproducibility of Results , Exome Sequencing
8.
Nature ; 581(7809): 444-451, 2020 05.
Article in English | MEDLINE | ID: mdl-32461652

ABSTRACT

Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage [6]. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings [7]. This SV resource is freely distributed via the gnomAD browser [8] and will have broad utility in population genetics, disease-association studies, and diagnostic screening.


Subject(s)
Disease/genetics , Genetic Variation , Genetics, Medical/standards , Genetics, Population/standards , Genome, Human/genetics , Female , Genetic Testing , Genotyping Techniques , Humans , Male , Middle Aged , Mutation , Polymorphism, Single Nucleotide/genetics , Racial Groups/genetics , Reference Standards , Selection, Genetic , Whole Genome Sequencing
9.
Nature ; 581(7809): 434-443, 2020 05.
Article in English | MEDLINE | ID: mdl-32461654

ABSTRACT

Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.


Subject(s)
Exome/genetics , Genes, Essential/genetics , Genetic Variation/genetics , Genome, Human/genetics , Adult , Brain/metabolism , Cardiovascular Diseases/genetics , Cohort Studies , Databases, Genetic , Female , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Loss of Function Mutation/genetics , Male , Mutation Rate , Proprotein Convertase 9/genetics , RNA, Messenger/genetics , Reproducibility of Results , Exome Sequencing , Whole Genome Sequencing
10.
Am J Hum Genet ; 109(12): 2163-2177, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36413997

ABSTRACT

Recommendations from the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) for interpreting sequence variants specify the use of computational predictors as "supporting" level of evidence for pathogenicity or benignity using criteria PP3 and BP4, respectively. However, score intervals defined by tool developers, and ACMG/AMP recommendations that require the consensus of multiple predictors, lack quantitative support. Previously, we described a probabilistic framework that quantified the strengths of evidence (supporting, moderate, strong, very strong) within ACMG/AMP recommendations. We have extended this framework to computational predictors and introduce a new standard that converts a tool's scores to PP3 and BP4 evidence strengths. Our approach is based on estimating the local positive predictive value and can calibrate any computational tool or other continuous-scale evidence on any variant type. We estimate thresholds (score intervals) corresponding to each strength of evidence for pathogenicity and benignity for thirteen missense variant interpretation tools, using carefully assembled independent data sets. Most tools achieved supporting evidence level for both pathogenic and benign classification using newly established thresholds. Multiple tools reached score thresholds justifying moderate and several reached strong evidence levels. One tool reached very strong evidence level for benign classification on some variants. Based on these findings, we provide recommendations for evidence-based revisions of the PP3 and BP4 ACMG/AMP criteria using individual tools and future assessment of computational methods for clinical interpretation.


Subject(s)
Calibration , Humans , Consensus , Educational Status , Virulence
11.
Am J Hum Genet ; 109(4): 750-758, 2022 04 07.
Article in English | MEDLINE | ID: mdl-35202563

ABSTRACT

Chromatin is essentially an array of nucleosomes, each of which consists of the DNA double-stranded fiber wrapped around a histone octamer. This organization supports cellular processes such as DNA replication, DNA transcription, and DNA repair in all eukaryotes. Human histone H4 is encoded by fourteen canonical histone H4 genes, all differing at the nucleotide level but encoding an invariant protein. Here, we present a cohort of 29 subjects with de novo missense variants in six H4 genes (H4C3, H4C4, H4C5, H4C6, H4C9, and H4C11) identified by whole-exome sequencing and matchmaking. All individuals present with neurodevelopmental features of intellectual disability and motor and/or gross developmental delay, while non-neurological features are more variable. Ten amino acids are affected, six recurrently, and are all located within the H4 core or C-terminal tail. These variants cluster to specific regions of the core H4 globular domain, where protein-protein interactions occur with either other histone subunits or histone chaperones. Functional consequences of the identified variants were evaluated in zebrafish embryos, which displayed abnormal general development, defective head organs, and reduced body axis length, providing compelling evidence for the causality of the reported disorder(s). While multiple developmental syndromes have been linked to chromatin-associated factors, missense-bearing histone variants (e.g., H3 oncohistones) are only recently emerging as a major cause of pathogenicity. Our findings establish a broader involvement of H4 variants in developmental syndromes.


Subject(s)
Histones , Zebrafish , Animals , Chromatin , DNA , Histones/metabolism , Humans , Syndrome , Zebrafish/genetics , Zebrafish/metabolism
13.
Am J Hum Genet ; 108(8): 1450-1465, 2021 08 05.
Article in English | MEDLINE | ID: mdl-34186028

ABSTRACT

The genetic causes of global developmental delay (GDD) and intellectual disability (ID) are diverse and include variants in numerous ion channels and transporters. Loss-of-function variants in all five endosomal/lysosomal members of the CLC family of Cl⁻ channels and Cl⁻/H⁺ exchangers lead to pathology in mice, humans, or both. We have identified nine variants in CLCN3, the gene encoding ClC-3, in 11 individuals with GDD/ID and neurodevelopmental disorders of varying severity. In addition to a homozygous frameshift variant in two siblings, we identified eight different heterozygous de novo missense variants. All have GDD/ID, mood or behavioral disorders, and dysmorphic features; 9/11 have structural brain abnormalities; and 6/11 have seizures. The homozygous variants are predicted to cause loss of ClC-3 function, resulting in severe neurological disease similar to the phenotype observed in Clcn3⁻/⁻ mice. Their MRIs show possible neurodegeneration with thin corpora callosa and decreased white matter volumes. Individuals with heterozygous variants had a range of neurodevelopmental anomalies including agenesis of the corpus callosum, pons hypoplasia, and increased gyral folding. To characterize the altered function of the exchanger, electrophysiological analyses were performed in Xenopus oocytes and mammalian cells. Two variants, p.Ile607Thr and p.Thr570Ile, had increased currents at negative cytoplasmic voltages and loss of inhibition by luminal acidic pH. In contrast, two other variants showed no significant difference in the current properties. Overall, our work establishes a role for CLCN3 in human neurodevelopment and shows that both homozygous loss of ClC-3 and heterozygous variants can lead to GDD/ID and neuroanatomical abnormalities.


Subject(s)
Chloride Channels/genetics , Disease Models, Animal , Ion Channels/physiology , Mutation , Neurodevelopmental Disorders/pathology , Phenotype , Adolescent , Animals , Child , Child, Preschool , Female , Homozygote , Humans , Infant , Infant, Newborn , Male , Mice , Mice, Knockout , Neurodevelopmental Disorders/etiology , Neurodevelopmental Disorders/metabolism
14.
Am J Hum Genet ; 108(5): 840-856, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33861953

ABSTRACT

JAG2 encodes the Notch ligand Jagged2. The conserved Notch signaling pathway contributes to the development and homeostasis of multiple tissues, including skeletal muscle. We studied an international cohort of 23 individuals with genetically unsolved muscular dystrophy from 13 unrelated families. Whole-exome sequencing identified rare homozygous or compound heterozygous JAG2 variants in all 13 families. The identified bi-allelic variants include 10 missense variants that disrupt highly conserved amino acids, a nonsense variant, two frameshift variants, an in-frame deletion, and a microdeletion encompassing JAG2. Onset of muscle weakness occurred from infancy to young adulthood. Serum creatine kinase (CK) levels were normal or mildly elevated. Muscle histology was primarily dystrophic. MRI of the lower extremities revealed a distinct, slightly asymmetric pattern of muscle involvement with cores of preserved and affected muscles in quadriceps and tibialis anterior, in some cases resembling patterns seen in POGLUT1-associated muscular dystrophy. Transcriptome analysis of muscle tissue from two participants suggested misregulation of genes involved in myogenesis, including PAX7. In complementary studies, Jag2 downregulation in murine myoblasts led to downregulation of multiple components of the Notch pathway, including Megf10. Investigations in Drosophila suggested an interaction between Serrate and Drpr, the fly orthologs of JAG1/JAG2 and MEGF10, respectively. In silico analysis predicted that many Jagged2 missense variants are associated with structural changes and protein misfolding. In summary, we describe a muscular dystrophy associated with pathogenic variants in JAG2 and evidence suggests a disease mechanism related to Notch pathway dysfunction.


Subject(s)
Jagged-2 Protein/genetics , Muscular Dystrophies/genetics , Adolescent , Adult , Amino Acid Sequence , Animals , Cell Line , Child , Child, Preschool , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Female , Glucosyltransferases/genetics , Haplotypes/genetics , Humans , Jagged-1 Protein/genetics , Jagged-2 Protein/chemistry , Jagged-2 Protein/deficiency , Jagged-2 Protein/metabolism , Male , Membrane Proteins/genetics , Mice , Middle Aged , Models, Molecular , Muscles/metabolism , Muscles/pathology , Muscular Dystrophies/pathology , Myoblasts/metabolism , Myoblasts/pathology , Pedigree , Phenotype , Receptors, Notch/metabolism , Signal Transduction , Exome Sequencing , Young Adult
15.
Genet Med ; 26(4): 101073, 2024 04.
Article in English | MEDLINE | ID: mdl-38245859

ABSTRACT

PURPOSE: The 100,000 Genomes Project diagnosed a quarter of affected participants, but 26% of diagnoses were not on the applied gene panel(s), with many being de novo variants. Assessing biallelic variants without a gene panel is more challenging. METHODS: We sought to identify missed biallelic diagnoses using GenePy, which incorporates allele frequency, zygosity, and a user-defined deleterious metric, generating an aggregate GenePy score per gene, per participant. We calculated GenePy scores for 2862 recessive disease genes in 78,216 100,000 Genomes Project participants. For each gene, we ranked participant GenePy scores and scrutinized affected participants without a diagnosis, whose scores ranked among the top 5 for each gene. In cases in which participant phenotypes overlapped with the disease gene of interest, we extracted rare variants and applied phase, ClinVar, and ACMG classification. RESULTS: 3184 affected individuals without a molecular diagnosis had a top-5-ranked GenePy score and 682 of 3184 (21%) had phenotypes overlapping with a top-ranking gene. In 122 of 669 (18%) phenotype-matched cases (excluding 13 withdrawn participants), we identified a putative missed diagnosis (2.2% of all undiagnosed participants). A further 334 of 669 (50%) cases have a possible missed diagnosis but require functional validation. CONCLUSION: Applying GenePy at scale has identified 456 potential diagnoses, demonstrating the value of novel diagnostic strategies.
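The ranking step in the METHODS above (rank participants per gene, keep the top 5, then look at those without a diagnosis) can be sketched as follows. This is an illustration only and does not compute GenePy scores. The function names, the tuple layout, and the assumption that a higher score ranks first are hypothetical choices made for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def top_ranked(scores: Iterable[Tuple[str, str, float]], k: int = 5) -> Dict[str, List[str]]:
    """For each gene, return the k participants with the highest score.

    scores: (gene, participant_id, score) triples; higher is assumed to rank first.
    Ties keep their input order because Python's sort is stable.
    """
    by_gene: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for gene, participant, score in scores:
        by_gene[gene].append((score, participant))
    return {
        gene: [p for _, p in sorted(rows, key=lambda row: row[0], reverse=True)[:k]]
        for gene, rows in by_gene.items()
    }


def undiagnosed_candidates(top: Dict[str, List[str]], diagnosed: Set[str]) -> Dict[str, List[str]]:
    """Drop participants who already have a molecular diagnosis from each top-k list."""
    return {gene: [p for p in people if p not in diagnosed] for gene, people in top.items()}


if __name__ == "__main__":
    # Hypothetical scores for one gene and seven participants.
    toy = [("GENE_A", f"P{i}", float(i)) for i in range(1, 8)]
    top5 = top_ranked(toy)
    print(undiagnosed_candidates(top5, diagnosed={"P7"}))
```

The counts in the RESULTS are internally consistent with this workflow: 682 phenotype-matched cases minus 13 withdrawn participants gives 669, and 122 putative plus 334 possible missed diagnoses gives the 456 potential diagnoses quoted in the CONCLUSION.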


Subject(s)
Missed Diagnosis , Humans , Virulence , Gene Frequency/genetics , Phenotype , Genes, Recessive
16.
Hum Genet ; 2023 Feb 04.
Article in English | MEDLINE | ID: mdl-36739343

ABSTRACT

Reference population databases like the Genome Aggregation Database (gnomAD) have improved our ability to interpret the human genome. Variant frequencies and frequency-derived tools (such as depletion scores) have become fundamental to variant interpretation and the assessment of variant-gene-disease relationships. Clonal hematopoiesis (CH) obstructs variant interpretation as somatic variants that provide proliferative advantage will affect variant frequencies, depletion scores, and downstream filtering. Further, default filtering of variants or genes associated with CH risks filtering bona fide germline variants as variants associated with CH can also cause Mendelian conditions. Here, we provide our insights on interpreting population variant data in genes affected by clonal hematopoiesis, as well as recommendations for careful review of 36 established CH genes associated with neurodevelopmental conditions.

17.
Hum Genet ; 142(3): 351-362, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36477409

ABSTRACT

BACKGROUND: Genome sequencing was first offered clinically in the UK through the 100,000 Genomes Project (100KGP). Analysis was restricted to predefined gene panels associated with the patient's phenotype. However, panels rely on clearly characterised phenotypes and risk missing diagnoses outside of the panel(s) applied. We propose a complementary method to rapidly identify pathogenic variants, including those missed by 100KGP methods. METHODS: The Loss-of-function Observed/Expected Upper-bound Fraction (LOEUF) score quantifies gene constraint, with low scores correlated with haploinsufficiency. We applied DeNovoLOEUF, a filtering strategy, to sequencing data from 13,949 rare disease trios in the 100KGP, filtering for rare, de novo, loss-of-function variants in disease genes with a LOEUF score < 0.2. We compared our findings with the corresponding patient's diagnostic reports. RESULTS: 324/332 (98%) of the variants identified using DeNovoLOEUF were diagnostic or partially diagnostic (whereby the variant was responsible for some of the phenotype). We identified 39 diagnoses that were "missed" by 100KGP standard analyses, which are now being returned to patients. CONCLUSION: We have demonstrated a highly specific and rapid method with a 98% positive predictive value that has good concordance with standard analysis, low false-positive rate, and can identify additional diagnoses. Globally, as more patients are being offered genome sequencing, we anticipate that DeNovoLOEUF will rapidly identify new diagnoses and facilitate iterative analyses when new disease genes are discovered.
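The filter described in the METHODS above combines four conditions: the variant is rare, de novo, and loss-of-function, and it lies in a disease gene with a LOEUF score below 0.2. A minimal sketch of such a filter is given below. The `Variant` record, its field names, and the `max_allele_frequency` default are hypothetical, since the abstract does not state a frequency threshold; only the LOEUF cutoff of 0.2 comes from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Variant:
    """Hypothetical per-variant record for one trio proband."""
    gene: str
    is_de_novo: bool
    is_loss_of_function: bool
    allele_frequency: float  # population allele frequency
    gene_loeuf: float        # LOEUF score of the gene
    in_disease_gene: bool


def de_novo_loeuf_filter(
    variants: Iterable[Variant],
    loeuf_cutoff: float = 0.2,            # cutoff stated in the abstract
    max_allele_frequency: float = 1e-4,   # placeholder: "rare" is not quantified in the abstract
) -> List[Variant]:
    """Keep rare, de novo, loss-of-function variants in constrained disease genes."""
    return [
        v for v in variants
        if v.is_de_novo
        and v.is_loss_of_function
        and v.in_disease_gene
        and v.allele_frequency <= max_allele_frequency
        and v.gene_loeuf < loeuf_cutoff
    ]


if __name__ == "__main__":
    candidates = [
        Variant("GENE_A", True, True, 0.0, 0.10, True),    # passes every condition
        Variant("GENE_B", True, True, 0.0, 0.35, True),    # gene not constrained enough
        Variant("GENE_C", False, True, 0.0, 0.05, True),   # inherited, not de novo
    ]
    print([v.gene for v in de_novo_loeuf_filter(candidates)])
```

The reported positive predictive value follows from the counts in the RESULTS: 324 of 332 is about 97.6%, which rounds to the 98% quoted.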


Subject(s)
Genome , Phenotype , Whole Genome Sequencing/methods
18.
Genome Res ; 30(1): 62-71, 2020 01.
Article in English | MEDLINE | ID: mdl-31871067

ABSTRACT

Missense variant interpretation is challenging. Essential regions for protein function are conserved among gene-family members, and genetic variants within these regions are potentially more likely to confer risk to disease. Here, we generated 2871 gene-family protein sequence alignments involving 9990 genes and performed missense variant burden analyses to identify novel essential protein regions. We mapped 2,219,811 variants from the general population into these alignments and compared their distribution with 76,153 missense variants from patients. With this gene-family approach, we identified 465 regions enriched for patient variants spanning 41,463 amino acids in 1252 genes. As a comparison, by testing the same genes individually, we identified fewer patient variant enriched regions, involving only 2639 amino acids and 215 genes. Next, we selected de novo variants from 6753 patients with neurodevelopmental disorders and 1911 unaffected siblings and observed an 8.33-fold enrichment of patient variants in our identified regions (95% C.I. = 3.90-Inf, P-value = 2.72 × 10⁻¹¹). By using the complete ClinVar variant set, we found that missense variants inside the identified regions are 106-fold more likely to be classified as pathogenic in comparison to benign classification (OR = 106.15, 95% C.I. = 70.66-Inf, P-value < 2.2 × 10⁻¹⁶). All pathogenic variant enriched regions (PERs) identified are available online through "PER viewer," a user-friendly online platform for interactive data mining, visualization, and download. In summary, our gene-family burden analysis approach identified novel PERs in protein sequences. This annotation can empower variant interpretation.


Subject(s)
Chromosome Mapping , Genetic Predisposition to Disease , Genetic Variation , Multigene Family , Alleles , Amino Acid Sequence , Amino Acid Substitution , Computational Biology/methods , Female , Genome-Wide Association Study , Humans , Male , Mutation, Missense , Software , User-Computer Interface
19.
Genet Med ; 25(7): 100839, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37057675

ABSTRACT

PURPOSE: LHX2 encodes the LIM homeobox 2 transcription factor (LHX2), which is highly expressed in brain and well conserved across species, but it has not been clearly linked to neurodevelopmental disorders (NDDs) to date. METHODS: Through international collaboration, we identified 19 individuals from 18 families with variable neurodevelopmental phenotypes, carrying a small chromosomal deletion, likely gene-disrupting or missense variants in LHX2. Functional consequences of missense variants were investigated in cellular systems. RESULTS: Affected individuals presented with developmental and/or behavioral abnormalities, autism spectrum disorder, variable intellectual disability, and microcephaly. We observed nucleolar accumulation for 2 missense variants located within the DNA-binding HOX domain, impaired interaction with co-factor LDB1 for another variant located in the protein-protein interaction-mediating LIM domain, and impaired transcriptional activation by luciferase assay for 4 missense variants. CONCLUSION: We implicate LHX2 haploinsufficiency by deletion and likely gene-disrupting variants as causative for a variable NDD. Our findings suggest a loss-of-function mechanism also for likely pathogenic LHX2 missense variants. Together, our observations underscore the importance of LHX2 in the nervous system and for variable neurodevelopmental phenotypes.


Subject(s)
Autism Spectrum Disorder , Intellectual Disability , Neurodevelopmental Disorders , Humans , LIM-Homeodomain Proteins/genetics , Autism Spectrum Disorder/genetics , Haploinsufficiency/genetics , Neurodevelopmental Disorders/pathology , Transcription Factors/genetics , Intellectual Disability/genetics , Intellectual Disability/complications
20.
Acta Neuropathol ; 145(4): 479-496, 2023 04.
Article in English | MEDLINE | ID: mdl-36799992

ABSTRACT

DTNA encodes α-dystrobrevin, a component of the macromolecular dystrophin-glycoprotein complex (DGC) that binds to dystrophin/utrophin and α-syntrophin. Mice lacking α-dystrobrevin have a muscular dystrophy phenotype, but variants in DTNA have not previously been associated with human skeletal muscle disease. We present 12 individuals from four unrelated families with two different monoallelic DTNA variants affecting the coiled-coil domain of α-dystrobrevin. The five affected individuals from family A harbor a c.1585G>A; p.Glu529Lys variant, while the recurrent c.1567_1587del; p.Gln523_Glu529del DTNA variant was identified in the other three families (family B: four affected individuals, family C: one affected individual, and family D: two affected individuals). Myalgia and exercise intolerance, with variable ages of onset, were reported in 10 of 12 affected individuals. Proximal lower limb weakness with onset in the first decade of life was noted in three individuals. Persistent elevations of serum creatine kinase (CK) levels were detected in 11 of 12 affected individuals, 1 of whom had an episode of rhabdomyolysis at 20 years of age. Autism spectrum disorder or learning disabilities were reported in four individuals with the c.1567_1587 deletion. Muscle biopsies in eight affected individuals showed mixed myopathic and dystrophic findings, characterized by fiber size variability, internalized nuclei, and slightly increased extracellular connective tissue and inflammation. Immunofluorescence analysis of biopsies from five affected individuals showed reduced α-dystrobrevin immunoreactivity and variably reduced immunoreactivity of other DGC proteins: dystrophin, α, β, δ and γ-sarcoglycans, and α and β-dystroglycans. The DTNA deletion disrupted an interaction between α-dystrobrevin and syntrophin. Specific variants in the coiled-coil domain of DTNA cause skeletal muscle disease with variable penetrance. Affected individuals show a spectrum of clinical manifestations, with severity ranging from hyperCKemia, myalgias, and exercise intolerance to childhood-onset proximal muscle weakness. Our findings expand the molecular etiologies of both muscular dystrophy and paucisymptomatic hyperCKemia, to now include monoallelic DTNA variants as a novel cause of skeletal muscle disease in humans.


Subject(s)
Autism Spectrum Disorder , Muscular Dystrophies , Neuropeptides , Mice , Humans , Animals , Child , Dystrophin/genetics , Dystrophin/metabolism , Autism Spectrum Disorder/metabolism , Muscular Dystrophies/metabolism , Dystroglycans/metabolism , Alternative Splicing , Muscle, Skeletal/pathology , Neuropeptides/genetics , Neuropeptides/metabolism , Dystrophin-Associated Proteins/genetics , Dystrophin-Associated Proteins/metabolism