ABSTRACT
Hearing loss is the leading sensory deficit, affecting ~5% of the population. It exhibits remarkable heterogeneity across 223 genes with 6,328 pathogenic missense variants, making deafness-specific expertise a prerequisite for ascribing phenotypic consequences to genetic variants. Deafness-implicated variants are curated in the Deafness Variation Database (DVD) after classification by a genetic hearing loss expert panel and a thorough informatics pipeline. However, 70% of the 128,167 missense variants in the DVD are "variants of uncertain significance" (VUS) due to insufficient evidence for classification. Here, we use the deep learning protein structure prediction algorithm AlphaFold2 to curate structures for all DVD genes. We refine these structures with global optimization and the AMOEBA force field and use DDGun3D to predict folding free energy differences (∆∆G_Fold) for all DVD missense variants. We find that 5,772 VUSs have a large, destabilizing ∆∆G_Fold that is consistent with pathogenic variants. When also filtered for CADD scores (> 25.7), we determine that 3,456 VUSs are likely pathogenic with a probability of 99.0%. Of the 224 genes in the DVD, 166 genes (74%) exhibit one or more missense variants predicted to cause a pathogenic change in protein folding stability. The VUSs prioritized here affect 119 patients (~3% of cases) sequenced by the OtoSCOPE targeted panel. Approximately half of these patients previously received an inconclusive report, and reclassification of these VUSs as pathogenic provides a new genetic diagnosis for six patients.
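The two-stage filter described in this abstract (a destabilizing folding free energy change, then a CADD score above 25.7) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the `Variant` record, the `prioritize_vus` helper, the sign convention (positive ∆∆G_Fold taken as destabilizing), and the `DDG_CUTOFF` value are all hypothetical, since the abstract does not give the ∆∆G_Fold threshold. Only the CADD cutoff of 25.7 comes from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List

CADD_CUTOFF = 25.7  # stated in the abstract ("CADD scores (> 25.7)")
DDG_CUTOFF = 1.0    # HYPOTHETICAL placeholder in kcal/mol; the abstract gives no value


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one missense variant."""
    gene: str
    change: str
    classification: str  # e.g. "VUS", "pathogenic", "benign"
    ddg_fold: float      # predicted folding free energy change, kcal/mol
                         # ASSUMPTION: positive values mean destabilizing
    cadd: float          # PHRED-scaled CADD score


def prioritize_vus(
    variants: Iterable[Variant],
    ddg_cutoff: float = DDG_CUTOFF,
    cadd_cutoff: float = CADD_CUTOFF,
) -> List[Variant]:
    """Keep VUSs that are both strongly destabilizing and above the CADD cutoff."""
    return [
        v
        for v in variants
        if v.classification == "VUS"
        and v.ddg_fold >= ddg_cutoff  # stage 1: large, destabilizing change
        and v.cadd > cadd_cutoff      # stage 2: strict ">" as in the abstract
    ]


if __name__ == "__main__":
    # Made-up example records; gene and variant names are placeholders.
    examples = [
        Variant("GENE_A", "p.Xxx1Yyy", "VUS", 2.3, 28.1),
        Variant("GENE_A", "p.Xxx2Yyy", "VUS", 0.2, 30.0),
        Variant("GENE_B", "p.Xxx3Yyy", "VUS", 1.8, 25.7),
        Variant("GENE_B", "p.Xxx4Yyy", "benign", 2.9, 29.4),
    ]
    for v in prioritize_vus(examples):
        print(v.gene, v.change)
```

Keeping both cutoffs as parameters makes it easy to substitute the thresholds actually used in the study once they are known.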
Subjects
Deafness; Hearing Loss; Humans; Proteome/genetics; Hearing Loss/genetics; Mutation, Missense; Deafness/genetics
ABSTRACT
The classification of genetic variants represents a major challenge in the post-genome era by virtue of their extraordinary number and the complexities associated with ascribing a clinical impact, especially for disorders exhibiting exceptional phenotypic, genetic, and allelic heterogeneity. To address this challenge for hearing loss, we have developed the Deafness Variation Database (DVD), a comprehensive, open-access resource that integrates all available genetic, genomic, and clinical data together with expert curation to generate a single classification for each variant in 152 genes implicated in syndromic and non-syndromic deafness. We evaluate 876,139 variants and classify them as pathogenic or likely pathogenic (more than 8,100 variants), benign or likely benign (more than 172,000 variants), or of uncertain significance (more than 695,000 variants); 1,270 variants are re-categorized based on expert curation and in 300 instances, the change is of medical significance and impacts clinical care. We show that more than 96% of coding variants are rare and novel and that pathogenicity is driven by minor allele frequency thresholds, variant effect, and protein domain. The mutational landscape we define shows complex gene-specific variability, making an understanding of these nuances foundational for improved accuracy in variant interpretation in order to enhance clinical decision making and improve our understanding of deafness biology.
Subjects
Deafness/genetics; Mutation/genetics; Databases, Genetic; Gene Frequency/genetics; Genomics/methods; Hearing Loss/genetics; Humans
ABSTRACT
We present detailed comparative analyses to assess population-level differences in patterns of genetic deafness between European/American and Japanese cohorts with non-syndromic hearing loss. One thousand eighty-three audiometric test results (921 European/American and 162 Japanese) from members of 168 families (48 European/American and 120 Japanese) with non-syndromic hearing loss secondary to pathogenic variants in one of three genes (KCNQ4, TECTA, WFS1) were studied. Audioprofile characteristics, specific mutation types, and protein domains were considered in the comparative analyses. Our findings support differences in audioprofiles driven by both mutation type (non-truncating vs. truncating) and ethnic background. The former finding confirms data that ascribe a phenotypic consequence to different mutation types in KCNQ4; the latter finding suggests that there are ethnic-specific effects (genetic and/or environmental) that impact gene-specific audioprofiles for TECTA and WFS1. Identifying the drivers of ethnic differences will refine our understanding of phenotype-genotype relationships and the biology of hearing and deafness.
Subjects
Extracellular Matrix Proteins/genetics; Genotype; Hearing Loss, Sensorineural/genetics; KCNQ Potassium Channels/genetics; Membrane Proteins/genetics; Mutation; Adolescent; Adult; Aged; Aged, 80 and over; Asian People; Audiometry; Case-Control Studies; Child; Child, Preschool; Female; GPI-Linked Proteins/genetics; Gene Expression; Genetic Association Studies; Hearing Loss, Sensorineural/diagnosis; Hearing Loss, Sensorineural/ethnology; Hearing Loss, Sensorineural/physiopathology; Humans; Infant; Infant, Newborn; Japan; Male; Middle Aged; Pedigree; Phenotype; United States; White People
ABSTRACT
Hearing loss is associated with ~8100 mutations in 152 genes, and within the coding regions of these genes are over 60,000 missense variants. The majority of these variants are classified as "variants of uncertain significance" to reflect our inability to ascribe a phenotypic effect to the observed amino acid change. A promising source of pathogenicity information is biophysical simulation, although input protein structures often contain defects because of limitations in experimental data and/or only distant homology to a template. Here, we combine the polarizable atomic multipole optimized energetics for biomolecular applications (AMOEBA) force field, many-body optimization theory, and graphics processing unit acceleration to repack all deafness-associated proteins and thereby improve the average structure MolProbity score from 2.2 to 1.0. We then used these optimized wild-type models to create over 60,000 structures for missense variants, which are being incorporated into the Deafness Variation Database to inform deafness pathogenicity prediction. Finally, this work demonstrates that advanced polarizable atomic multipole force fields are efficient enough to repack the entire human proteome.
Subjects
Algorithms; Hearing Loss/genetics; Proteins/chemistry; Biophysical Phenomena; Databases, Protein; Humans; Models, Molecular
ABSTRACT
Hearing loss is the leading sensory deficit, affecting ~5% of the population. It exhibits remarkable heterogeneity across 223 genes with 6,328 pathogenic missense variants, making deafness-specific expertise a prerequisite for ascribing phenotypic consequences to genetic variants. Deafness-implicated variants are curated in the Deafness Variation Database (DVD) after classification by a genetic hearing loss expert panel and a thorough informatics pipeline. However, 70% of the 128,167 missense variants in the DVD are "variants of uncertain significance" (VUS) due to insufficient evidence for classification. Here, we use the deep learning protein structure prediction algorithm AlphaFold2 to curate structures for all DVD genes. We refine these structures with global optimization and the AMOEBA force field and use DDGun3D to predict folding free energy differences (∆∆G_Fold) for all DVD missense variants. We find that 5,772 VUSs have a large, destabilizing ∆∆G_Fold that is consistent with pathogenic variants. When also filtered for CADD scores (> 25.7), we determine that 3,456 VUSs are likely pathogenic with a probability of 99.0%. These VUSs affect 119 patients (~3% of cases) sequenced by the OtoSCOPE targeted panel. Approximately half of these patients previously received an inconclusive report, and reclassification of these VUSs as pathogenic provides a new genetic diagnosis for six patients.