ABSTRACT
CHASERR encodes a human long noncoding RNA (lncRNA) adjacent to CHD2, a coding gene in which de novo loss-of-function variants cause developmental and epileptic encephalopathy. Here, we report our findings in three unrelated children with a syndromic, early-onset neurodevelopmental disorder, each of whom had a de novo deletion in the CHASERR locus. The children had severe encephalopathy, shared facial dysmorphisms, cortical atrophy, and cerebral hypomyelination - a phenotype that is distinct from the phenotypes of patients with CHD2 haploinsufficiency. We found that the CHASERR deletion results in increased CHD2 protein abundance in patient-derived cell lines and increased expression of the CHD2 transcript in cis. These findings indicate that CHD2 has bidirectional dosage sensitivity in human disease, and we recommend that other lncRNA-encoding genes be evaluated, particularly those upstream of genes associated with mendelian disorders. (Funded by the National Human Genome Research Institute and others.).
Subject(s)
Neurodevelopmental Disorders, Long Noncoding RNA, Preschool Child, Female, Humans, Infant, Male, Brain/pathology, Brain/diagnostic imaging, Brain/metabolism, DNA-Binding Proteins/analysis, DNA-Binding Proteins/genetics, DNA-Binding Proteins/metabolism, Gene Deletion, Haploinsufficiency, Neurodevelopmental Disorders/diagnosis, Neurodevelopmental Disorders/genetics, Neurodevelopmental Disorders/pathology, Phenotype, Long Noncoding RNA/genetics, Sequence Deletion
ABSTRACT
BACKGROUND: Genetic variants that cause rare disorders may remain elusive even after expansive testing, such as exome sequencing. The diagnostic yield of genome sequencing, particularly after a negative evaluation, remains poorly defined. METHODS: We sequenced and analyzed the genomes of families with diverse phenotypes who were suspected to have a rare monogenic disease and for whom genetic testing had not revealed a diagnosis, as well as the genomes of a replication cohort at an independent clinical center. RESULTS: We sequenced the genomes of 822 families (744 in the initial cohort and 78 in the replication cohort) and made a molecular diagnosis in 218 of 744 families (29.3%). Of the 218 families, 61 (28.0%) - 8.2% of families in the initial cohort - had variants that required genome sequencing for identification, including coding variants, intronic variants, small structural variants, copy-neutral inversions, complex rearrangements, and tandem repeat expansions. Most families in which a molecular diagnosis was made after previous nondiagnostic exome sequencing (63.5%) had variants that could be detected by reanalysis of the exome-sequence data (53.4%) or by applying additional analytic methods, such as copy-number variant calling, to exome-sequence data (10.8%). We obtained similar results in the replication cohort: in 33% of the families in which a molecular diagnosis was made, or 8% of the cohort, genome sequencing was required, which showed the applicability of these findings to both research and clinical environments. CONCLUSIONS: The diagnostic yield of genome sequencing in a large, diverse research cohort and in a small clinical cohort of persons who had previously undergone genetic testing was approximately 8% and included several types of pathogenic variation that had not previously been detected by means of exome sequencing or other techniques. (Funded by the National Human Genome Research Institute and others.)
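The percentages reported for the initial cohort follow directly from the family counts given in the abstract. The short Python check below is only an illustrative sketch: the helper name `pct` and the variable names are hypothetical, and the counts are taken from the text above.

```python
def pct(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)

# Counts quoted in the abstract for the initial cohort.
initial_families = 744
diagnosed = 218          # families with a molecular diagnosis
genome_required = 61     # diagnoses that required genome sequencing

yield_overall = pct(diagnosed, initial_families)            # diagnosed / cohort
share_genome_only = pct(genome_required, diagnosed)         # genome-only / diagnosed
yield_genome_only = pct(genome_required, initial_families)  # genome-only / cohort

print(yield_overall, share_genome_only, yield_genome_only)
```

The three values correspond to the 29.3%, 28.0%, and 8.2% figures in the RESULTS section.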
Subject(s)
Genetic Variation, Rare Diseases, Whole Genome Sequencing, Female, Humans, Male, Cohort Studies, Exome, Exome Sequencing, Inborn Genetic Diseases/diagnosis, Inborn Genetic Diseases/ethnology, Inborn Genetic Diseases/genetics, Genetic Testing, Human Genome, Phenotype, Rare Diseases/diagnosis, Rare Diseases/ethnology, Rare Diseases/genetics, DNA Sequence Analysis, Child, Adolescent, Young Adult, Adult
ABSTRACT
Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.
Subject(s)
Autism Spectrum Disorder, Female, Pregnancy, Humans, Autism Spectrum Disorder/diagnosis, Autism Spectrum Disorder/genetics, First Pregnancy Trimester, Prenatal Ultrasonography, Chromosome Mapping, Exome
ABSTRACT
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes1. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
Subject(s)
Exome/genetics, Essential Genes/genetics, Genetic Variation/genetics, Human Genome/genetics, Adult, Brain/metabolism, Cardiovascular Diseases/genetics, Cohort Studies, Genetic Databases, Female, Genetic Predisposition to Disease/genetics, Genome-Wide Association Study, Humans, Loss of Function Mutation/genetics, Male, Mutation Rate, Proprotein Convertase 9/genetics, Messenger RNA/genetics, Reproducibility of Results, Exome Sequencing, Whole Genome Sequencing
ABSTRACT
BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance were highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria are needed.
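The maximum F-measure named in METHODS can be illustrated with a small Python sketch. This is a generic F1-over-thresholds calculation, not the challenge's actual scoring code: the function name `max_f_measure`, the input layout, and the toy variant identifiers are all assumptions made for illustration, and the abstract does not specify how ties or per-family aggregation were handled.

```python
def max_f_measure(predictions, causal):
    """Best F-measure over all score thresholds.

    predictions: list of (variant_id, epcr) pairs (hypothetical layout).
    causal: set of variant ids known to be causal.
    """
    best = 0.0
    # Try every distinct EPCR value as a cut-off for calling a variant causal.
    thresholds = sorted({score for _, score in predictions}, reverse=True)
    for threshold in thresholds:
        called = {vid for vid, score in predictions if score >= threshold}
        true_positives = len(called & causal)
        if true_positives == 0:
            continue  # F-measure is zero (or undefined) at this threshold
        precision = true_positives / len(called)
        recall = true_positives / len(causal)
        f_measure = 2 * precision * recall / (precision + recall)
        best = max(best, f_measure)
    return best

# Toy data: four predicted variants, two of which are truly causal.
toy_predictions = [("v1", 0.9), ("v2", 0.8), ("v3", 0.4), ("v4", 0.1)]
toy_causal = {"v1", "v3"}
toy_result = max_f_measure(toy_predictions, toy_causal)
print(toy_result)
```

In the toy data the best trade-off occurs at the 0.4 cut-off, where both causal variants are recovered along with one false positive.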
Subject(s)
Rare Diseases, Humans, Rare Diseases/genetics, Rare Diseases/diagnosis, Human Genome/genetics, Genetic Variation/genetics, Computational Biology/methods, Phenotype
ABSTRACT
Virtually all genome sequencing efforts in national biobanks, complex and Mendelian disease programs, and medical genetic initiatives are reliant upon short-read whole-genome sequencing (srWGS), which presents challenges for the detection of structural variants (SVs) relative to emerging long-read WGS (lrWGS) technologies. Given this ubiquity of srWGS in large-scale genomics initiatives, we sought to establish expectations for routine SV detection from this data type by comparison with lrWGS assembly, as well as to quantify the genomic properties and added value of SVs uniquely accessible to each technology. Analyses from the Human Genome Structural Variation Consortium (HGSVC) of three families captured ~11,000 SVs per genome from srWGS and ~25,000 SVs per genome from lrWGS assembly. Detection power and precision for SV discovery varied dramatically by genomic context and variant class: 9.7% of the current GRCh38 reference is defined by segmental duplication (SD) and simple repeat (SR), yet 91.4% of deletions that were specifically discovered by lrWGS localized to these regions. Across the remaining 90.3% of reference sequence, we observed extremely high (93.8%) concordance between technologies for deletions in these datasets. In contrast, lrWGS was superior for detection of insertions across all genomic contexts. Given that non-SD/SR sequences encompass 95.9% of currently annotated disease-associated exons, improved sensitivity from lrWGS to discover novel pathogenic deletions in these currently interpretable genomic regions is likely to be incremental. However, these analyses highlight the considerable added value of assembly-based lrWGS to create new catalogs of insertions and transposable elements, as well as disease-associated repeat expansions in genomic sequences that were previously recalcitrant to routine assessment.
Subject(s)
Human Genome/genetics, Genomic Structural Variation, Genomics/methods, Goals, Whole Genome Sequencing/methods, Whole Genome Sequencing/standards, DNA Copy Number Variations, Exons/genetics, Humans, Research Design, Genomic Segmental Duplications, Sequence Alignment
ABSTRACT
JAG2 encodes the Notch ligand Jagged2. The conserved Notch signaling pathway contributes to the development and homeostasis of multiple tissues, including skeletal muscle. We studied an international cohort of 23 individuals with genetically unsolved muscular dystrophy from 13 unrelated families. Whole-exome sequencing identified rare homozygous or compound heterozygous JAG2 variants in all 13 families. The identified bi-allelic variants include 10 missense variants that disrupt highly conserved amino acids, a nonsense variant, two frameshift variants, an in-frame deletion, and a microdeletion encompassing JAG2. Onset of muscle weakness occurred from infancy to young adulthood. Serum creatine kinase (CK) levels were normal or mildly elevated. Muscle histology was primarily dystrophic. MRI of the lower extremities revealed a distinct, slightly asymmetric pattern of muscle involvement with cores of preserved and affected muscles in quadriceps and tibialis anterior, in some cases resembling patterns seen in POGLUT1-associated muscular dystrophy. Transcriptome analysis of muscle tissue from two participants suggested misregulation of genes involved in myogenesis, including PAX7. In complementary studies, Jag2 downregulation in murine myoblasts led to downregulation of multiple components of the Notch pathway, including Megf10. Investigations in Drosophila suggested an interaction between Serrate and Drpr, the fly orthologs of JAG1/JAG2 and MEGF10, respectively. In silico analysis predicted that many Jagged2 missense variants are associated with structural changes and protein misfolding. In summary, we describe a muscular dystrophy associated with pathogenic variants in JAG2 and evidence suggests a disease mechanism related to Notch pathway dysfunction.
Subject(s)
Jagged-2 Protein/genetics, Muscular Dystrophies/genetics, Adolescent, Adult, Amino Acid Sequence, Animals, Cell Line, Child, Preschool Child, Drosophila Proteins/genetics, Drosophila melanogaster/genetics, Female, Glucosyltransferases/genetics, Haplotypes/genetics, Humans, Jagged-1 Protein/genetics, Jagged-2 Protein/chemistry, Jagged-2 Protein/deficiency, Jagged-2 Protein/metabolism, Male, Membrane Proteins/genetics, Mice, Middle Aged, Molecular Models, Muscles/metabolism, Muscles/pathology, Muscular Dystrophies/pathology, Myoblasts/metabolism, Myoblasts/pathology, Pedigree, Phenotype, Notch Receptors/metabolism, Signal Transduction, Exome Sequencing, Young Adult
ABSTRACT
PURPOSE: To identify genetic etiologies and genotype/phenotype associations for unsolved ocular congenital cranial dysinnervation disorders (oCCDDs). METHODS: We coupled phenotyping with exome or genome sequencing of 467 probands (550 affected and 1108 total individuals) with genetically unsolved oCCDDs, integrating analyses of pedigrees, human and animal model phenotypes, and de novo variants to identify rare candidate single nucleotide variants, insertion/deletions, and structural variants disrupting protein-coding regions. Prioritized variants were classified for pathogenicity and evaluated for genotype/phenotype correlations. RESULTS: Analyses elucidated phenotypic subgroups, identified pathogenic/likely pathogenic variant(s) in 43/467 probands (9.2%), and prioritized variants of uncertain significance in 70/467 additional probands (15.0%). These included known and novel variants in established oCCDD genes, genes associated with syndromes that sometimes include oCCDDs (e.g., MYH10, KIF21B, TGFBR2, TUBB6), genes that fit the syndromic component of the phenotype but had no prior oCCDD association (e.g., CDK13, TGFB2), genes with no reported association with oCCDDs or the syndromic phenotypes (e.g., TUBA4A, KIF5C, CTNNA1, KLB, FGF21), and genes associated with oCCDD phenocopies that had resulted in misdiagnoses. CONCLUSION: This study suggests that unsolved oCCDDs are clinically and genetically heterogeneous disorders often overlapping other Mendelian conditions and nominates many candidates for future replication and functional studies.
ABSTRACT
PURPOSE: Genome sequencing (GS)-specific diagnostic rates in prospective tightly ascertained exome sequencing (ES)-negative intellectual disability (ID) cohorts have not been reported extensively. METHODS: ES, GS, epigenetic signatures, and long-read sequencing diagnoses were assessed in 74 trios with at least moderate ID. RESULTS: The ES diagnostic yield was 42 of 74 (57%). GS diagnoses were made in 9 of 32 (28%) ES-unresolved families. Repeated ES with a contemporary pipeline on the GS-diagnosed families identified 8 of 9 single-nucleotide variations/copy-number variations undetected in older ES, confirming a GS-unique diagnostic rate of 1 in 32 (3%). Episignatures contributed diagnostic information in 9%, with GS corroboration in 1 of 32 (3%) and diagnostic clues in 2 of 32 (6%). A genetic etiology for ID was detected in 51 of 74 (69%) families. Twelve candidate disease genes were identified. Contemporary ES followed by GS cost US$4976 (95% CI: $3704; $6969) per diagnosis, and first-line GS cost $7062 (95% CI: $6210; $8475) per diagnosis. CONCLUSION: Performing GS only in ID trios would be cost equivalent to ES if GS were available at $2435, about a 60% reduction from current prices. This study demonstrates that first-line GS achieves a higher diagnostic rate than contemporary ES but at a higher cost.
Subject(s)
Exome Sequencing, Exome, Intellectual Disability, Humans, Intellectual Disability/genetics, Intellectual Disability/diagnosis, Male, Female, Exome/genetics, Exome Sequencing/economics, Cohort Studies, Genetic Testing/economics, Genetic Testing/methods, Whole Genome Sequencing/economics, Child, Human Genome/genetics, DNA Copy Number Variations/genetics, Single Nucleotide Polymorphism/genetics, Preschool Child
ABSTRACT
CAG repeat expansions in exon 1 of the AR gene on the X chromosome cause spinal and bulbar muscular atrophy, a male-specific progressive neuromuscular disorder associated with a variety of extra-neurological symptoms. The disease has a reported male prevalence of approximately 1:30 000 or less, but the AR repeat expansion frequency is unknown. We established a pipeline, which combines the use of the ExpansionHunter tool and visual validation, to detect AR CAG expansion on whole-genome sequencing data, benchmarked it to fragment PCR sizing, and applied it to 74 277 unrelated individuals from four large cohorts. Our pipeline showed sensitivity of 100% [95% confidence interval (CI) 90.8-100%], specificity of 99% (95% CI 94.2-99.7%), and a positive predictive value of 97.4% (95% CI 84.4-99.6%). We found the mutation frequency to be 1:3182 X chromosomes (95% CI 1:2309-1:4386, n = 117 734), 10 times higher than the reported disease prevalence. Modelling using the novel mutation frequency led to an estimated disease prevalence of 1:6887 males, more than four times higher than the reported disease prevalence. This discrepancy is possibly due to underdiagnosis of this neuromuscular condition, reduced penetrance, and/or pleomorphic clinical manifestations.
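The reported mutation frequency and its confidence interval can be sanity-checked with a short Python sketch. Two assumptions are made here and are not stated in the abstract: the expansion count (37) is inferred by dividing the 117 734 X chromosomes by 3182, and a Wilson score interval is used as one plausible method for the 95% CI. The function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z defaults to 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

n_chromosomes = 117_734
# ASSUMPTION: carrier count inferred from the reported 1:3182 frequency.
carriers = round(n_chromosomes / 3182)

low, high = wilson_interval(carriers, n_chromosomes)

# Express the bounds in the abstract's "1 in N" form.
print(f"carriers = {carriers}")
print(f"frequency = 1:{n_chromosomes / carriers:.0f}")
print(f"95% CI = 1:{1 / high:.0f} to 1:{1 / low:.0f}")
```

Under these assumptions the interval comes out close to the reported 1:2309-1:4386, which suggests, but does not confirm, that a comparable method was used.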
Subject(s)
Spinal Muscular Atrophy, Androgen Receptors, Humans, Male, Androgen Receptors/genetics, Spinal Muscular Atrophy/genetics, Muscular Atrophy, Polymerase Chain Reaction, Trinucleotide Repeat Expansion/genetics
ABSTRACT
Exome and genome sequencing have become the tools of choice for rare disease diagnosis, leading to large amounts of data available for analyses. To identify causal variants in these datasets, powerful filtering and decision support tools that can be efficiently used by clinicians and researchers are required. To address this need, we developed seqr - an open-source, web-based tool for family-based monogenic disease analysis that allows researchers to work collaboratively to search and annotate genomic callsets. To date, seqr is being used in several research pipelines and one clinical diagnostic lab. In our own experience through the Broad Institute Center for Mendelian Genomics, seqr has enabled analyses of over 10,000 families, supporting the diagnosis of more than 3,800 individuals with rare disease and discovery of over 300 novel disease genes. Here, we describe a framework for genomic analysis in rare disease that leverages seqr's capabilities for variant filtration, annotation, and causal variant identification, as well as support for research collaboration and data sharing. The seqr platform is available as open source software, allowing low-cost participation in rare disease research, and a community effort to support diagnosis and gene discovery in rare disease.
Subject(s)
Genomics, Rare Diseases, Exome, Humans, Internet, Rare Diseases/diagnosis, Rare Diseases/genetics, Software
ABSTRACT
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
Subject(s)
Exome/genetics, Genetic Variation/genetics, DNA Mutational Analysis, Datasets as Topic, Humans, Phenotype, Proteome/genetics, Rare Diseases/genetics, Sample Size
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pgen.1007329.].
ABSTRACT
Cerebellar ataxia, neuropathy and vestibular areflexia syndrome (CANVAS) is a progressive late-onset, neurological disease. Recently, a pentanucleotide expansion in intron 2 of RFC1 was identified as the genetic cause of CANVAS. We screened an Asian-Pacific cohort for CANVAS and identified a novel RFC1 repeat expansion motif, (ACAGG)exp, in three affected individuals. This motif was associated with additional clinical features including fasciculations and elevated serum creatine kinase. These features have not previously been described in individuals with genetically-confirmed CANVAS. Haplotype analysis showed our patients shared the same core haplotype as previously published, supporting the possibility of a single origin of the RFC1 disease allele. We analysed data from >26 000 genetically diverse individuals in gnomAD to show enrichment of (ACAGG) in non-European populations.
Subject(s)
Asian People/genetics, Bilateral Vestibulopathy/genetics, Cerebellar Ataxia/genetics, DNA Repeat Expansion/genetics, Replication Protein C/genetics, Aged, Bilateral Vestibulopathy/complications, Bilateral Vestibulopathy/diagnosis, Cerebellar Ataxia/complications, Cerebellar Ataxia/diagnosis, Cohort Studies, Female, Humans, Indonesia, Male, Middle Aged, Pedigree
ABSTRACT
As part of a broader collaborative network of exome sequencing studies, we developed a jointly called data set of 5,685 Ashkenazi Jewish (AJ) exomes. We make publicly available a resource of site and allele frequencies, which should serve as a reference for medical genetics in the Ashkenazim (hosted in part at https://ibd.broadinstitute.org, also available in gnomAD at http://gnomad.broadinstitute.org). We estimate that 34% of protein-coding alleles present in the Ashkenazi Jewish population at frequencies greater than 0.2% are significantly more frequent (mean 15-fold) than their maximum frequency observed in other reference populations. Arising via a well-described founder effect approximately 30 generations ago, this catalog of enriched alleles can contribute to differences in genetic risk and overall prevalence of diseases between populations. As validation we document 148 AJ-enriched protein-altering alleles that overlap with "pathogenic" ClinVar alleles (table available at https://github.com/macarthur-lab/clinvar/blob/master/output/clinvar.tsv), including those that account for 10-100 fold differences in prevalence between AJ and non-AJ populations of some rare diseases, especially recessive conditions, including Gaucher disease (GBA, p.Asn409Ser, 8-fold enrichment); Canavan disease (ASPA, p.Glu285Ala, 12-fold enrichment); and Tay-Sachs disease (HEXA, c.1421+1G>C, 27-fold enrichment; p.Tyr427IlefsTer5, 12-fold enrichment). We next sought to use this catalog, of well-established relevance to Mendelian disease, to explore Crohn's disease (CD), a common disease with an estimated two- to four-fold excess prevalence in AJ. We specifically evaluate whether strong-acting rare alleles, particularly protein-truncating or otherwise large effect-size alleles, enriched by the same founder effect, contribute excess genetic risk to CD in AJ, and find that ten rare genetic risk factors in NOD2 and LRRK2, including several novel contributing alleles, are enriched in AJ (p < 0.005) and show evidence of association with CD. Independently, we find that genome-wide common variant risk defined by GWAS shows a strong difference between AJ and non-AJ European control population samples (0.97 s.d. higher, p < 10^-16). Taken together, the results suggest coordinated selection in the AJ population for higher CD risk alleles in general. The results and approach illustrate the value of exome sequencing data in case-control studies along with reference data sets like ExAC (sites VCF available via FTP at ftp.broadinstitute.org/pub/ExAC_release/release0.3/) to pinpoint genetic variation that contributes to variable disease predisposition across populations.
Subject(s)
Crohn Disease/genetics, Genetic Predisposition to Disease/genetics, Jews/genetics, Rare Diseases/genetics, Algorithms, Crohn Disease/epidemiology, Population Genetics, Genome-Wide Association Study, Haplotypes, Humans, Genetic Models, Molecular Epidemiology, Single Nucleotide Polymorphism, Rare Diseases/epidemiology
ABSTRACT
We present eight families with arthrogryposis multiplex congenita and myopathy bearing a TTN intron 213 extended splice-site variant (NM_001267550.1:c.39974-11T>G), inherited in trans with a second pathogenic TTN variant. Muscle-derived RNA studies of three individuals confirmed mis-splicing induced by the c.39974-11T>G variant: in-frame exon 214 skipping or use of a cryptic 3' splice-site effecting a frameshift. Confounding interpretation of pathogenicity is the absence of exons 213-217 within the described skeletal muscle TTN N2A isoform. However, RNA-sequencing from 365 adult human gastrocnemius samples revealed that 56% of specimens predominantly include exons 213-217 in TTN transcripts (inclusion rate ≥66%). Further, RNA-sequencing of five fetal muscle samples confirmed that 4/5 specimens predominantly include exons 213-217 (fifth sample inclusion rate 57%). Contractures improved significantly with age for four individuals, which may be linked to decreased expression of pathogenic fetal transcripts. Our study extends emerging evidence supporting a vital developmental role for TTN isoforms containing metatranscript-only exons.
Subject(s)
Alternative Splicing, Arthrogryposis/diagnosis, Arthrogryposis/genetics, Connectin/genetics, Recessive Genes, Genetic Predisposition to Disease, Muscular Diseases/diagnosis, Muscular Diseases/genetics, Child, Preschool Child, Female, Genetic Association Studies, Humans, Infant, Male, Mutation, Pedigree, Phenotype, Radiography
ABSTRACT
MOTIVATION: The correct classification of missense variants as benign or pathogenic remains challenging. Pathogenic variants are expected to have higher deleterious prediction scores than benign variants in the same gene. However, most existing variant annotation tools do not reference the score range of benign population variants at the gene level. RESULTS: We present a web application, Variant Score Ranker, which enables users to rapidly annotate variants and perform gene-specific variant score ranking at the population level. We also provide an intuitive example of how gene- and population-calibrated variant ranking scores can improve epilepsy variant prioritization. AVAILABILITY AND IMPLEMENTATION: http://vsranker.broadinstitute.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
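The idea of ranking a variant's score against population variants in the same gene can be sketched as a percentile calculation. The Python below is a hypothetical illustration only: the abstract does not describe the tool's actual ranking method, and the function name `gene_percentile` and the example scores are invented for this sketch.

```python
from bisect import bisect_right

def gene_percentile(score, population_scores):
    """Percentage of same-gene population variant scores at or below `score`.

    ASSUMPTION: a plain empirical percentile; the real tool may differ.
    """
    ordered = sorted(population_scores)
    if not ordered:
        raise ValueError("no population scores available for this gene")
    # bisect_right counts how many population scores are <= the query score.
    return 100.0 * bisect_right(ordered, score) / len(ordered)

# Invented deleteriousness scores for population variants in one gene.
example_population = [10, 12, 15, 20, 30]
example_rank = gene_percentile(25.0, example_population)
print(example_rank)
```

A candidate variant whose score sits near the top of its gene's population distribution would stand out more than the raw score alone suggests, which is the intuition behind gene-calibrated ranking.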