Results 1 - 20 of 40
1.
medRxiv ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38585811

ABSTRACT

Purpose: To identify genetic etiologies and genotype/phenotype associations for unsolved ocular congenital cranial dysinnervation disorders (oCCDDs). Methods: We coupled phenotyping with exome or genome sequencing of 467 pedigrees with genetically unsolved oCCDDs, integrating analyses of pedigrees, human and animal model phenotypes, and de novo variants to identify rare candidate single nucleotide variants, insertion/deletions, and structural variants disrupting protein-coding regions. Prioritized variants were classified for pathogenicity and evaluated for genotype/phenotype correlations. Results: Analyses elucidated phenotypic subgroups, identified pathogenic/likely pathogenic variant(s) in 43/467 probands (9.2%), and prioritized variants of uncertain significance in 70/467 additional probands (15.0%). These included known and novel variants in established oCCDD genes, genes associated with syndromes that sometimes include oCCDDs (e.g., MYH10, KIF21B, TGFBR2, TUBB6), genes that fit the syndromic component of the phenotype but had no prior oCCDD association (e.g., CDK13, TGFB2), genes with no reported association with oCCDDs or the syndromic phenotypes (e.g., TUBA4A, KIF5C, CTNNA1, KLB, FGF21), and genes associated with oCCDD phenocopies that had resulted in misdiagnoses. Conclusion: This study suggests that unsolved oCCDDs are clinically and genetically heterogeneous disorders often overlapping other Mendelian conditions and nominates many candidates for future replication and functional studies.

2.
Hum Genomics ; 18(1): 44, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685113

ABSTRACT

BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Knowing which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. 
Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance were highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria are needed.
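The maximum F-measure described in the methods can be illustrated with a small sketch. This is not the challenge's scoring code: it assumes the F-measure is the usual F1 (harmonic mean of precision and recall), and the function name, the input layout, and the toy EPCR values are all hypothetical.

```python
from typing import List, Tuple


def max_f_measure(predictions: List[Tuple[float, bool]],
                  n_causal: int) -> Tuple[float, float]:
    """Return (best F1, EPCR threshold) over every distinct EPCR threshold.

    predictions: (epcr, is_causal) pairs, one per submitted variant.
    n_causal: total number of true causal variants, used for recall.
    Assumption: F-measure means F1; the abstract does not give the formula.
    """
    best_f, best_t = 0.0, 1.0
    # Try each observed EPCR value as the cut-off, from strict to lenient.
    for t in sorted({epcr for epcr, _ in predictions}, reverse=True):
        called = [is_causal for epcr, is_causal in predictions if epcr >= t]
        tp = sum(called)
        if tp == 0:
            continue  # F1 is zero when no causal variant is recovered
        precision = tp / len(called)
        recall = tp / n_causal
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f:
            best_f, best_t = f1, t
    return best_f, best_t


# Hypothetical submission: five ranked variants, three of them causal.
toy = [(0.95, True), (0.90, False), (0.80, True), (0.40, False), (0.10, True)]
best_f, best_t = max_f_measure(toy, n_causal=3)
print(f"max F1 = {best_f:.2f} at EPCR >= {best_t}")
```

Sweeping every threshold rather than fixing one means a model is not penalized for how it scales its EPCR values, only for how well it ranks causal variants above the rest.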


Subjects
Rare Diseases , Humans , Rare Diseases/genetics , Rare Diseases/diagnosis , Genome, Human/genetics , Genetic Variation/genetics , Computational Biology/methods , Phenotype
3.
bioRxiv ; 2024 May 03.
Article in English | MEDLINE | ID: mdl-38645134

ABSTRACT

Missense variants can have a range of functional impacts depending on factors such as the specific amino acid substitution and location within the gene. To interpret their deleteriousness, studies have sought to identify regions within genes that are specifically intolerant of missense variation [1-12]. Here, we leverage the patterns of rare missense variation in 125,748 individuals in the Genome Aggregation Database (gnomAD) [13] against a null mutational model to identify transcripts that display regional differences in missense constraint. Missense-depleted regions are enriched for ClinVar [14] pathogenic variants, de novo missense variants from individuals with neurodevelopmental disorders (NDDs) [15,16], and complex trait heritability. Following ClinGen calibration recommendations for the ACMG/AMP guidelines, we establish that regions with less than 20% of their expected missense variation achieve moderate support for pathogenicity. We create a missense deleteriousness metric (MPC) that incorporates regional constraint and outperforms other deleteriousness scores at stratifying case and control de novo missense variation, with a strong enrichment in NDDs. These results provide additional tools to aid in missense variant interpretation.

4.
Ann Clin Transl Neurol ; 11(5): 1250-1266, 2024 May.
Article in English | MEDLINE | ID: mdl-38544359

ABSTRACT

OBJECTIVE: Most families with heritable neuromuscular disorders do not receive a molecular diagnosis. Here we evaluate diagnostic utility of exome, genome, RNA sequencing, and protein studies and provide evidence-based recommendations for their integration into practice. METHODS: In total, 247 families with suspected monogenic neuromuscular disorders who remained without a genetic diagnosis after standard diagnostic investigations underwent research-led massively parallel sequencing: neuromuscular disorder gene panel, exome, genome, and/or RNA sequencing to identify causal variants. Protein and RNA studies were also deployed when required. RESULTS: Integration of exome sequencing and auxiliary genome, RNA and/or protein studies identified causal or likely causal variants in 62% (152 out of 247) of families. Exome sequencing alone informed 55% (83 out of 152) of diagnoses, with remaining diagnoses (45%; 69 out of 152) requiring genome sequencing, RNA and/or protein studies to identify variants and/or support pathogenicity. Arrestingly, novel disease genes accounted for <4% (6 out of 152) of diagnoses while 36.2% of solved families (55 out of 152) harbored at least one splice-altering or structural variant in a known neuromuscular disorder gene. We posit that contemporary neuromuscular disorder gene-panel sequencing could likely provide 66% (100 out of 152) of our diagnoses today. INTERPRETATION: Our results emphasize thorough clinical phenotyping to enable deep scrutiny of all rare genetic variation in phenotypically consistent genes. Post-exome auxiliary investigations extended our diagnostic yield by 81% overall (from 34% to 62%). We present a diagnostic algorithm that details how to deploy genomic and auxiliary investigations to obtain these diagnoses most effectively today. We hope this provides a practical guide for clinicians as they gain greater access to clinical genome and transcriptome sequencing.


Subjects
Exome Sequencing , Neuromuscular Diseases , Humans , Neuromuscular Diseases/genetics , Neuromuscular Diseases/diagnosis , Male , Female , Adult , Sequence Analysis, RNA/methods , Child , Adolescent , Exome/genetics , Middle Aged , Young Adult , Child, Preschool , High-Throughput Nucleotide Sequencing , Infant , Genetic Testing/methods
5.
medRxiv ; 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38496558

ABSTRACT

Genes encoding long non-coding RNAs (lncRNAs) comprise a large fraction of the human genome, yet haploinsufficiency of a lncRNA has not been shown to cause a Mendelian disease. CHASERR is a highly conserved human lncRNA adjacent to CHD2, a coding gene in which de novo loss-of-function variants cause developmental and epileptic encephalopathy. Here we report three unrelated individuals each harboring an ultra-rare heterozygous de novo deletion in the CHASERR locus. We report similarities in severe developmental delay, facial dysmorphisms, and cerebral dysmyelination in these individuals, distinguishing them from the phenotypic spectrum of CHD2 haploinsufficiency. We demonstrate reduced CHASERR mRNA expression and corresponding increased CHD2 mRNA and protein in whole blood and patient-derived cell lines, specifically increased expression of the CHD2 allele in cis with the CHASERR deletion, as predicted from a prior mouse model of Chaserr haploinsufficiency. We show for the first time that de novo structural variants facilitated by Alu-mediated non-allelic homologous recombination led to deletion of a non-coding element (the lncRNA CHASERR) to cause a rare syndromic neurodevelopmental disorder. We also demonstrate that CHD2 has bidirectional dosage sensitivity in human disease. This work highlights the need to carefully evaluate other lncRNAs, particularly those upstream of genes associated with Mendelian disorders.

6.
medRxiv ; 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38405995

ABSTRACT

Spinal muscular atrophy (SMA) is a genetic disorder that causes progressive degeneration of lower motor neurons and the subsequent loss of muscle function throughout the body. It is the second most common recessive disorder in individuals of European descent and is present in all populations. Accurate tools exist for diagnosing SMA from short read and long read genome sequencing data. However, there are no publicly available tools for GRCh38-aligned data from panel or exome sequencing assays which continue to be used as first line tests for neuromuscular disorders. We therefore developed and extensively validated a new tool - SMA Finder - that can diagnose SMA not only in genome, but also exome and targeted sequencing samples aligned to GRCh37, GRCh38, or T2T-CHM13. It works by evaluating aligned reads that overlap the c.840 position of SMN1 and SMN2 in order to detect the most common molecular causes of SMA. We applied SMA Finder to 16,626 exomes and 3,911 genomes from heterogeneous rare disease cohorts sequenced at the Broad Institute Center for Mendelian Genomics as well as 1,157 exomes and 8,762 targeted sequencing samples from Tartu University Hospital. SMA Finder correctly identified all 16 known SMA cases and reported nine novel diagnoses which have since been confirmed by clinical testing, with another four novel diagnoses undergoing validation. Notably, out of the 29 total SMA positive cases, 21 had an initial clinical diagnosis of muscular dystrophy, congenital myasthenic syndrome, or congenital myopathy. This underscored the frequency with which SMA can be misdiagnosed as other neuromuscular disorders and confirmed the utility of using SMA Finder to reanalyze phenotypically diverse neuromuscular disease cohorts. 
Finally, we evaluated SMA Finder on 198,868 individuals that had both exome and genome sequencing data within the UK Biobank (UKBB) and found that SMA Finder's overall false positive rate was less than 1 / 200,000 exome samples, and its positive predictive value (PPV) was 96%. We also observed 100% concordance between UKBB exome and genome calls. This analysis showed that, even though it is located within a segmental duplication, the most common causal variant for SMA can be detected with comparable accuracy to monogenic disease variants in non-repetitive regions. Additionally, the high PPV demonstrated by SMA Finder, the existence of treatment options for SMA in which early diagnosis is imperative for therapeutic benefit, as well as widespread availability of clinical confirmatory testing for SMA, may warrant the addition of SMN1 to the ACMG list of genes with reportable secondary findings after genome and exome sequencing.
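The read-based idea in this abstract (evaluating reads that overlap c.840 of SMN1 and SMN2) can be sketched as follows. This is an illustration only, not SMA Finder's implementation: it relies on the well-known difference that SMN1 carries C and SMN2 carries T at c.840, it takes the bases already extracted from the overlapping reads as input, and the function name and both thresholds are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Hypothetical thresholds chosen for illustration only.
MIN_READS = 14          # minimum informative reads before making a call
MAX_C_FRACTION = 0.05   # tolerated share of C reads (sequencing error)


def call_sma(bases_at_c840: Iterable[str]) -> str:
    """Classify a sample from the bases its reads show at c.840.

    C is the SMN1-type base and T is the SMN2-type base. A sample whose
    reads are essentially all T has no detectable SMN1 copy, which is the
    most common molecular cause of SMA.
    """
    counts = Counter(base.upper() for base in bases_at_c840)
    c_reads, t_reads = counts["C"], counts["T"]
    total = c_reads + t_reads
    if total < MIN_READS:
        return "not enough coverage"
    if c_reads / total <= MAX_C_FRACTION:
        return "possible SMA (no SMN1-type reads)"
    return "SMN1 present"


print(call_sma("T" * 30))             # every read shows the SMN2-type base
print(call_sma("C" * 10 + "T" * 20))  # mixed C and T reads
```

Because reads from SMN1 and SMN2 are pooled at the single position that distinguishes them, a sketch like this does not need to decide which paralog each read originally mapped to.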

7.
Genet Med ; 26(5): 101076, 2024 May.
Article in English | MEDLINE | ID: mdl-38258669

ABSTRACT

PURPOSE: Genome sequencing (GS)-specific diagnostic rates in prospective tightly ascertained exome sequencing (ES)-negative intellectual disability (ID) cohorts have not been reported extensively. METHODS: ES, GS, epigenetic signatures, and long-read sequencing diagnoses were assessed in 74 trios with at least moderate ID. RESULTS: The ES diagnostic yield was 42 of 74 (57%). GS diagnoses were made in 9 of 32 (28%) ES-unresolved families. Repeated ES with a contemporary pipeline on the GS-diagnosed families identified 8 of 9 single-nucleotide variations/copy-number variations undetected in older ES, confirming a GS-unique diagnostic rate of 1 in 32 (3%). Episignatures contributed diagnostic information in 9% with GS corroboration in 1 of 32 (3%) and diagnostic clues in 2 of 32 (6%). A genetic etiology for ID was detected in 51 of 74 (69%) families. Twelve candidate disease genes were identified. Contemporary ES followed by GS cost US$4976 (95% CI: $3704; $6969) per diagnosis, and first-line GS cost $7062 (95% CI: $6210; $8475) per diagnosis. CONCLUSION: Performing GS only in ID trios would be cost equivalent to ES if GS were available at $2435, about a 60% reduction from current prices. This study demonstrates that first-line GS achieves a higher diagnostic rate than contemporary ES but at a higher cost.


Subjects
Exome Sequencing , Exome , Intellectual Disability , Humans , Intellectual Disability/genetics , Intellectual Disability/diagnosis , Male , Female , Exome/genetics , Exome Sequencing/economics , Cohort Studies , Genetic Testing/economics , Genetic Testing/methods , Whole Genome Sequencing/economics , Child , Genome, Human/genetics , DNA Copy Number Variations/genetics , Polymorphism, Single Nucleotide/genetics , Child, Preschool
8.
medRxiv ; 2023 Aug 04.
Article in English | MEDLINE | ID: mdl-37577678

ABSTRACT

Background: A major obstacle faced by rare disease families is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years, and causal variants are identified in under 50%. The Rare Genomes Project (RGP) is a direct-to-participant research study on the utility of genome sequencing (GS) for diagnosis and gene discovery. Families are consented for sharing of sequence and phenotype data with researchers, allowing development of a Critical Assessment of Genome Interpretation (CAGI) community challenge, placing variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: Predictors were provided a dataset of phenotype terms and variant calls from GS of 175 RGP individuals (65 families), including 35 solved training set families, with causal variants specified, and 30 test set families (14 solved, 16 unsolved). The challenge tasked teams with identifying the causal variants in as many test set families as possible. Ranked variant predictions were submitted with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on rank position of true positive causal variants and maximum F-measure, based on precision and recall of causal variants across EPCR thresholds. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performing teams recalled the causal variants in up to 13 of 14 solved families by prioritizing high quality variant calls that were rare, predicted deleterious, segregating correctly, and consistent with reported phenotype. In unsolved families, newly discovered diagnostic variants were returned to two families following confirmatory RNA sequencing, and two prioritized novel disease gene candidates were entered into Matchmaker Exchange. 
In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant, in an unsolved proband with phenotype overlap with asparagine synthetase deficiency. Conclusions: By objective assessment of variant predictions, we provide insights into current state-of-the-art algorithms and platforms for genome sequencing analysis for rare disease diagnosis and explore areas for future optimization. Identification of diagnostic variants in unsolved families promotes synergy between researchers with clinical and computational expertise as a means of advancing the field of clinical genome interpretation.

9.
Brain Commun ; 5(4): fcad208, 2023.
Article in English | MEDLINE | ID: mdl-37621409

ABSTRACT

Cerebellar ataxia, neuropathy and vestibular areflexia syndrome is a progressive, generally late-onset, neurological disorder associated with biallelic pentanucleotide expansions in intron 2 of the RFC1 gene. The locus exhibits substantial genetic variability, with multiple pathogenic and benign pentanucleotide repeat alleles previously identified. To determine the contribution of pathogenic RFC1 expansions to neurological disease within an Australasian cohort and further investigate the heterogeneity exhibited at the locus, a combination of flanking and repeat-primed PCR was used to screen a cohort of 242 Australasian patients with neurological disease. Patients whose data indicated large gaps within expanded alleles following repeat-primed PCR underwent targeted long-read sequencing to identify novel repeat motifs at the locus. To increase diagnostic yield, additional probes at the RFC1 repeat region were incorporated into the PathWest diagnostic laboratory targeted neurological disease gene panel to enable first-pass screening of the locus for all samples tested on the panel. Within the Australasian cohort, we detected known pathogenic biallelic expansions in 15.3% (n = 37) of patients. Thirty indicated biallelic AAGGG expansions, two had biallelic 'Maori alleles' [(AAAGG)exp(AAGGG)exp], two samples were compound heterozygous for the Maori allele and an AAGGG expansion, two samples had biallelic ACAGG expansions and one sample was compound heterozygous for the ACAGG and AAGGG expansions. Forty-five samples tested indicated the presence of biallelic expansions not known to be pathogenic. A large proportion (84%) showed complex interrupted patterns following repeat-primed PCR, suggesting that these expansions are likely to comprise more than one repeat motif, including previously unknown repeats. Using targeted long-read sequencing, we identified three novel repeat motifs in expanded alleles. 
Here, we also show that short-read sequencing can be used to reliably screen for the presence or absence of biallelic RFC1 expansions in all samples tested using the PathWest targeted neurological disease gene panel. Our results show that RFC1 pathogenic expansions make a substantial contribution to neurological disease in the Australasian population and further extend the heterogeneity of the locus. To accommodate the increased complexity, we outline a multi-step workflow utilizing both targeted short- and long-read sequencing to achieve a definitive genotype and provide accurate diagnoses for patients.

10.
Am J Hum Genet ; 110(9): 1454-1469, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37595579

ABSTRACT

Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.


Subjects
Autism Spectrum Disorder , Female , Pregnancy , Humans , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Pregnancy Trimester, First , Ultrasonography, Prenatal , Chromosome Mapping , Exome
11.
bioRxiv ; 2023 May 08.
Article in English | MEDLINE | ID: mdl-37214979

ABSTRACT

Tools for genotyping tandem repeats (TRs) from short read sequencing data have improved significantly over the past decade. Extensive comparisons of these tools to gold standard diagnostic methods like RP-PCR have confirmed their accuracy for tens to hundreds of well-studied loci. However, a scarcity of high-quality orthogonal truth data limited our ability to measure tool accuracy for the millions of other loci throughout the genome. To address this, we developed a TR truth set based on the Synthetic Diploid Benchmark (SynDip). By identifying the subset of insertions and deletions that represent TR expansions or contractions with motifs between 2 and 50 base pairs, we obtained accurate genotypes for 139,795 pure and 6,845 interrupted repeats in a single diploid sample. Our approach did not require running existing genotyping tools on short read or long read sequencing data and provided an alternative, more accurate view of tandem repeat variation. We applied this truth set to compare the strengths and weaknesses of widely-used tools for genotyping TRs, evaluated the completeness of existing genome-wide TR catalogs, and explored the properties of tandem repeat variation throughout the genome. We found that, without filtering, ExpansionHunter had higher accuracy than GangSTR and HipSTR over a wide range of motifs and allele sizes. Also, when errors in allele size occurred, ExpansionHunter tended to overestimate expansion sizes, while GangSTR tended to underestimate them. Additionally, we saw that widely-used TR catalogs miss between 16% and 41% of variant loci in the truth set. These results suggest that genome-wide analyses would benefit from genotyping a larger set of loci as well as further tool development that builds on the strengths of current algorithms. 
To that end, we developed a new catalog of 2.8 million loci that captures 95% of variant loci in the truth set, and created a modified version of ExpansionHunter that runs 2 to 3x faster than the original while producing the same output.
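The filtering step described above, keeping only insertions and deletions that look like tandem repeat expansions or contractions with motifs of 2 to 50 base pairs, might be approximated as below. This is a simplified sketch and not the authors' pipeline: both function names are hypothetical, only pure (uninterrupted) repeats are handled, and the flank check compares against a single motif phase rather than all rotations.

```python
from typing import Optional


def pure_repeat_motif(seq: str, min_len: int = 2,
                      max_len: int = 50) -> Optional[str]:
    """Return the shortest motif (min_len..max_len bp) whose exact
    repetition spells out seq, or None. Homopolymer motifs are skipped.
    Note that a single copy of a motif also counts as a match here."""
    seq = seq.upper()
    for k in range(min_len, min(max_len, len(seq)) + 1):
        if len(seq) % k == 0 and seq[:k] * (len(seq) // k) == seq:
            motif = seq[:k]
            if len(set(motif)) > 1:  # reject motifs such as "AA"
                return motif
    return None


def is_tr_variant(indel_seq: str, ref_flank: str) -> bool:
    """Treat an indel as a repeat expansion/contraction when its sequence
    is a pure repeat AND the adjacent reference continues with the same
    motif. Simplification: the flank must start in the same motif phase."""
    motif = pure_repeat_motif(indel_seq)
    return motif is not None and ref_flank.upper().startswith(motif)


print(pure_repeat_motif("CAGCAGCAG"))        # CAG
print(is_tr_variant("CAG", "CAGCAGCAGTT"))   # True
print(is_tr_variant("ACGTT", "GGGGGGGG"))    # False
```

The flank check matters because any short sequence is trivially one copy of itself; requiring the reference to continue with the same motif is what separates a repeat-unit change from an ordinary indel.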

12.
Neurol Genet ; 9(2): e200064, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37090938

ABSTRACT

Objective: Duchenne muscular dystrophy (DMD) is caused by pathogenic variants in the dystrophin gene (DMD). Hypermethylated CGG expansions within DIP2B 5' UTR are associated with an intellectual development disorder. Here, we demonstrate the diagnostic utility of genomic short-read sequencing (SRS) and transcriptome sequencing to identify a novel DMD structural variant (SV) and a DIP2B CGG expansion in a patient with DMD for whom conventional diagnostic testing failed to yield a genetic diagnosis. Methods: We performed genomic SRS, skeletal muscle transcriptome sequencing, and targeted programmable long-read sequencing (LRS). Results: The proband had a typical DMD clinical presentation, autism spectrum disorder (ASD), and dystrophinopathy on muscle biopsy. Transcriptome analysis identified 6 aberrantly expressed genes; DMD and DIP2B were the strongest underexpression and overexpression outliers, respectively. Genomic SRS identified a 216 kb paracentric inversion (NC_000023.11: g.33162217-33378800) overlapping 2 DMD promoters. ExpansionHunter indicated an expansion of 109 CGG repeats within the 5' UTR of DIP2B. Targeted genomic LRS confirmed the SV and genotyped the DIP2B repeat expansion as 270 CGG repeats. Discussion: Here, transcriptome data heavily guided genomic analysis to resolve a complex DMD inversion and a DIP2B repeat expansion. Longitudinal follow-up will be important for clarifying the clinical significance of the DIP2B genotype.

13.
Brain ; 146(7): 2723-2729, 2023 07 03.
Article in English | MEDLINE | ID: mdl-36797998

ABSTRACT

CAG repeat expansions in exon 1 of the AR gene on the X chromosome cause spinal and bulbar muscular atrophy, a male-specific progressive neuromuscular disorder associated with a variety of extra-neurological symptoms. The disease has a reported male prevalence of approximately 1:30 000 or less, but the AR repeat expansion frequency is unknown. We established a pipeline, which combines the use of the ExpansionHunter tool and visual validation, to detect AR CAG expansions in whole-genome sequencing data, benchmarked it to fragment PCR sizing, and applied it to 74 277 unrelated individuals from four large cohorts. Our pipeline showed sensitivity of 100% [95% confidence interval (CI) 90.8-100%], specificity of 99% (95% CI 94.2-99.7%), and a positive predictive value of 97.4% (95% CI 84.4-99.6%). We found the mutation frequency to be 1:3182 (95% CI 1:2309-1:4386, n = 117 734) X chromosomes, 10 times more frequent than the reported disease prevalence. Modelling using the novel mutation frequency led to an estimated disease prevalence of 1:6887 males, more than four times the reported disease prevalence. This discrepancy is possibly due to underdiagnosis of this neuromuscular condition, reduced penetrance, and/or pleomorphic clinical manifestations.
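The reported mutation frequency and its confidence interval can be reproduced approximately with a short calculation. Two points here are assumptions rather than statements from the abstract: the carrier count of 37 is inferred from 117 734 / 3182, and the Wilson score interval is simply a method whose output happens to agree with the reported bounds; the abstract does not say how the interval was computed.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


x_chromosomes = 117_734
carriers = 37  # assumption: inferred from the 1:3182 frequency, not stated

low, high = wilson_interval(carriers, x_chromosomes)
print(f"frequency  1:{x_chromosomes / carriers:.0f}")
print(f"95% CI     1:{1 / high:.0f} to 1:{1 / low:.0f}")
```

A score-type interval is a natural fit for a proportion this small, because the simpler normal approximation becomes unreliable when the event is rare.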


Subjects
Muscular Atrophy, Spinal , Receptors, Androgen , Humans , Male , Receptors, Androgen/genetics , Muscular Atrophy, Spinal/genetics , Muscular Atrophy , Polymerase Chain Reaction , Trinucleotide Repeat Expansion/genetics
14.
medRxiv ; 2023 Aug 13.
Article in English | MEDLINE | ID: mdl-38328047

ABSTRACT

Background: Causal variants underlying rare disorders may remain elusive even after expansive gene panels or exome sequencing (ES). Clinicians and researchers may then turn to genome sequencing (GS), though the added value of this technique and its optimal use remain poorly defined. We therefore investigated the advantages of GS within a phenotypically diverse cohort. Methods: GS was performed for 744 individuals with rare disease who were genetically undiagnosed. Analysis included review of single nucleotide, indel, structural, and mitochondrial variants. Results: We successfully solved 218/744 (29.3%) cases using GS, with most solves involving established disease genes (157/218, 72.0%). Of all solved cases, 148 (67.9%) had previously had non-diagnostic ES. We systematically evaluated the 218 causal variants for features requiring GS to identify and 61/218 (28.0%) met these criteria, representing 8.2% of the entire cohort. These included small structural variants (13), copy neutral inversions and complex rearrangements (8), tandem repeat expansions (6), deep intronic variants (15), and coding variants that may be more easily found using GS related to uniformity of coverage (19). Conclusion: We describe the diagnostic yield of GS in a large and diverse cohort, illustrating several types of pathogenic variation eluding ES or other techniques. Our results reveal a higher diagnostic yield of GS, supporting the utility of a genome-first approach, with consideration of GS as a secondary or tertiary test when higher-resolution structural variant analysis is needed or there is a strong clinical suspicion for a condition and prior targeted genetic testing has been negative.
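The breakdown of GS-dependent diagnoses in this abstract can be checked with a few lines of arithmetic. The counts are taken directly from the text above; the dictionary keys and variable names are only illustrative labels.

```python
# Counts of causal variants judged to require GS, as listed in the abstract.
gs_required = {
    "small structural variants": 13,
    "copy-neutral inversions / complex rearrangements": 8,
    "tandem repeat expansions": 6,
    "deep intronic variants": 15,
    "coding variants (coverage uniformity)": 19,
}

cohort_size = 744   # individuals sequenced
solved = 218        # cases solved with GS

n_gs_required = sum(gs_required.values())
share_of_solved = n_gs_required / solved
share_of_cohort = n_gs_required / cohort_size

print(f"GS-dependent diagnoses: {n_gs_required}")
print(f"share of solved cases:  {share_of_solved:.1%}")
print(f"share of whole cohort:  {share_of_cohort:.1%}")
```

The five categories sum to the 61 variants reported, and the two shares match the 28.0% and 8.2% given in the results.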

15.
Neurol Genet ; 8(4): e678, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35923349

ABSTRACT

Objectives: Recently, the number of dinucleotide CA repeats in an intron of the STMN2 gene was reported to be associated with an increased risk for amyotrophic lateral sclerosis (ALS). Therefore, we sought to replicate this observation in an independent group of ALS patients and a much larger control group. Methods: Here, we used whole-genome sequencing and tested the STMN2 CA repeat in a case-control cohort of European genetic background and in genomes from various populations in the gnomAD cohort to attempt to replicate this proposed association. Results: We find that repeats well above the previously reported pathogenic threshold of 19 are commonly observed in unaffected individuals across different populations. Furthermore, we did not observe an association between longer STMN2 CA repeats and ALS phenotype. Discussion: In summary, our results do not support a role for STMN2 CA repeats in ALS risk. As TDP-43 aggregation is central to ALS pathogenesis, lowered expression of STMN2 could be used as a biomarker for ALS. Therefore, a variant associated both with the risk for ALS and the level of STMN2 expression would be clinically useful. However, for a variant to be actionable, it must be strongly replicated in independent cohorts and exceed the rigorous statistical thresholds applied.

16.
Genome Med ; 14(1): 84, 2022 08 11.
Article in English | MEDLINE | ID: mdl-35948990

ABSTRACT

BACKGROUND: Expansions of short tandem repeats are the cause of many neurogenetic disorders including familial amyotrophic lateral sclerosis, Huntington disease, and many others. Multiple methods have been recently developed that can identify repeat expansions in whole genome or exome sequencing data. Despite the widely recognized need for visual assessment of variant calls in clinical settings, current computational tools lack the ability to produce such visualizations for repeat expansions. Expanded repeats are difficult to visualize because they correspond to large insertions relative to the reference genome and involve many misaligning and ambiguously aligning reads. RESULTS: We implemented REViewer, a computational method for visualization of sequencing data in genomic regions containing long repeat expansions and FlipBook, a companion image viewer designed for manual curation of large collections of REViewer images. To generate a read pileup, REViewer reconstructs local haplotype sequences and distributes reads to these haplotypes in a way that is most consistent with the fragment lengths and evenness of read coverage. To create appropriate training materials for onboarding new users, we performed a concordance study involving 12 scientists involved in short tandem repeat research. We used the results of this study to create a user guide that describes the basic principles of using REViewer as well as a guide to the typical features of read pileups that correspond to low confidence repeat genotype calls. Additionally, we demonstrated that REViewer can be used to annotate clinically relevant repeat interruptions by comparing visual assessment results of 44 FMR1 repeat alleles with the results of triplet repeat primed PCR. For 38 of these alleles, the results of visual assessment were consistent with triplet repeat primed PCR. 
CONCLUSIONS: Read pileup plots generated by REViewer offer an intuitive way to visualize sequencing data in regions containing long repeat expansions. Laboratories can use REViewer and FlipBook to assess the quality of repeat genotype calls as well as to visually detect interruptions or other imperfections in the repeat sequence and the surrounding flanking regions. REViewer and FlipBook are available under open-source licenses at https://github.com/illumina/REViewer and https://github.com/broadinstitute/flipbook respectively.


Subjects
Amyotrophic Lateral Sclerosis , Tandem Repeat Sequences , Alleles , Amyotrophic Lateral Sclerosis/genetics , Exome , Fragile X Mental Retardation Protein/genetics , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Humans
17.
Hum Mutat ; 43(6): 698-707, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35266241

ABSTRACT

Exome and genome sequencing have become the tools of choice for rare disease diagnosis, leading to large amounts of data available for analyses. To identify causal variants in these datasets, powerful filtering and decision support tools that can be efficiently used by clinicians and researchers are required. To address this need, we developed seqr - an open-source, web-based tool for family-based monogenic disease analysis that allows researchers to work collaboratively to search and annotate genomic callsets. To date, seqr is being used in several research pipelines and one clinical diagnostic lab. In our own experience through the Broad Institute Center for Mendelian Genomics, seqr has enabled analyses of over 10,000 families, supporting the diagnosis of more than 3,800 individuals with rare disease and discovery of over 300 novel disease genes. Here, we describe a framework for genomic analysis in rare disease that leverages seqr's capabilities for variant filtration, annotation, and causal variant identification, as well as support for research collaboration and data sharing. The seqr platform is available as open source software, allowing low-cost participation in rare disease research, and a community effort to support diagnosis and gene discovery in rare disease.


Subjects
Genomics , Rare Diseases , Exome , Humans , Internet , Rare Diseases/diagnosis , Rare Diseases/genetics , Software
20.
Nat Commun ; 12(1): 3505, 2021 Jun 09.
Article in English | MEDLINE | ID: mdl-34108472

ABSTRACT

Hundreds of thousands of genetic variants have been reported to cause severe monogenic diseases, but the probability that a variant carrier develops the disease (termed penetrance) is unknown for virtually all of them. Additionally, the clinical utility of common polygenetic variation remains uncertain. Using exome sequencing from 77,184 adult individuals (38,618 multi-ancestral individuals from a type 2 diabetes case-control study and 38,566 participants from the UK Biobank, for whom genotype array data were also available), we apply clinical standard-of-care gene variant curation for eight monogenic metabolic conditions. Rare variants causing monogenic diabetes and dyslipidemias display effect sizes significantly larger than the top 1% of the corresponding polygenic scores. Nevertheless, penetrance estimates for monogenic variant carriers average 60% or lower for most conditions. We assess epidemiologic and genetic factors contributing to risk prediction in monogenic variant carriers, demonstrating that inclusion of polygenic variation significantly improves biomarker estimation for two monogenic dyslipidemias.


Subjects
Diabetes Mellitus, Type 2/genetics , Dyslipidemias/genetics , Genetic Predisposition to Disease/genetics , Adult , Biological Variation, Population , Biomarkers/metabolism , Diabetes Mellitus, Type 2/metabolism , Dyslipidemias/metabolism , Exome/genetics , Genotype , Humans , Multifactorial Inheritance , Penetrance , Risk Assessment
...