ABSTRACT
Genomic technologies such as next-generation sequencing (NGS) are revolutionizing molecular diagnostics and clinical medicine. However, these approaches have proven inefficient at identifying pathogenic repeat expansions. Here, we apply a collection of bioinformatics tools that can be utilized to identify either known or novel expanded repeat sequences in NGS data. We performed genetic studies of a cohort of 35 individuals from 22 families with a clinical diagnosis of cerebellar ataxia with neuropathy and bilateral vestibular areflexia syndrome (CANVAS). Analysis of whole-genome sequence (WGS) data with five independent algorithms identified a recessively inherited intronic repeat expansion [(AAGGG)exp] in the gene encoding Replication Factor C1 (RFC1). This motif, not reported in the reference sequence, localized to an Alu element and replaced the reference (AAAAG)11 short tandem repeat. Genetic analyses confirmed the pathogenic expansion in 18 of 22 CANVAS-affected families and identified a core ancestral haplotype, estimated to have arisen in Europe more than twenty-five thousand years ago. WGS of the four RFC1-negative CANVAS-affected families identified plausible variants in three, with genomic re-diagnosis of SCA3, spastic ataxia of the Charlevoix-Saguenay type, and SCA45. This study identified the genetic basis of CANVAS and demonstrated that these improved bioinformatics tools increase the diagnostic utility of WGS to determine the genetic basis of a heterogeneous group of clinically overlapping neurogenetic disorders.
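As a toy illustration of the motif replacement described above, the following minimal sketch counts the longest uninterrupted tandem run of a motif in a sequence string. It is not one of the five algorithms used in the study, which the abstract does not name. The function name `longest_tandem_run`, the flanking bases, and the copy number of 40 for the expanded allele are all hypothetical; only the (AAAAG)11 reference repeat and the AAGGG motif come from the abstract.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    best = 0
    for match in pattern.finditer(sequence.upper()):
        # Each match is a whole number of motif copies, so integer division is exact.
        best = max(best, len(match.group(0)) // len(motif))
    return best


# Synthetic sequences: the flanks and the expanded copy number are made up.
reference_like = "TTCA" + "AAAAG" * 11 + "GCTA"  # (AAAAG)11, as in the reference
expanded_like = "TTCA" + "AAGGG" * 40 + "GCTA"   # hypothetical (AAGGG)exp allele

for label, seq in [("reference-like", reference_like), ("expanded-like", expanded_like)]:
    print(label, {m: longest_tandem_run(seq, m) for m in ("AAAAG", "AAGGG")})
```

Real short reads rarely span a large expansion, so production tools rely on additional evidence such as read-pair and coverage signals; an exact string match like this only conveys the idea of the motif change.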
Subjects
Cerebellar Ataxia/etiology , Computational Biology/methods , Introns , Microsatellite Repeats , Polyneuropathies/etiology , Replication Protein C/genetics , Sensation Disorders/etiology , Vestibular Diseases/etiology , Algorithms , Cerebellar Ataxia/pathology , Cohort Studies , Family , Female , Genomics , Humans , Male , Middle Aged , Polyneuropathies/pathology , Sensation Disorders/pathology , Syndrome , Vestibular Diseases/pathology , Whole Genome Sequencing
ABSTRACT
A recent meta-analysis of genome-wide association screens, coupled to a replication exercise in a combined US/UK collection, led to the identification of four single nucleotide polymorphisms (SNPs) in three gene loci, i.e. TNFRSF1A, CD6 and IRF8, as novel risk factors for multiple sclerosis (MS) at a genome-wide level of significance. In the present study, using a combined all-Spain collection of 2515 MS patients and 2942 healthy controls, we demonstrate significant association of rs17824933 in CD6 (P(CMH)=0.004; OR=1.14; 95% CI 1.04-1.24) and of rs1860545 in TNFRSF1A (P(CMH)=0.001; OR=1.15; 95% CI 1.06-1.25) with MS, while the low-frequency coding non-synonymous SNP rs4149584 in TNFRSF1A displayed a trend for association (P(CMH)=0.062; OR=1.27; 95% CI 0.99-1.63). These data reinforce a generic role for CD6 and TNFRSF1A in susceptibility to MS, extending to populations of southern European ancestry.
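The reported odds ratios, confidence intervals, and P values can be cross-checked against each other with a short calculation. The sketch below assumes the intervals are symmetric on the log-odds scale and uses a Wald-type normal approximation, which is not the Cochran-Mantel-Haenszel (CMH) test the abstract reports, so only rough agreement should be expected. The function name `wald_p_from_ci` is hypothetical; the numeric inputs are the rounded values quoted in the abstract.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_p_from_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided P value implied by an odds ratio and its 95% CI."""
    # The CI spans 2 * Z_95 standard errors on the log-odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# (OR, CI low, CI high, reported P(CMH)), copied from the abstract.
reported = {
    "rs17824933 (CD6)": (1.14, 1.04, 1.24, 0.004),
    "rs1860545 (TNFRSF1A)": (1.15, 1.06, 1.25, 0.001),
    "rs4149584 (TNFRSF1A)": (1.27, 0.99, 1.63, 0.062),
}

for snp, (or_value, low, high, p_reported) in reported.items():
    p_approx = wald_p_from_ci(or_value, low, high)
    print(f"{snp}: approximate P = {p_approx:.4f}, reported P(CMH) = {p_reported}")
```

Because the inputs are rounded to two decimals, small differences from the published P values are expected and do not by themselves indicate an inconsistency.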