ABSTRACT
MOTIVATION: In the modern era of genomic research, the scientific community is witnessing an explosive growth in the volume of published findings. While this abundance of data offers invaluable insights, it also places a pressing responsibility on genetic professionals and researchers to stay informed about the latest findings and their clinical significance. Genomic variant interpretation currently faces the challenge of identifying the most up-to-date and relevant scientific papers while also extracting meaningful information to accelerate the process from clinical assessment to reporting. Computer-aided literature search and summarization can play a pivotal role in this context. By synthesizing complex genomic findings into concise, interpretable summaries, this approach facilitates the translation of extensive genomic datasets into clinically relevant insights. RESULTS: To bridge this gap, we present VarChat (varchat.engenome.com), an innovative tool based on generative AI, developed to find and summarize the fragmented scientific literature associated with genomic variants into brief yet informative texts. VarChat provides users with a concise description of specific genetic variants, detailing their impact on related proteins and possible effects on human health. In addition, VarChat offers direct links to related, trustworthy scientific sources and encourages deeper research. AVAILABILITY AND IMPLEMENTATION: varchat.engenome.com.
Subject(s)
Genetic Variation, Human Genome, Genomics, Humans, Genomics/methods, Software, Artificial Intelligence, Genetic Databases
ABSTRACT
BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics: a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants.
Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance were highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria are needed.
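The maximum F-measure mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the balanced F1 form of the F-measure and sweeps every distinct EPCR value as a threshold; the abstract does not specify the exact formula used by the assessors, and the function name, variant identifiers, and input layout below are hypothetical.

```python
def max_f_measure(predictions, causal):
    """Return the best F1 over all EPCR thresholds.

    predictions: list of (variant_id, epcr) pairs submitted by a model.
    causal: set of variant ids known to be causal (assumed non-empty).
    Assumption: F-measure is the balanced F1; the challenge may differ.
    """
    # Every distinct EPCR value is tried as a cutoff.
    thresholds = sorted({epcr for _, epcr in predictions}, reverse=True)
    best = 0.0
    for t in thresholds:
        # Variants at or above the cutoff count as "called causal".
        called = {v for v, epcr in predictions if epcr >= t}
        true_pos = len(called & causal)
        if true_pos == 0:
            continue  # precision and recall are both zero here
        precision = true_pos / len(called)
        recall = true_pos / len(causal)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy example with made-up variants; v1 and v3 are the causal ones.
toy_predictions = [("v1", 0.9), ("v2", 0.8), ("v3", 0.4), ("v4", 0.1)]
toy_causal = {"v1", "v3"}
best_f = max_f_measure(toy_predictions, toy_causal)
```

In this toy case the best trade-off is reached at the 0.4 cutoff, where both causal variants are recovered at the cost of one false positive.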
Subject(s)
Rare Diseases, Humans, Rare Diseases/genetics, Rare Diseases/diagnosis, Human Genome/genetics, Genetic Variation/genetics, Computational Biology/methods, Phenotype
ABSTRACT
The Critical Assessment of Genome Interpretation-5 intellectual disability challenge asked participants to use computational methods to predict patient clinical phenotypes and the causal variant(s) based on an analysis of their gene panel sequence data. Sequence data for 74 genes associated with intellectual disability (ID) and/or autism spectrum disorders (ASD) from a cohort of 150 patients with a range of neurodevelopmental manifestations (i.e. ID, autism, epilepsy, microcephaly, macrocephaly, hypotonia, ataxia) were made available for this challenge. For each patient, predictors had to report the causative variants and which of the seven phenotypes were present. Since neurodevelopmental disorders are characterized by strong comorbidity, tested individuals often present more than one pathological condition. Considering the overall clinical manifestation of each patient, the correct phenotype was predicted by at least one group for 93 individuals (62%). ID and ASD were the best predicted among the seven phenotypic traits. Also, causative or potentially pathogenic variants were predicted correctly by at least one group. However, the prediction of the correct causative variant seems to be insufficient to predict the correct phenotype. In some cases, the correct prediction was supported by rare or common variants in genes different from the causative one.
Subject(s)
Autism Spectrum Disorder/genetics, Computational Biology/methods, Intellectual Disability/genetics, DNA Sequence Analysis/methods, Female, Genetic Predisposition to Disease, Humans, Male, Phenotype, Quantitative Trait Loci
ABSTRACT
Variant interpretation for the diagnosis of genetic diseases is a complex process. The American College of Medical Genetics and Genomics, with the Association for Molecular Pathology, has proposed a set of evidence-based guidelines to support variant pathogenicity assessment and reporting in Mendelian diseases. Cardiovascular disorders are a field of application of these guidelines, but practical implementation is challenging due to the genetic heterogeneity of these diseases and the complexity of the information sources that need to be integrated. Decision support systems able to automate variant interpretation in the light of specific disease domains are needed. We implemented CardioVAI (Cardio Variant Interpreter), an automated system for guidelines-based variant classification in cardiovascular-related genes. Different omics resources were integrated to assess the pathogenicity of every genomic variant in 72 cardiovascular disease-related genes. We validated our method on benchmark datasets of high-confidence assessed variants, reaching pathogenicity and benignity concordance of up to 83% and 97.08%, respectively. We compared CardioVAI to similar methods and analyzed the main differences in terms of guidelines implementation. Finally, we made CardioVAI available as a web resource (http://cardiovai.engenome.com/) that allows users to further specialize guidelines recommendations.
Subject(s)
Cardiovascular Diseases/genetics, Genetic Variation, Medical Societies/organization & administration, Evidence-Based Practice, Genetic Testing, Humans, Practice Guidelines as Topic, Software
ABSTRACT
BACKGROUND: High-throughput sequencing technologies are able to identify the whole genomic variation of an individual. Gene-targeted and whole-exome experiments are mainly focused on coding sequence variants affecting single or multiple nucleotides. The analysis of the biological significance of this multitude of genomic variants is challenging and computationally demanding. RESULTS: We present PaPI, a new machine-learning approach to classify and score human coding variants by estimating the probability that they damage protein function. The novelty of this approach consists in using pseudo amino acid composition, through which wild-type and mutated protein sequences are represented in a discrete model. A machine learning classifier has been trained on a set of known deleterious and benign coding variants with the aim of scoring unobserved variants by taking into account hidden sequence patterns in the human genome potentially leading to disease. We show that the combination of amphiphilic pseudo amino acid composition, evolutionary conservation and homologous protein-based methods outperforms several prediction algorithms and is also able to score complex variants such as deletions, insertions and indels. CONCLUSIONS: This paper describes a machine-learning approach to predict the deleteriousness of human coding variants. A freely available web application (http://papi.unipv.it) has been developed with the presented method, able to score up to thousands of variants in a single run.
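To make the idea of a discrete sequence representation concrete, the following is a simplified sketch of a pseudo amino acid composition vector in the spirit of Chou's general formulation: 20 normalized residue frequencies followed by a few sequence-order correlation terms. It is not PaPI's implementation. The use of a single property (the Kyte-Doolittle hydropathy scale), the parameters `lam` and `weight`, and the toy peptide are all assumptions; the amphiphilic variant named in the abstract uses hydrophobicity and hydrophilicity correlation terms instead.

```python
import math

# Kyte-Doolittle hydropathy values (an assumed property scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KD)


def _standardize(scale):
    """Rescale property values to zero mean and unit variance."""
    values = list(scale.values())
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {aa: (v - mean) / sd for aa, v in scale.items()}


H = _standardize(KD)


def pseudo_aac(seq, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional vector for a protein sequence.

    seq must contain only the 20 standard residues and be longer than lam.
    """
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    # Conventional composition: fraction of each residue type.
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]
    # Sequence-order terms: mean squared property difference at lag k.
    thetas = [
        sum((H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(n - k)) / (n - k)
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


# Toy wild-type and mutated peptides (hypothetical, last residue R -> W).
wild_vec = pseudo_aac("MKTAYIAKQR")
mutant_vec = pseudo_aac("MKTAYIAKQW")
```

A classifier could then be trained on the difference between the wild-type and mutated vectors, together with conservation features, although the abstract does not say how PaPI combines them.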
Subject(s)
Algorithms, Amino Acids/genetics, Artificial Intelligence, Genetic Variation/genetics, Human Genome, Proteins/genetics, Software, Exome/genetics, High-Throughput Nucleotide Sequencing, Humans
ABSTRACT
BACKGROUND: Precision medicine requires the tight integration of clinical and molecular data. To this end, it is mandatory to define proper technological solutions able to manage the overwhelming amount of high-throughput genomic data needed to test associations between genomic signatures and human phenotypes. The i2b2 Center (Informatics for Integrating Biology and the Bedside) has developed a framework, widely adopted internationally, to use existing clinical data for discovery research that can help the definition of precision medicine interventions when coupled with genetic data. i2b2 can be significantly advanced by designing efficient management solutions for Next Generation Sequencing data. RESULTS: We developed BigQ, an extension of the i2b2 framework, which integrates patient clinical phenotypes with genomic variant profiles generated by Next Generation Sequencing. A visual programming i2b2 plugin allows retrieving variants belonging to the patients in a cohort by applying filters on genomic variant annotations. We report an evaluation of the query performance of our system on more than 11 million variants, showing that the implemented solution scales linearly in terms of query time and disk space with the number of variants. CONCLUSIONS: In this paper we describe a new i2b2 web service composed of an efficient and scalable document-based database that manages annotations of genomic variants and of a visual programming plug-in designed to dynamically perform queries on clinical and genetic data. The system therefore allows managing the fast-growing volume of genomic variants and can be used to integrate heterogeneous genomic annotations.
Subject(s)
Genomics, Software, Factual Databases, High-Throughput Nucleotide Sequencing, Humans, Information Storage and Retrieval
ABSTRACT
The diagnosis of VACTERL syndrome can be elusive, especially in prenatal life, due to the presence of malformations that overlap those present in other genetic conditions, including Fanconi anemia (FA). We report on three VACTERL cases within two families, where the two liveborn infants died shortly after birth due to severe organ malformations. The suspicion of VACTERL association was based on prenatal ultrasound assessment and postnatal features. Subsequent chromosome breakage analysis suggested the diagnosis of FA. Finally, by next-generation sequencing based on the analysis of the exome in one family and of a panel of Fanconi genes in the second one, we identified novel FANCL truncating mutations in both families. We used ectopic expression of wild-type FANCL to functionally correct the cellular FA phenotype for both mutations. Our study emphasizes that the diagnosis of FA should be considered when VACTERL association is suspected. Furthermore, we show that loss-of-function mutations in FANCL result in a severe clinical phenotype characterized by early postnatal death.
Subject(s)
Anal Canal/abnormalities, Esophagus/abnormalities, Fanconi Anemia Complementation Group L Protein/genetics, Fanconi Anemia/diagnosis, Fanconi Anemia/genetics, Congenital Heart Defects/diagnosis, Congenital Heart Defects/genetics, Kidney/abnormalities, Congenital Limb Deformities/diagnosis, Congenital Limb Deformities/genetics, Mutation, Phenotype, Spine/abnormalities, Trachea/abnormalities, Induced Abortion, Chromosome Breakage, Differential Diagnosis, Exome, Female, High-Throughput Nucleotide Sequencing, Humans, Newborn Infant, Live Birth, Male, Pregnancy, Prenatal Diagnosis, Severity of Illness Index
ABSTRACT
Autosomal dominant dehydrated hereditary stomatocytosis (DHSt) usually presents as a compensated hemolytic anemia with macrocytosis and abnormally shaped red blood cells (RBCs). DHSt is part of a pleiotropic syndrome that may also exhibit pseudohyperkalemia and perinatal edema. We identified PIEZO1 as the disease gene for pleiotropic DHSt in a large kindred by exome sequencing analysis within the previously mapped 16q23-q24 interval. In 26 affected individuals among 7 multigenerational DHSt families with the pleiotropic syndrome, 11 heterozygous PIEZO1 missense mutations cosegregated with disease. PIEZO1 is expressed in the plasma membranes of RBCs, and its messenger RNA and protein levels increase during in vitro erythroid differentiation of CD34(+) cells. PIEZO1 is also expressed in liver and bone marrow during human and mouse development. We suggest for the first time a correlation between a PIEZO1 mutation and perinatal edema. DHSt patient red cells with the R2456H mutation exhibit increased ion-channel activity. Functional studies of PIEZO1 mutant R2488Q expressed in Xenopus oocytes demonstrated changes in ion-channel activity consistent with the altered cation content of DHSt patient red cells. Our findings provide direct evidence that R2456H and R2488Q mutations in PIEZO1 alter mechanosensitive channel regulation, leading to increased cation transport in erythroid cells.
Subject(s)
Congenital Hemolytic Anemia/genetics, Hydrops Fetalis/genetics, Ion Channels/genetics, Mutation, Adult, Amino Acid Sequence, Congenital Hemolytic Anemia/classification, Congenital Hemolytic Anemia/diagnosis, Animals, Mammalian Embryo, Female, Developmental Gene Expression Regulation, Humans, Hydrops Fetalis/classification, Hydrops Fetalis/diagnosis, Mice, Transgenic Mice, Biological Models, Molecular Sequence Data, Mutation/physiology, Pedigree, Amino Acid Sequence Homology, Transfection, Xenopus laevis
ABSTRACT
COL4A1 is located in humans on chromosome 13q34 and encodes the alpha 1 chain of type IV collagen, a component of basement membranes. It is expressed mainly in the brain, muscles, kidneys and eyes. Different COL4A1 mutations have been reported in many patients who present a very wide spectrum of clinical symptoms. They typically show a multisystemic phenotype. Here we report on the case of a patient carrying a novel de novo splicing mutation of COL4A1 associated with a distinctive clinical picture characterized by onset in infancy and an unusual evolution of the neuroradiological features. At three months of age, the child was diagnosed with a congenital cataract, while his brain MRI was normal. Over the following years, the patient developed focal epilepsy, mild diplegia, asymptomatic microhematuria, raised creatine kinase levels, MRI white matter abnormalities and brain calcification on CT. During the neuroradiological follow-up, the extension and intensity of the brain lesions progressively decreased. The significance of a second variant in COL4A1 carried by the child and inherited from his father remains to be clarified. In conclusion, our patient shows new aspects of this collagenopathy and possibly a COL4A1 compound heterozygosity.
Subject(s)
Multiple Abnormalities/diagnostic imaging, Cerebral Palsy/diagnosis, Collagen Type IV/genetics, Multiple Abnormalities/genetics, Base Sequence, Cerebral Palsy/genetics, Child, DNA Mutational Analysis, Genetic Association Studies, Humans, Male, Mutation, Radiography, White Matter/abnormalities, White Matter/diagnostic imaging
ABSTRACT
Genotyping Next Generation Sequencing (NGS) data of a diploid genome aims to assign the zygosity of identified variants through comparison with a reference genome. Current methods typically employ probabilistic models that rely on the pileup of bases at each locus and on a priori knowledge. We present a new algorithm, called Kimimila (KInetic Modeling based on InforMation theory to Infer Labels of Alleles), which is able to assign reads to alleles by using a distance geometry approach and to infer the variant genotypes accurately, without any kind of assumption. The performance of the model has been assessed on simulated and real data of the 1000 Genomes Project and the results have been compared with several commonly used genotyping methods, i.e., GATK, Samtools, VarScan, FreeBayes and Atlas2. Although our algorithm does not make use of a priori knowledge, the percentage of correctly genotyped variants is comparable to these algorithms. Furthermore, our method allows the user to split the read pool depending on the inferred allele origin.
Subject(s)
Alleles, High-Throughput Nucleotide Sequencing/methods, Algorithms, Bayes Theorem, Cluster Analysis, Computational Biology, Computer Simulation, Data Collection/methods, Genetic Variation, Genome, Genomics, Genotype, Haplotypes, Humans, Kinetics, Statistical Models, Single Nucleotide Polymorphism, Software
ABSTRACT
BACKGROUND: Sudden death is the leading cause of mortality in medically refractory epilepsy. Middle-aged persons with epilepsy (PWE) are underinvestigated regarding their mortality risk and burden of cardiovascular disease (CVD). METHODS: Using UK Biobank, we identified 7786 (1.6%) participants with diagnoses of epilepsy and 6,171,803 person-years of follow-up (mean 12.30 years, standard deviation 1.74); 566 patients with previous histories of stroke were excluded. The 7220 PWE comprised the study cohort with the remaining 494,676 without epilepsy as the comparator group. Prevalence of CVD was determined using validated diagnostic codes. Cox proportional hazards regression was used to assess all-cause mortality and sudden death risk. RESULTS: Hypertension, coronary artery disease, heart failure, valvular heart disease, and congenital heart disease were more prevalent in PWE. Arrhythmias including atrial fibrillation/flutter (12.2% vs 6.9%; P < 0.01), bradyarrhythmias (7.7% vs 3.5%; P < 0.01), conduction defects (6.1% vs 2.6%; P < 0.01), and ventricular arrhythmias (2.3% vs 1.0%; P < 0.01), as well as cardiac implantable electric devices (4.6% vs 2.0%; P < 0.01) were more prevalent in PWE. PWE had higher adjusted all-cause mortality (hazard ratio [HR], 3.9; 95% confidence interval [CI], 3.01-3.39), and sudden death-specific mortality (HR, 6.65; 95% CI, 4.53-9.77); and were almost 2 years younger at death (68.1 vs 69.8; P < 0.001). CONCLUSIONS: Middle-aged PWE have increased all-cause and sudden death-specific mortality and higher burden of CVD including arrhythmias and heart failure. Further work is required to elucidate mechanisms underlying all-cause mortality and sudden death risk in PWE of middle age, to identify prognostic biomarkers and develop preventative therapies in PWE.
Subject(s)
Cardiovascular Diseases, Epilepsy, Heart Failure, Middle Aged, Humans, Cardiovascular Diseases/epidemiology, UK Biobank, Biological Specimen Banks, Risk Factors, Epilepsy/complications, Epilepsy/epidemiology, Sudden Death/epidemiology, Sudden Death/etiology, Sudden Cardiac Death/epidemiology, Sudden Cardiac Death/etiology
ABSTRACT
Background: A major obstacle faced by rare disease families is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years, and causal variants are identified in under 50%. The Rare Genomes Project (RGP) is a direct-to-participant research study on the utility of genome sequencing (GS) for diagnosis and gene discovery. Families are consented for sharing of sequence and phenotype data with researchers, allowing development of a Critical Assessment of Genome Interpretation (CAGI) community challenge, placing variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: Predictors were provided a dataset of phenotype terms and variant calls from GS of 175 RGP individuals (65 families), including 35 solved training set families, with causal variants specified, and 30 test set families (14 solved, 16 unsolved). The challenge tasked teams with identifying the causal variants in as many test set families as possible. Ranked variant predictions were submitted with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on rank position of true positive causal variants and maximum F-measure, based on precision and recall of causal variants across EPCR thresholds. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performing teams recalled the causal variants in up to 13 of 14 solved families by prioritizing high quality variant calls that were rare, predicted deleterious, segregating correctly, and consistent with reported phenotype. In unsolved families, newly discovered diagnostic variants were returned to two families following confirmatory RNA sequencing, and two prioritized novel disease gene candidates were entered into Matchmaker Exchange. 
In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant, in an unsolved proband with phenotype overlap with asparagine synthetase deficiency. Conclusions: By objective assessment of variant predictions, we provide insights into current state-of-the-art algorithms and platforms for genome sequencing analysis for rare disease diagnosis and explore areas for future optimization. Identification of diagnostic variants in unsolved families promotes synergy between researchers with clinical and computational expertise as a means of advancing the field of clinical genome interpretation.
ABSTRACT
Genomic variant interpretation is a critical step of the diagnostic procedure, often supported by the application of tools that may predict the damaging impact of each variant or provide a guidelines-based classification. We propose the application of Machine Learning methodologies, in particular Penalized Logistic Regression, to support variant classification and prioritization. Our approach combines ACMG/AMP guidelines for germline variant interpretation as well as variant annotation features and provides a probabilistic score of pathogenicity, thus supporting the prioritization and classification of variants that would be interpreted as uncertain by the ACMG/AMP guidelines. We compared different approaches in terms of variant prioritization and classification on different datasets, showing that our data-driven approach is able to solve more variant of uncertain significance (VUS) cases in comparison with guidelines-based approaches and in silico prediction tools.
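As a rough illustration of how a penalized logistic regression can turn guideline evidence into a probabilistic pathogenicity score, here is a minimal pure-Python sketch. It assumes an L2 (ridge) penalty fitted by batch gradient descent and uses two hypothetical features per variant (counts of triggered pathogenic and benign criteria). The abstract does not state the penalty type, the feature set, or the fitting procedure, so none of this should be read as the authors' model.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_penalized_logreg(X, y, l2=0.1, lr=0.5, epochs=2000):
    """Fit weights by gradient descent on log-loss plus an L2 penalty.

    X: list of feature rows, y: list of 0/1 labels (1 = pathogenic).
    The intercept is left unpenalized, a common convention.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            # The l2 * w[j] term shrinks coefficients toward zero.
            w[j] -= lr * (grad_w[j] / n + l2 * w[j])
        b -= lr * grad_b / n
    return w, b


def pathogenicity_score(x, w, b):
    """Probability-like score in (0, 1) for one variant."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical training data: [pathogenic criteria, benign criteria].
X_train = [[2, 0], [3, 0], [1, 0], [0, 2], [0, 3], [0, 1]]
y_train = [1, 1, 1, 0, 0, 0]
w, b = fit_penalized_logreg(X_train, y_train)

# A variant with conflicting evidence gets an intermediate score.
score_path = pathogenicity_score([2, 0], w, b)
score_benign = pathogenicity_score([0, 2], w, b)
score_mixed = pathogenicity_score([1, 1], w, b)
```

Because the output is continuous, variants that a rule-based scheme would leave as uncertain can still be ranked against each other, which is the prioritization use the abstract describes.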
Subject(s)
Genetic Predisposition to Disease, Genetic Variation, Machine Learning, Neoplasms/genetics, Practice Guidelines as Topic, Bayes Theorem, Cohort Studies, Computer Simulation, Genetic Testing/methods, Human Genome, Genomics/methods, High-Throughput Nucleotide Sequencing/methods, Humans, Logistic Models, Neoplasms/diagnosis, Research Design, Software
ABSTRACT
Introduction. Shwachman-Diamond Syndrome (SDS) is an autosomal-recessive disorder characterized by neutropenia, pancreatic exocrine insufficiency, skeletal dysplasia, and an increased risk for leukemic transformation. Biallelic mutations in the SBDS gene have been found in about 90% of patients. The clinical spectrum of SDS in patients is wide, and variability has been noticed between different patients, siblings, and even within the same patient over time. Herein, we present two SDS siblings (UPN42 and UPN43) carrying the same SBDS mutations and showing relevant differences in their phenotypic presentation. Study aim. We attempted to understand whether other germline variants, in addition to SBDS, could explain some of the clinical variability noticed between the siblings. Methods. Whole-exome sequencing (WES) was performed. Human Phenotype Ontology (HPO) terms were defined for each patient, and the WES data were analyzed using the eVai and DIVAs platforms. Results. In UPN43, we found and confirmed, using Sanger sequencing, a novel de novo variant (c.10663G>A, p.Gly3555Ser) in the KMT2A gene that is associated with autosomal-dominant Wiedemann-Steiner Syndrome. The variant is classified as pathogenic according to different in silico prediction tools. Interestingly, it was found to be related to some of the HPO terms that describe UPN43. Conclusions. We postulate that the KMT2A variant found in UPN43 has a concomitant and co-occurring clinical effect, in addition to SBDS mutation. This dual molecular effect, supported by in silico prediction, could help to understand some of the clinical variations found among the siblings. In the future, these new data are likely to be useful for personalized medicine and therapy for selected cases.
Subject(s)
Bone Marrow Diseases, Exocrine Pancreatic Insufficiency, Histone-Lysine N-Methyltransferase, Myeloid-Lymphoid Leukemia Protein, Shwachman-Diamond Syndrome, Population Biological Variation, Bone Marrow Diseases/genetics, Exocrine Pancreatic Insufficiency/genetics, Histone-Lysine N-Methyltransferase/genetics, Humans, Myeloid-Lymphoid Leukemia Protein/genetics, Shwachman-Diamond Syndrome/genetics, Siblings
ABSTRACT
Meier-Gorlin syndrome (MGORS) is a rare disorder characterized by primordial dwarfism, microtia, and patellar aplasia/hypoplasia. Recessive mutations in ORC1, ORC4, ORC6, CDT1, CDC6, and CDC45, encoding members of the pre-replication (pre-RC) and pre-initiation (pre-IC) complexes, and heterozygous mutations in GMNN, a regulator of cell-cycle progression and DNA replication, have already been associated with this condition. We performed whole-exome sequencing (WES) in a patient with a clinical diagnosis of MGORS and identified biallelic variants in MCM5. This gene encodes a subunit of the replicative helicase complex, which represents a component of the pre-RC. Both variants, a missense substitution within a conserved domain critical for helicase activity and a single-base deletion causing a frameshift and a premature stop codon, were predicted to be detrimental to MCM5 function. Although variants of MCM5 have never been reported in specific human diseases, a defect of this gene in zebrafish causes a growth restriction phenotype overlapping the one associated with orc1 depletion. Complementation experiments in yeast showed that the plasmid carrying the missense variant was unable to rescue the lethal phenotype caused by mcm5 deletion. Moreover, cell-cycle progression was delayed in the patient's cells, as already shown for mutations in the ORC1 gene. Altogether, our findings support the role of MCM5 as a novel gene involved in MGORS, further emphasizing that this condition is caused by impaired DNA replication.
Subject(s)
Cell Cycle Proteins/genetics, Congenital Microtia/genetics, Growth Disorders/genetics, Micrognathism/genetics, Patella/abnormalities, Cell Cycle Proteins/metabolism, Cultured Cells, Child, Nonsense Codon, Congenital Microtia/diagnosis, DNA Replication, Exome, Genetic Complementation Test, Growth Disorders/diagnosis, Humans, INDEL Mutation, Male, Micrognathism/diagnosis, Missense Mutation, Saccharomyces cerevisiae/genetics
ABSTRACT
Among the scientific challenges posed by complex diseases with a strong genetic component, two stand out. One is unveiling the role of rare and common genetic variants; the other is the design of classification models to improve clinical diagnosis and predictive models for prognosis and personalized therapies. In this paper, we present a data fusion framework merging gene, domain, pathway and protein-protein interaction data related to a next generation sequencing epilepsy gene panel. Our method allows integrating association information from multiple genomic sources and aims at highlighting the set of common and rare variants that are capable of triggering the occurrence of a complex disease. When compared to other approaches, our method shows better performance in classifying patients affected by epilepsy.
Subject(s)
Computational Biology/methods, Epilepsy/genetics, Genetic Association Studies/methods, Algorithms, Gene Regulatory Networks, Genetic Predisposition to Disease, Genetic Variation, High-Throughput Nucleotide Sequencing/methods, Humans, Protein Interaction Maps
ABSTRACT
PURPOSE: The genetic basis of myelodysplastic syndromes (MDS) is heterogeneous, and various combinations of somatic mutations are associated with different clinical phenotypes and outcomes. Whether the genetic basis of MDS influences the outcome of allogeneic hematopoietic stem-cell transplantation (HSCT) is unclear. PATIENTS AND METHODS: We studied 401 patients with MDS or acute myeloid leukemia (AML) evolving from MDS (MDS/AML). We used massively parallel sequencing to examine tumor samples collected before HSCT for somatic mutations in 34 recurrently mutated genes in myeloid neoplasms. We then analyzed the impact of mutations on the outcome of HSCT. RESULTS: Overall, 87% of patients carried one or more oncogenic mutations. Somatic mutations of ASXL1, RUNX1, and TP53 were independent predictors of relapse and overall survival after HSCT in both patients with MDS and patients with MDS/AML (P values ranging from .003 to .035). In patients with MDS/AML, gene ontology (ie, secondary-type AML carrying mutations in genes of RNA splicing machinery, TP53-mutated AML, or de novo AML) was an independent predictor of posttransplantation outcome (P = .013). The impact of ASXL1, RUNX1, and TP53 mutations on posttransplantation survival was independent of the revised International Prognostic Scoring System (IPSS-R). Combining somatic mutations and IPSS-R risk improved the ability to stratify patients by capturing more prognostic information at an individual level. Accounting for various combinations of IPSS-R risk and somatic mutations, the 5-year probability of survival after HSCT ranged from 0% to 73%. CONCLUSION: Somatic mutation in ASXL1, RUNX1, or TP53 is independently associated with unfavorable outcomes and shorter survival after allogeneic HSCT for patients with MDS and MDS/AML. Accounting for these genetic lesions may improve the prognostication precision in clinical practice and in designing clinical trials.
ABSTRACT
We analyzed 67 epilepsy genes by next-generation sequencing (NGS) in 19 patients with different types of either isolated or syndromic epileptic disorders and in 15 controls to investigate whether a quick and cheap molecular diagnosis could be provided. The average number of nonsynonymous and splice site mutations per subject was similar in the two cohorts, indicating that, even with relatively small targeted platforms, finding the disease gene is not a univocal process. Our diagnostic yield was 47%, with nine cases in which we identified a very likely causative mutation. In most of them no interpretation would have been possible in the absence of detailed phenotype and familial information. Seven out of 19 patients had a phenotype suggesting the involvement of a specific gene. Disease-causing mutations were found in six of these cases. Among the remaining patients, we could find a probably causative mutation only in three. None of the genes affected in the latter cases had been suspected a priori. Our protocol requires 8-10 weeks, including the investigation of the parents, with a cost per patient comparable to sequencing of 1-2 medium-to-large-sized genes by conventional techniques. The platform we used, although providing much less information than whole-exome or whole-genome sequencing, has the advantage that it can also be run on 'benchtop' sequencers, combining rapid turnaround times with higher manageability.
Subject(s)
Epilepsy/diagnosis , Epilepsy/genetics , Genetic Testing , High-Throughput Nucleotide Sequencing , Adolescent , Adult , Case-Control Studies , Child , Child, Preschool , Computational Biology , Female , Follow-Up Studies , Genetic Association Studies , Humans , Infant , Male , Mutation , Workflow , Young Adult
ABSTRACT
Coenzyme Q10 deficiency is a clinically and genetically heterogeneous disorder, with manifestations that may range from fatal neonatal multisystem failure to adult-onset encephalopathy. We report a patient who presented at birth with severe lactic acidosis, proteinuria, dicarboxylic aciduria, and hepatic insufficiency. She also had dilation of the left ventricle on echocardiography. Her neurological condition rapidly worsened and, despite aggressive care, she died at 23 h of life. Muscle histology displayed lipid accumulation. Electron microscopy showed markedly swollen mitochondria with fragmented cristae. Respiratory-chain enzymatic assays showed a reduction of combined activities of complex I+III and II+III with normal activities of isolated complexes. The defect was confirmed in fibroblasts, where it could be rescued by supplementing the culture medium with 10 µM coenzyme Q10. Coenzyme Q10 levels were reduced (28% of controls) in these cells. We performed exome sequencing and focused the analysis on genes involved in coenzyme Q10 biosynthesis. The patient harbored a homozygous c.545T>G, p.(Met182Arg) alteration in COQ2, which was validated by functional complementation in yeast. In this case the biochemical and morphological features were essential to direct the genetic diagnosis. The parents had another pregnancy after the biochemical diagnosis was established, but before the identification of the genetic defect. Because of the potentially high recurrence risk, and given the importance of early CoQ10 supplementation, we decided to treat the newborn child with CoQ10 pending the results of the biochemical assays. Clinicians should consider similar management in siblings of patients with CoQ10 deficiency without a genetic diagnosis.
Subject(s)
Alkyl and Aryl Transferases/genetics , Ataxia/diagnosis , Ataxia/genetics , Mitochondria, Muscle/genetics , Mitochondrial Diseases/diagnosis , Mitochondrial Diseases/genetics , Muscle Weakness/diagnosis , Muscle Weakness/genetics , Point Mutation , Ubiquinone/analogs & derivatives , Ubiquinone/deficiency , Acidosis, Lactic/blood , Acidosis, Lactic/genetics , Acidosis, Lactic/pathology , Alkyl and Aryl Transferases/deficiency , Ataxia/blood , Ataxia/pathology , Consanguinity , Fatal Outcome , Female , Gene Expression , Hepatic Insufficiency/blood , Hepatic Insufficiency/genetics , Hepatic Insufficiency/pathology , Humans , Infant, Newborn , Intellectual Disability/blood , Intellectual Disability/genetics , Intellectual Disability/pathology , Mitochondria, Muscle/enzymology , Mitochondria, Muscle/pathology , Mitochondrial Diseases/blood , Mitochondrial Diseases/pathology , Muscle Weakness/blood , Muscle Weakness/pathology , Muscle, Skeletal/enzymology , Muscle, Skeletal/pathology , Proteinuria/blood , Proteinuria/genetics , Proteinuria/pathology , Renal Aminoacidurias/blood , Renal Aminoacidurias/genetics , Renal Aminoacidurias/pathology , Sequence Analysis, DNA , Ubiquinone/blood , Ubiquinone/genetics
ABSTRACT
OBJECTIVE: To investigate the molecular defect underlying a large Italian kindred with progressive adult-onset respiratory failure, proximal weakness of the upper limbs, and evidence of lower motor neuron degeneration. METHODS: We describe the clinical features of 5 patients presenting with prominent respiratory insufficiency, proximal weakness of the upper limbs, and no signs of frontotemporal lobar degeneration or semantic dementia. Molecular analysis was performed combining linkage and exome sequencing analyses. Further investigations included transcript analysis and immunocytochemical and protein studies on established cell models. RESULTS: Genome-wide linkage analysis showed an association with chromosome 17q21. Exome analysis disclosed a missense change in MAPT segregating dominantly with the disease and resulting in D348G-mutated tau protein. Motor neuron cell lines overexpressing mutated D348G tau isoforms displayed a consistent reduction in neurite length and arborization. The mutation does not seem to modify tau interactions with microtubules. Neuropathologic studies were performed in one affected subject, who exhibited α-motoneuron loss and atrophy of the spinal anterior horns with accumulation of phosphorylated tau within the surviving motor neurons. Staining for 3R- and 4R-tau revealed pathology similar to that observed in familial cases harboring MAPT mutations. CONCLUSION: Our study broadens the phenotype of tauopathies to include lower motor neuron disease and implicates tau degradation pathway defects in motor neuron degeneration.