Results 1 - 17 of 17
1.
J Cyst Fibros ; 2024 May 10.
Article in English | MEDLINE | ID: mdl-38734509

ABSTRACT

BACKGROUND: Cystic fibrosis (CF) is caused by deleterious variants in each CFTR gene. We investigated the utility of whole-gene CFTR sequencing when fewer than two pathogenic or likely pathogenic (P/LP) variants were detected by conventional testing (sequencing of exons and flanking introns) of CFTR. METHODS: Individuals with features of CF and a CF-diagnostic sweat chloride concentration with zero or one P/LP variants identified by conventional testing enrolled in the CF Mutation Analysis Program (MAP) underwent whole-gene CFTR sequencing. Replication was performed on individuals enrolled in the CF Genome Project (CFGP), followed by phenotype review and interrogation of other genes. RESULTS: Whole-gene sequencing identified a second P/LP variant in 20/43 MAP enrollees (47 %) and 10/22 CFGP enrollees (45 %) who had one P/LP variant after conventional testing. No P/LP variants were detected when conventional testing was negative (MAP: n = 43; CFGP: n = 13). Genome-wide analysis was unable to find an alternative etiology in CFGP participants with fewer than two P/LP CFTR variants and CF could not be confirmed in 91 % following phenotype re-review. CONCLUSIONS: Whole-gene CFTR analysis is beneficial in individuals with one previously-identified P/LP variant and a CF-diagnostic sweat chloride. Negative conventional CFTR testing indicates that the phenotype should be re-evaluated.

2.
Am J Respir Crit Care Med ; 207(10): 1324-1333, 2023 05 15.
Article in English | MEDLINE | ID: mdl-36921087

ABSTRACT

Rationale: Lung disease is the major cause of morbidity and mortality in persons with cystic fibrosis (pwCF). Variability in CF lung disease has substantial non-CFTR (CF transmembrane conductance regulator) genetic influence. Identification of genetic modifiers has prognostic and therapeutic importance. Objectives: Identify genetic modifier loci and genes/pathways associated with pulmonary disease severity. Methods: Whole-genome sequencing data on 4,248 unique pwCF with pancreatic insufficiency and lung function measures were combined with imputed genotypes from an additional 3,592 patients with pancreatic insufficiency from the United States, Canada, and France. This report describes association of approximately 15.9 million SNPs using the quantitative Kulich normal residual mortality-adjusted (KNoRMA) lung disease phenotype in 7,840 pwCF using premodulator lung function data. Measurements and Main Results: Testing included common and rare SNPs, transcriptome-wide association, gene-level, and pathway analyses. Pathway analyses identified novel associations with genes that have key roles in organ development, and we hypothesize that these genes may relate to dysanapsis and/or variability in lung repair. Results confirmed and extended previous genome-wide association study findings. These whole-genome sequencing data provide finely mapped genetic information to support mechanistic studies. No novel primary associations with common single variants or rare variants were found. Multilocus effects at chr5p13 (SLC9A3/CEP72) and chr11p13 (EHF/APIP) were identified. Variant effect size estimates at associated loci were consistently ordered across the cohorts, indicating possible age or birth cohort effects. Conclusions: This premodulator genomic, transcriptomic, and pathway association study of 7,840 pwCF will facilitate mechanistic and postmodulator genetic studies and the development of novel therapeutics for CF lung disease.


Subject(s)
Cystic Fibrosis , Humans , Cystic Fibrosis/genetics , Genome-Wide Association Study/methods , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Patient Acuity , Lung , Microtubule-Associated Proteins/genetics
3.
Am J Hum Genet ; 109(12): 2163-2177, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36413997

ABSTRACT

Recommendations from the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) for interpreting sequence variants specify the use of computational predictors as "supporting" level of evidence for pathogenicity or benignity using criteria PP3 and BP4, respectively. However, score intervals defined by tool developers, and ACMG/AMP recommendations that require the consensus of multiple predictors, lack quantitative support. Previously, we described a probabilistic framework that quantified the strengths of evidence (supporting, moderate, strong, very strong) within ACMG/AMP recommendations. We have extended this framework to computational predictors and introduce a new standard that converts a tool's scores to PP3 and BP4 evidence strengths. Our approach is based on estimating the local positive predictive value and can calibrate any computational tool or other continuous-scale evidence on any variant type. We estimate thresholds (score intervals) corresponding to each strength of evidence for pathogenicity and benignity for thirteen missense variant interpretation tools, using carefully assembled independent data sets. Most tools achieved supporting evidence level for both pathogenic and benign classification using newly established thresholds. Multiple tools reached score thresholds justifying moderate and several reached strong evidence levels. One tool reached very strong evidence level for benign classification on some variants. Based on these findings, we provide recommendations for evidence-based revisions of the PP3 and BP4 ACMG/AMP criteria using individual tools and future assessment of computational methods for clinical interpretation.
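The abstract above names "local positive predictive value" as the basis of the calibration but does not give the estimation details. As a rough illustration only, the sketch below computes the fraction of pathogenic variants among those whose score falls within a fixed window around a chosen score. The function name `local_ppv`, the fixed-width window, and the toy data are assumptions made for this example, not the authors' procedure; in particular, the raw fraction reflects the class mix of the data set and would need adjusting to a realistic prior probability of pathogenicity before being read as evidence strength.

```python
def local_ppv(scores, labels, threshold, window):
    """Fraction of pathogenic variants (label 1) among variants whose
    score lies within +/- window of threshold.

    Hypothetical helper for illustration; returns None when no variant
    falls inside the window.
    """
    nearby = [y for s, y in zip(scores, labels) if abs(s - threshold) <= window]
    if not nearby:
        return None
    return sum(nearby) / len(nearby)


# Toy data (invented): predictor scores and labels (1 = pathogenic, 0 = benign)
toy_scores = [0.10, 0.20, 0.80, 0.85, 0.90]
toy_labels = [0, 0, 1, 0, 1]

# Local estimate around a score of 0.85, using a window of 0.06
estimate = local_ppv(toy_scores, toy_labels, threshold=0.85, window=0.06)
```

A score interval would then be assigned an evidence strength by comparing such local estimates against the posterior probabilities that the framework associates with each strength level.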


Subject(s)
Calibration , Humans , Consensus , Educational Status , Virulence
4.
Am J Hum Genet ; 109(10): 1894-1908, 2022 10 06.
Article in English | MEDLINE | ID: mdl-36206743

ABSTRACT

Individuals with cystic fibrosis (CF) develop complications of the gastrointestinal tract influenced by genetic variants outside of CFTR. Cystic fibrosis-related diabetes (CFRD) is a distinct form of diabetes with a variable age of onset that occurs frequently in individuals with CF, while meconium ileus (MI) is a severe neonatal intestinal obstruction affecting ∼20% of newborns with CF. CFRD and MI are slightly correlated traits with previous evidence of overlap in their genetic architectures. To better understand the genetic commonality between CFRD and MI, we used whole-genome-sequencing data from the CF Genome Project to perform genome-wide association. These analyses revealed variants at 11 loci (6 not previously identified) that associated with MI and at 12 loci (5 not previously identified) that associated with CFRD. Of these, variants at SLC26A9, CEBPB, and PRSS1 associated with both traits; variants at SLC26A9 and CEBPB increased risk for both traits, while variants at PRSS1, the higher-risk alleles for CFRD, conferred lower risk for MI. Furthermore, common and rare variants within the SLC26A9 locus associated with MI only or CFRD only. As expected, different loci modify risk of CFRD and MI; however, a subset exhibit pleiotropic effects indicating etiologic and mechanistic overlap between these two otherwise distinct complications of CF.


Subject(s)
Cystic Fibrosis , Diabetes Mellitus , Infant, Newborn, Diseases , Intestinal Obstruction , Cystic Fibrosis/complications , Cystic Fibrosis/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Diabetes Mellitus/genetics , Genome-Wide Association Study , Humans , Infant, Newborn , Intestinal Obstruction/complications , Intestinal Obstruction/genetics
5.
JAMA Netw Open ; 5(8): e2229158, 2022 08 01.
Article in English | MEDLINE | ID: mdl-36040739

ABSTRACT

Importance: Polygenic risk scores (PRS) for type 2 diabetes (T2D) can improve risk prediction for gestational diabetes (GD), yet the strength of the association between genetic and lifestyle risk factors has not been quantified. Objective: To assess the association of PRS and physical activity in existing GD risk models and identify patient subgroups who may receive the most benefits from a PRS or physical activity intervention. Design, Settings, and Participants: The Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be cohort was established to study individuals without previous pregnancy lasting at least 20 weeks (nulliparous) and to elucidate factors associated with adverse pregnancy outcomes. A subcohort of 3533 participants with European ancestry was used for risk assessment and performance evaluation. Participants were enrolled from October 5, 2010, to December 3, 2013, and underwent genotyping between February 19, 2019, and February 28, 2020. Data were analyzed from September 15, 2020, to November 10, 2021. Exposures: Self-reported total physical activity in early pregnancy was quantified as metabolic equivalents of task (METs). Polygenic risk scores were calculated for T2D using contributions of 84 single nucleotide variants, weighted by their association in the Diabetes Genetics Replication and Meta-analysis Consortium data. Main Outcomes and Measures: Estimation of the development of GD from clinical, genetic, and environmental variables collected in early pregnancy, assessed using measures of model discrimination. Odds ratios and positive likelihood ratios were used to evaluate the association of PRS and physical activity with GD risk. Results: A total of 3533 women were included in this analysis (mean [SD] age, 28.6 [4.9] years). In high-risk population subgroups (body mass index ≥25 or aged ≥35 years), individuals with high PRS (top 25th percentile) or low activity levels (METs <450) had increased odds of a GD diagnosis of 25% to 75%. Compared with the general population, participants with both high PRS and low activity levels had higher odds of a GD diagnosis (odds ratio, 3.4 [95% CI, 2.3-5.3]), whereas participants with low PRS and high METs had significantly reduced risk of a GD diagnosis (odds ratio, 0.5 [95% CI, 0.3-0.9]; P = .01). Conclusions and Relevance: In this cohort study, the addition of PRS was associated with the stratified risk of GD diagnosis among high-risk patient subgroups, suggesting the benefits of targeted PRS ascertainment to encourage early intervention.
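The abstract describes the polygenic risk score as contributions of 84 variants weighted by their association in consortium data. A minimal sketch of that kind of weighted sum is shown below, assuming each variant is coded as a risk-allele dosage (0, 1, or 2 copies) and each weight is an effect size such as a log odds ratio. The function name, the dosage coding, and the three-variant toy values are illustrative assumptions; the study's actual variants and weights are not reproduced here.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages, one weight per variant.

    Hypothetical helper: dosages are copies of the risk allele (0, 1, 2),
    weights are per-variant effect sizes (for example, log odds ratios).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Toy example with three invented variants (the study used 84)
example_weights = [0.10, 0.25, 0.05]
example_dosages = [2, 0, 1]
example_score = polygenic_risk_score(example_dosages, example_weights)
```

Scores computed this way for a whole cohort can then be ranked, so that a "high PRS" group such as the top quartile can be compared with the rest.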


Subject(s)
Diabetes Mellitus, Type 2 , Diabetes, Gestational , Adult , Cohort Studies , Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/genetics , Diabetes, Gestational/epidemiology , Diabetes, Gestational/genetics , Exercise , Female , Genetic Predisposition to Disease , Humans , Pregnancy
6.
Hum Genet ; 141(10): 1595-1613, 2022 Oct.
Article in English | MEDLINE | ID: mdl-34549350

ABSTRACT

Whole-exome and whole-genome sequencing studies in autism spectrum disorder (ASD) have identified hundreds of thousands of exonic variants. Only a handful of them, primarily loss-of-function variants, have been shown to increase the risk for ASD, while the contributory roles of other variants, including most missense variants, remain unknown. New approaches that combine tissue-specific molecular profiles with patients' genetic data can thus play an important role in elucidating the functional impact of exonic variation and improve understanding of ASD pathogenesis. Here, we integrate spatio-temporal gene co-expression networks from the developing human brain and protein-protein interaction networks to first reach accurate prioritization of ASD risk genes based on their connectivity patterns with previously known high-confidence ASD risk genes. We subsequently integrate these gene scores with variant pathogenicity predictions to further prioritize individual exonic variants based on the positive-unlabeled learning framework with gene- and variant-score calibration. We demonstrate that this approach discriminates among variants between cases and controls at the high end of the prediction range. Finally, we experimentally validate our top-scoring de novo mutation NP_001243143.1:p.Phe309Ser in the sodium/potassium-transporting ATPase ATP1A3 to disrupt protein binding with different partners.


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Adenosine Triphosphatases/genetics , Adenosine Triphosphatases/metabolism , Autism Spectrum Disorder/genetics , Autistic Disorder/genetics , Genetic Predisposition to Disease , Humans , Mutation , Potassium/metabolism , Sodium/metabolism , Sodium-Potassium-Exchanging ATPase/genetics
7.
Nat Commun ; 11(1): 5918, 2020 11 20.
Article in English | MEDLINE | ID: mdl-33219223

ABSTRACT

Identifying pathogenic variants and underlying functional alterations is challenging. To this end, we introduce MutPred2, a tool that improves the prioritization of pathogenic amino acid substitutions over existing methods, generates molecular mechanisms potentially causative of disease, and returns interpretable pathogenicity score distributions on individual genomes. Whilst its prioritization performance is state-of-the-art, a distinguishing feature of MutPred2 is the probabilistic modeling of variant impact on specific aspects of protein structure and function that can serve to guide experimental studies of phenotype-altering variants. We demonstrate the utility of MutPred2 in the identification of the structural and functional mutational signatures relevant to Mendelian disorders and the prioritization of de novo mutations associated with complex neurodevelopmental disorders. We then experimentally validate the functional impact of several variants identified in patients with such disorders. We argue that mechanism-driven studies of human inherited disease have the potential to significantly accelerate the discovery of clinically actionable variants.


Subject(s)
Amino Acid Substitution/genetics , Computational Biology/methods , Genetic Predisposition to Disease , Software , Genome, Human , Humans , Models, Statistical , Mutation , Phenotype , Proteins/genetics
8.
JCO Clin Cancer Inform ; 4: 310-317, 2020 03.
Article in English | MEDLINE | ID: mdl-32228266

ABSTRACT

PURPOSE: The modern researcher is confronted with hundreds of published methods to interpret genetic variants. There are databases of genes and variants, phenotype-genotype relationships, algorithms that score and rank genes, and in silico variant effect prediction tools. Because variant prioritization is a multifactorial problem, a welcome development in the field has been the emergence of decision support frameworks, which make it easier to integrate multiple resources in an interactive environment. Current decision support frameworks are typically limited by closed proprietary architectures, access to a restricted set of tools, lack of customizability, Web dependencies that expose protected data, or limited scalability. METHODS: We present the Open Custom Ranked Analysis of Variants Toolkit (OpenCRAVAT), a new open-source, scalable decision support system for variant and gene prioritization. We have designed the resource catalog to be open and modular to maximize community and developer involvement, and as a result, the catalog is being actively developed and growing every month. Resources made available via the store are well suited for analysis of cancer, as well as Mendelian and complex diseases. RESULTS: OpenCRAVAT offers both a command-line utility and a dynamic graphical user interface, allowing users to install with a single command, easily download tools from an extensive resource catalog, create customized pipelines, and explore results in a richly detailed viewing environment. We present several case studies to illustrate the design of custom workflows to prioritize genes and variants. CONCLUSION: OpenCRAVAT is distinguished from similar tools by its capabilities to access and integrate an unprecedented amount of diverse data resources and computational prediction methods, which span germline, somatic, common, rare, coding, and noncoding variants.


Subject(s)
Computational Biology/organization & administration , Databases, Genetic/standards , Mutation , Neoplasm Proteins/genetics , Neoplasms/genetics , Software/standards , Humans , Neoplasms/diagnosis , Neoplasms/drug therapy , User-Computer Interface , Workflow
9.
Cancer Immunol Res ; 8(3): 396-408, 2020 03.
Article in English | MEDLINE | ID: mdl-31871119

ABSTRACT

Computational prediction of binding between neoantigen peptides and major histocompatibility complex (MHC) proteins can be used to predict patient response to cancer immunotherapy. Current neoantigen predictors focus on in silico estimation of MHC binding affinity and are limited by low predictive value for actual peptide presentation, inadequate support for rare MHC alleles, and poor scalability to high-throughput data sets. To address these limitations, we developed MHCnuggets, a deep neural network method that predicts peptide-MHC binding. MHCnuggets can predict binding for common or rare alleles of MHC class I or II with a single neural network architecture. Using a long short-term memory network (LSTM), MHCnuggets accepts peptides of variable length and is faster than other methods. When compared with methods that integrate binding affinity and MHC-bound peptide (HLAp) data from mass spectrometry, MHCnuggets yields a 4-fold increase in positive predictive value on independent HLAp data. We applied MHCnuggets to 26 cancer types in The Cancer Genome Atlas, processing 26.3 million allele-peptide comparisons in under 2.3 hours, yielding 101,326 unique predicted immunogenic missense mutations (IMM). Predicted IMM hotspots occurred in 38 genes, including 24 driver genes. Predicted IMM load was significantly associated with increased immune cell infiltration (P < 2 × 10⁻¹⁶), including CD8+ T cells. Only 0.16% of predicted IMMs were observed in more than 2 patients, with 61.7% of these derived from driver mutations. Thus, we describe a method for neoantigen prediction and its performance characteristics and demonstrate its utility in data sets representing multiple human cancers.


Subject(s)
Antigens, Neoplasm/immunology , Cancer Vaccines/immunology , Histocompatibility Antigens Class II/immunology , Histocompatibility Antigens Class I/immunology , Neoplasms/immunology , Neural Networks, Computer , Algorithms , Antigens, Neoplasm/genetics , Antigens, Neoplasm/metabolism , Artificial Intelligence , CD8-Positive T-Lymphocytes/immunology , Cancer Vaccines/therapeutic use , Computational Biology/methods , Data Mining , Histocompatibility Antigens Class I/genetics , Histocompatibility Antigens Class I/metabolism , Histocompatibility Antigens Class II/genetics , Histocompatibility Antigens Class II/metabolism , Humans , Mutation, Missense , Neoplasms/drug therapy , Neoplasms/metabolism , Neoplasms/pathology , Predictive Value of Tests , Protein Binding , Software
10.
Hum Mutat ; 40(9): 1546-1556, 2019 09.
Article in English | MEDLINE | ID: mdl-31294896

ABSTRACT

Testing for variation in BRCA1 and BRCA2 (commonly referred to as BRCA1/2), has emerged as a standard clinical practice and is helping countless women better understand and manage their heritable risk of breast and ovarian cancer. Yet the increased rate of BRCA1/2 testing has led to an increasing number of Variants of Uncertain Significance (VUS), and the rate of VUS discovery currently outpaces the rate of clinical variant interpretation. Computational prediction is a key component of the variant interpretation pipeline. In the CAGI5 ENIGMA Challenge, six prediction teams submitted predictions on 326 newly-interpreted variants from the ENIGMA Consortium. By evaluating these predictions against the new interpretations, we have gained a number of insights on the state of the art of variant prediction and specific steps to further advance this state of the art.


Subject(s)
BRCA1 Protein/genetics , BRCA2 Protein/genetics , Breast Neoplasms/diagnosis , Computational Biology/methods , Ovarian Neoplasms/diagnosis , Breast Neoplasms/genetics , Early Detection of Cancer , Female , Genetic Predisposition to Disease , Genetic Testing , Genetic Variation , Humans , Models, Genetic , Ovarian Neoplasms/genetics
11.
PLoS Comput Biol ; 15(6): e1007112, 2019 06.
Article in English | MEDLINE | ID: mdl-31199787

ABSTRACT

Differentiation between phenotypically neutral and disease-causing genetic variation remains an open and relevant problem. Among different types of variation, non-frameshifting insertions and deletions (indels) represent an understudied group with widespread phenotypic consequences. To address this challenge, we present a machine learning method, MutPred-Indel, that predicts pathogenicity and identifies types of functional residues impacted by non-frameshifting insertion/deletion variation. The model shows good predictive performance as well as the ability to identify impacted structural and functional residues including secondary structure, intrinsic disorder, metal and macromolecular binding, post-translational modifications, allosteric sites, and catalytic residues. We identify structural and functional mechanisms impacted preferentially by germline variation from the Human Gene Mutation Database, recurrent somatic variation from COSMIC in the context of different cancers, as well as de novo variants from families with autism spectrum disorder. Further, the distributions of pathogenicity prediction scores generated by MutPred-Indel are shown to differentiate highly recurrent from non-recurrent somatic variation. Collectively, we present a framework to facilitate the interrogation of both pathogenicity and the functional effects of non-frameshifting insertion/deletion variants. The MutPred-Indel webserver is available at http://mutpred.mutdb.org/.


Subject(s)
Genetic Predisposition to Disease/genetics , Genome, Human , INDEL Mutation , Autism Spectrum Disorder/genetics , Autism Spectrum Disorder/physiopathology , Computational Biology , Databases, Genetic , Genome, Human/genetics , Genome, Human/physiology , Humans , INDEL Mutation/genetics , INDEL Mutation/physiology , Machine Learning , ROC Curve
12.
Hum Mutat ; 40(9): 1314-1320, 2019 09.
Article in English | MEDLINE | ID: mdl-31140652

ABSTRACT

Genetics play a key role in venous thromboembolism (VTE) risk; however, established risk factors in European populations do not translate to individuals of African descent because of the differences in allele frequencies between populations. As part of the fifth iteration of the Critical Assessment of Genome Interpretation, participants were asked to predict VTE status in exome data from African American subjects. Participants were provided with 103 unlabeled exomes from patients treated with warfarin for non-VTE causes or VTE and asked to predict which disease each subject had been treated for. Given the lack of training data, many participants opted to use unsupervised machine learning methods, clustering the exomes by variation in genes known to be associated with VTE. The best performing method using only VTE-related genes achieved an area under the ROC curve of 0.65. Here, we discuss the range of methods used in the prediction of VTE from sequence data and explore some of the difficulties of conducting a challenge with known confounders. In addition, we show that an existing genetic risk score for VTE that was developed in European subjects works well in African Americans.
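The abstract reports performance as an area under the ROC curve. For readers unfamiliar with the metric, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the toy labels and scores are assumptions for illustration and are unrelated to the challenge data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for positive cases, 0 for negative cases.
    scores: predicted scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (invented): two positives, two negatives
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which places the reported 0.65 closer to chance than to perfect discrimination.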


Subject(s)
Exome Sequencing/methods , Venous Thromboembolism/genetics , Warfarin/administration & dosage , Cluster Analysis , Computational Biology/methods , Congresses as Topic , Female , Genetic Predisposition to Disease , Humans , Male , ROC Curve , Unsupervised Machine Learning , Venous Thromboembolism/drug therapy , Warfarin/therapeutic use
13.
Hum Mutat ; 40(9): 1330-1345, 2019 09.
Article in English | MEDLINE | ID: mdl-31144778

ABSTRACT

The Critical Assessment of Genome Interpretation-5 intellectual disability challenge asked participants to use computational methods to predict patient clinical phenotypes and the causal variant(s) based on an analysis of their gene panel sequence data. Sequence data for 74 genes associated with intellectual disability (ID) and/or autism spectrum disorders (ASD) from a cohort of 150 patients with a range of neurodevelopmental manifestations (i.e. ID, autism, epilepsy, microcephaly, macrocephaly, hypotonia, ataxia) have been made available for this challenge. For each patient, predictors had to report the causative variants and which of the seven phenotypes were present. Since neurodevelopmental disorders are characterized by strong comorbidity, tested individuals often present more than one pathological condition. Considering the overall clinical manifestation of each patient, the correct phenotype has been predicted by at least one group for 93 individuals (62%). ID and ASD were the best predicted among the seven phenotypic traits. Also, causative or potentially pathogenic variants were predicted correctly by at least one group. However, the prediction of the correct causative variant seems to be insufficient to predict the correct phenotype. In some cases, the correct prediction has been supported by rare or common variants in genes different from the causative one.


Subject(s)
Autism Spectrum Disorder/genetics , Computational Biology/methods , Intellectual Disability/genetics , Sequence Analysis, DNA/methods , Female , Genetic Predisposition to Disease , Humans , Male , Phenotype , Quantitative Trait Loci
14.
PLoS One ; 13(12): e0208901, 2018.
Article in English | MEDLINE | ID: mdl-30566479

ABSTRACT

Recent genetic studies and whole-genome sequencing projects have greatly improved our understanding of human variation and clinically actionable genetic information. Smaller ethnic populations, however, remain underrepresented in both individual and large-scale sequencing efforts and hence present an opportunity to discover new variants of biomedical and demographic significance. This report describes the sequencing and analysis of a genome obtained from an individual of Serbian origin, introducing tens of thousands of previously unknown variants to the currently available pool. Ancestry analysis places this individual in close proximity to Central and Eastern European populations; i.e., closest to Croatian, Bulgarian and Hungarian individuals and, in terms of other Europeans, furthest from Ashkenazi Jewish, Spanish, Sicilian and Baltic individuals. Our analysis confirmed gene flow between Neanderthal and ancestral pan-European populations, with similar contributions to the Serbian genome as those observed in other European groups. Finally, to assess the burden of potentially disease-causing/clinically relevant variation in the sequenced genome, we utilized manually curated genotype-phenotype association databases and variant-effect predictors. We identified several variants that have previously been associated with severe early-onset disease that is not evident in the proband, as well as putatively impactful variants that could yet prove to be clinically relevant to the proband over the next decades. The presence of numerous private and low-frequency variants, along with the observed and predicted disease-causing mutations in this genome, exemplify some of the global challenges of genome interpretation, especially in the context of under-studied ethnic groups.


Subject(s)
Ethnicity/genetics , Genetic Predisposition to Disease , Genetic Variation , Genome, Human , Animals , Female , Genome-Wide Association Study , Humans , Male , Neanderthals/genetics , Serbia/ethnology
15.
Bioinformatics ; 33(14): i389-i398, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28882004

ABSTRACT

MOTIVATION: Loss-of-function genetic variants are frequently associated with severe clinical phenotypes, yet many are present in the genomes of healthy individuals. The available methods to assess the impact of these variants rely primarily upon evolutionary conservation with little to no consideration of the structural and functional implications for the protein. They further do not provide information to the user regarding specific molecular alterations potentially causative of disease. RESULTS: To address this, we investigate protein features underlying loss-of-function genetic variation and develop a machine learning method, MutPred-LOF, for the discrimination of pathogenic and tolerated variants that can also generate hypotheses on specific molecular events disrupted by the variant. We investigate a large set of human variants derived from the Human Gene Mutation Database, ClinVar and the Exome Aggregation Consortium. Our prediction method shows an area under the Receiver Operating Characteristic curve of 0.85 for all loss-of-function variants and 0.75 for proteins in which both pathogenic and neutral variants have been observed. We applied MutPred-LOF to a set of 1142 de novo variants from neurodevelopmental disorders and found enrichment of pathogenic variants in affected individuals. Overall, our results highlight the potential of computational tools to elucidate causal mechanisms underlying loss of protein function in loss-of-function variants. AVAILABILITY AND IMPLEMENTATION: http://mutpred.mutdb.org. CONTACT: predrag@indiana.edu.


Subject(s)
Loss of Function Mutation , Machine Learning , Proteins/genetics , Sequence Analysis, Protein/methods , Software , Computational Biology/methods , Humans , Protein Conformation , Proteins/metabolism , Proteins/physiology
16.
Hum Mutat ; 38(9): 1182-1192, 2017 09.
Article in English | MEDLINE | ID: mdl-28634997

ABSTRACT

Precision medicine aims to predict a patient's disease risk and best therapeutic options by using that individual's genetic sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. For CAGI 4, three challenges involved using exome-sequencing data: Crohn's disease, bipolar disorder, and warfarin dosing. Previous CAGI challenges included prior versions of the Crohn's disease challenge. Here, we discuss the range of techniques used for phenotype prediction as well as the methods used for assessing predictive models. Additionally, we outline some of the difficulties associated with making predictions and evaluating them. The lessons learned from the exome challenges can be applied to both research and clinical efforts to improve phenotype prediction from genotype. In addition, these challenges serve as a vehicle for sharing clinical and research exome data in a secure manner with scientists who have a broad range of expertise, contributing to a collaborative effort to advance our understanding of genotype-phenotype relationships.


Subject(s)
Bipolar Disorder/genetics , Crohn Disease/genetics , Exome Sequencing/methods , Precision Medicine/methods , Warfarin/therapeutic use , Computational Biology/methods , Databases, Genetic , Genetic Predisposition to Disease , Humans , Information Dissemination , Pharmacogenomic Variants , Phenotype , Warfarin/pharmacology
17.
PLoS Comput Biol ; 12(8): e1005091, 2016 08.
Article in English | MEDLINE | ID: mdl-27564311

ABSTRACT

Elucidating the precise molecular events altered by disease-causing genetic variants represents a major challenge in translational bioinformatics. To this end, many studies have investigated the structural and functional impact of amino acid substitutions. Most of these studies were however limited in scope to either individual molecular functions or were concerned with functional effects (e.g. deleterious vs. neutral) without specifically considering possible molecular alterations. The recent growth of structural, molecular and genetic data presents an opportunity for more comprehensive studies to consider the structural environment of a residue of interest, to hypothesize specific molecular effects of sequence variants and to statistically associate these effects with genetic disease. In this study, we analyzed data sets of disease-causing and putatively neutral human variants mapped to protein 3D structures as part of a systematic study of the loss and gain of various types of functional attribute potentially underlying pathogenic molecular alterations. We first propose a formal model to assess probabilistically function-impacting variants. We then develop an array of structure-based functional residue predictors, evaluate their performance, and use them to quantify the impact of disease-causing amino acid substitutions on catalytic activity, metal binding, macromolecular binding, ligand binding, allosteric regulation and post-translational modifications. We show that our methodology generates actionable biological hypotheses for up to 41% of disease-causing genetic variants mapped to protein structures suggesting that it can be reliably used to guide experimental validation. Our results suggest that a significant fraction of disease-causing human variants mapping to protein structures are function-altering both in the presence and absence of stability disruption.


Subject(s)
Amino Acid Sequence/genetics , Disease/genetics , Models, Statistical , Mutation/genetics , Amino Acid Substitution/genetics , Computational Biology , Computer Simulation , Humans , Models, Molecular , Protein Binding