Results 1 - 20 of 139
1.
Am J Hum Genet ; 109(2): 195-209, 2022 02 03.
Article in English | MEDLINE | ID: mdl-35032432

ABSTRACT

Whole-genome sequencing resolves many clinical cases where standard diagnostic methods have failed. However, at least half of these cases remain unresolved after whole-genome sequencing. Structural variants (SVs; genomic variants larger than 50 base pairs) of uncertain significance are the genetic cause of a portion of these unresolved cases. As sequencing methods using long or linked reads become more accessible and SV detection algorithms improve, clinicians and researchers are gaining access to thousands of reliable SVs of unknown disease relevance. Methods to predict the pathogenicity of these SVs are required to realize the full diagnostic potential of long-read sequencing. To address this emerging need, we developed StrVCTVRE to distinguish pathogenic SVs from benign SVs that overlap exons. In a random forest classifier, we integrated features that capture gene importance, coding region, conservation, expression, and exon structure. We found that features such as expression and conservation are important but are absent from SV classification guidelines. We leveraged multiple resources to construct a size-matched training set of rare, putatively benign and pathogenic SVs. StrVCTVRE performs accurately across a wide SV size range on independent test sets, which will allow clinicians and researchers to eliminate about half of SVs from consideration while retaining a 90% sensitivity. We anticipate clinicians and researchers will use StrVCTVRE to prioritize SVs in probands where no SV is immediately compelling, empowering deeper investigation into novel SVs to resolve cases and understand new mechanisms of disease. StrVCTVRE runs rapidly and is publicly available.
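
As an illustration of the operating point mentioned in this abstract (discarding part of the candidate list while keeping about 90% sensitivity), the following is a minimal Python sketch. It is not StrVCTVRE's code: the function name `threshold_at_sensitivity`, the convention that higher scores mean "more likely pathogenic", and the toy scores are all assumptions made for the example.

```python
import math


def threshold_at_sensitivity(scores, labels, target_sensitivity=0.90):
    """Pick a score cutoff that keeps at least `target_sensitivity`
    of the pathogenic variants.

    scores: predictor outputs, higher = more likely pathogenic (assumed).
    labels: 1 = pathogenic, 0 = benign.
    Returns (threshold, achieved_sensitivity, fraction_of_benign_removed);
    variants scoring >= threshold are kept for review.
    """
    pathogenic = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    benign = [s for s, y in zip(scores, labels) if y == 0]
    if not pathogenic or not benign:
        raise ValueError("need at least one pathogenic and one benign variant")

    # Number of top-scoring pathogenic variants that must be kept.
    # The small tolerance guards against floating-point round-up.
    needed = max(1, math.ceil(target_sensitivity * len(pathogenic) - 1e-9))
    threshold = pathogenic[needed - 1]

    sensitivity = sum(s >= threshold for s in pathogenic) / len(pathogenic)
    removed = sum(s < threshold for s in benign) / len(benign)
    return threshold, sensitivity, removed


# Toy data (hypothetical scores, not taken from the paper).
toy_pathogenic = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.1]
toy_benign = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.58, 0.7, 0.9]
toy_scores = toy_pathogenic + toy_benign
toy_labels = [1] * len(toy_pathogenic) + [0] * len(toy_benign)

cutoff, sens, benign_removed = threshold_at_sensitivity(toy_scores, toy_labels)
print(cutoff, sens, benign_removed)
```

On real data the cutoff would be chosen on a held-out set rather than on the variants being triaged.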


Subject(s)
Algorithms; Genome, Human; Genomic Structural Variation; Software; Supervised Machine Learning; Datasets as Topic; Exons; Genomics/methods; Humans; ROC Curve; Whole Genome Sequencing/statistics & numerical data
2.
Am J Hum Genet ; 109(12): 2163-2177, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36413997

ABSTRACT

Recommendations from the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) for interpreting sequence variants specify the use of computational predictors as "supporting" level of evidence for pathogenicity or benignity using criteria PP3 and BP4, respectively. However, score intervals defined by tool developers, and ACMG/AMP recommendations that require the consensus of multiple predictors, lack quantitative support. Previously, we described a probabilistic framework that quantified the strengths of evidence (supporting, moderate, strong, very strong) within ACMG/AMP recommendations. We have extended this framework to computational predictors and introduce a new standard that converts a tool's scores to PP3 and BP4 evidence strengths. Our approach is based on estimating the local positive predictive value and can calibrate any computational tool or other continuous-scale evidence on any variant type. We estimate thresholds (score intervals) corresponding to each strength of evidence for pathogenicity and benignity for thirteen missense variant interpretation tools, using carefully assembled independent data sets. Most tools achieved supporting evidence level for both pathogenic and benign classification using newly established thresholds. Multiple tools reached score thresholds justifying moderate and several reached strong evidence levels. One tool reached very strong evidence level for benign classification on some variants. Based on these findings, we provide recommendations for evidence-based revisions of the PP3 and BP4 ACMG/AMP criteria using individual tools and future assessment of computational methods for clinical interpretation.
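
The notion of a "local" positive predictive value can be pictured with a very simplified Python sketch: among labeled variants whose score falls in a small window around a given score, what fraction is pathogenic? This does not reproduce the calibration procedure of the paper; the function name `local_ppv`, the fixed window half-width, the minimum-count rule and the toy data are assumptions for illustration only.

```python
def local_ppv(score, scores, labels, half_width=0.05, min_count=5):
    """Fraction of pathogenic variants (label 1) among labeled variants
    whose score lies within `half_width` of `score`.

    Returns None when the window holds fewer than `min_count` variants,
    because an estimate from so few points would be unreliable.
    """
    window = [y for s, y in zip(scores, labels) if abs(s - score) <= half_width]
    if len(window) < min_count:
        return None
    return sum(window) / len(window)


# Hypothetical labeled scores from some predictor (not real data).
cal_scores = [0.10, 0.12, 0.14, 0.80, 0.82, 0.84, 0.86, 0.88]
cal_labels = [0, 0, 1, 1, 1, 1, 0, 1]

print(local_ppv(0.84, cal_scores, cal_labels, half_width=0.05, min_count=3))
print(local_ppv(0.50, cal_scores, cal_labels, half_width=0.05, min_count=3))
```

Score intervals for an evidence strength would then be the score ranges where such a local estimate stays above a chosen cutoff; the cutoffs themselves are defined in the paper and are not reproduced here.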


Subject(s)
Calibration; Humans; Consensus; Educational Status; Virulence
3.
Am J Hum Genet ; 108(4): 535-548, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33798442

ABSTRACT

Genome sequencing is enabling precision medicine: tailoring treatment to the unique constellation of variants in an individual's genome. The impact of recurrent pathogenic variants is often understood; however, there is a long tail of rare genetic variants that are uncharacterized. The problem of uncharacterized rare variation is especially acute when it occurs in genes of known clinical importance with functionally consequential variants and associated mechanisms. Variants of uncertain significance (VUSs) in these genes are discovered at a rate that outpaces current ability to classify them with databases of previous cases, experimental evaluation, and computational predictors. Clinicians are thus left without guidance about the significance of variants that may have actionable consequences. Computational prediction of the impact of rare genetic variation is increasingly becoming an important capability. In this paper, we review the technical and ethical challenges of interpreting the function of rare variants in two settings: inborn errors of metabolism in newborns and pharmacogenomics. We propose a framework for a genomic learning healthcare system with an initial focus on early-onset treatable disease in newborns and actionable pharmacogenomics. We argue that (1) a genomic learning healthcare system must allow for continuous collection and assessment of rare variants, (2) emerging machine learning methods will enable algorithms to predict the clinical impact of rare variants on protein function, and (3) ethical considerations must inform the construction and deployment of all rare-variation triage strategies, particularly with respect to health disparities arising from unbalanced ancestry representation.


Subject(s)
Genetic Variation/genetics; Genetics, Medical; Genomics; Machine Learning; Metabolism, Inborn Errors/genetics; Pharmacogenetics; Precision Medicine; Genome, Human/genetics; Humans; Infant, Newborn
4.
Nucleic Acids Res ; 50(D1): D553-D559, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850923

ABSTRACT

The Structural Classification of Proteins-extended (SCOPe, https://scop.berkeley.edu) knowledgebase aims to provide an accurate, detailed, and comprehensive description of the structural and evolutionary relationships amongst the majority of proteins of known structure, along with resources for analyzing the protein structures and their sequences. Structures from the PDB are divided into domains and classified using a combination of manual curation and highly precise automated methods. In the current release of SCOPe, 2.08, we have developed search and display tools for analysis of genetic variants we mapped to structures classified in SCOPe. In order to improve the utility of SCOPe to automated methods such as deep learning classifiers that rely on multiple alignment of sequences of homologous proteins, we have introduced new machine-parseable annotations that indicate aberrant structures as well as domains that are distinguished by a smaller repeat unit. We also classified structures from 74 of the largest Pfam families not previously classified in SCOPe, and we improved our algorithm to remove N- and C-terminal cloning, expression and purification sequences from SCOPe domains. SCOPe 2.08-stable classifies 106 976 PDB entries (about 60% of PDB entries).


Subject(s)
Computational Biology; Databases, Protein; Proteins/classification; Algorithms; Databases, Chemical; Gene Expression Regulation/genetics; Machine Learning; Proteins/genetics
5.
New Phytol ; 239(1): 222-239, 2023 07.
Article in English | MEDLINE | ID: mdl-36631975

ABSTRACT

To infect plants, pathogenic fungi secrete small proteins called effectors. Here, we describe the catalytic activity and potential virulence function of the Nudix hydrolase effector AvrM14 from the flax rust fungus (Melampsora lini). We completed extensive in vitro assays to characterise the enzymatic activity of the AvrM14 effector. Additionally, we used in planta transient expression of wild-type and catalytically dead AvrM14 versions followed by biochemical assays, phenotypic analysis and RNA sequencing to unravel how the catalytic activity of AvrM14 impacts plant immunity. AvrM14 is an extremely selective enzyme capable of removing the protective 5' cap from mRNA transcripts in vitro. Homodimerisation of AvrM14 promoted biologically relevant mRNA cap cleavage in vitro and this activity was conserved in related effectors from other Melampsora spp. In planta expression of wild-type AvrM14, but not the catalytically dead version, suppressed immune-related reactive oxygen species production, altered the abundance of some circadian-rhythm-associated mRNA transcripts and reduced the hypersensitive cell-death response triggered by the flax disease resistance protein M1. To date, the decapping of host mRNA as a virulence strategy has not been described beyond viruses. Our results indicate that some fungal pathogens produce Nudix hydrolase effectors with in vitro mRNA-decapping activity capable of interfering with plant immunity.


Subject(s)
Basidiomycota; RNA, Messenger/genetics; RNA, Messenger/metabolism; Basidiomycota/genetics; Fungi/genetics; Pyrophosphatases/metabolism; Virulence/genetics; Plant Diseases/microbiology; Nudix Hydrolases
6.
Am J Med Genet C Semin Med Genet ; 190(2): 222-230, 2022 06.
Article in English | MEDLINE | ID: mdl-35838066

ABSTRACT

In the US, newborn screening (NBS) is a unique health program that supports health equity and screens virtually every baby after birth, and has brought timely treatments to babies since the 1960s. With the decreasing cost of sequencing and the improving methods to interpret genetic data, there is an opportunity to add DNA sequencing as a screening method to facilitate the identification of babies with treatable conditions that cannot be identified in any other scalable way, including highly penetrant genetic neurodevelopmental disorders (NDD). However, the lack of effective dietary or drug-based treatments has made it nearly impossible to consider NDDs in the current NBS framework, yet it is anticipated that any treatment will be maximally effective if started early. Hence there is a critical need for large-scale pilot studies to assess if and how NDDs can be effectively screened at birth, if parents desire that information, and what impact early diagnosis may have. Here we attempt to provide an overview of the recent advances in NDD treatments, explore the possible framework of setting up a pilot study to genetically screen for NDDs, highlight key technical, practical, and ethical considerations and challenges, and examine the policy and health system implications.


Subject(s)
Neonatal Screening; Neurodevelopmental Disorders; Infant; Infant, Newborn; Humans; Neonatal Screening/methods; Pilot Projects; Neurodevelopmental Disorders/diagnosis; Neurodevelopmental Disorders/genetics; Parents
7.
Nucleic Acids Res ; 47(D1): D475-D481, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30500919

ABSTRACT

The SCOPe (Structural Classification of Proteins-extended, https://scop.berkeley.edu) database hierarchically classifies domains from the majority of proteins of known structure according to their structural and evolutionary relationships. SCOPe also incorporates and updates the ASTRAL compendium, which provides multiple databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. Protein structures are classified using a combination of manual curation and highly precise automated methods. In the current release of SCOPe, 2.07, we have focused our manual curation efforts on larger protein structures, including the spliceosome, proteasome and RNA polymerase I, as well as many other Pfam families that had not previously been classified. Domains from these large protein complexes are distinctive in several ways: novel non-globular folds are more common, and domains from previously observed protein families often have N- or C-terminal extensions that were disordered or not present in previous structures. The current monthly release update, SCOPe 2.07-2018-10-18, classifies 90 992 PDB entries (about two thirds of PDB entries).


Subject(s)
Databases, Protein; Protein Domains; Multiprotein Complexes/chemistry; Proteasome Endopeptidase Complex/chemistry; Spliceosomes/chemistry
8.
Nature ; 512(7515): 453-6, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164757

ABSTRACT

Despite the large evolutionary distances between metazoan species, they can show remarkable commonalities in their biology, and this has helped to establish fly and worm as model organisms for human biology. Although studies of individual elements and factors have explored similarities in gene regulation, a large-scale comparative analysis of basic principles of transcriptional regulatory features is lacking. Here we map the genome-wide binding locations of 165 human, 93 worm and 52 fly transcription regulatory factors, generating a total of 1,019 data sets from diverse cell types, developmental stages, or conditions in the three species, of which 498 (48.9%) are presented here for the first time. We find that structural properties of regulatory networks are remarkably conserved and that orthologous regulatory factor families recognize similar binding motifs in vivo and show some similar co-associations. Our results suggest that gene-regulatory properties previously observed for individual factors are general principles of metazoan regulation that are remarkably well-preserved despite extensive functional divergence of individual network connections. The comparative maps of regulatory circuitry provided here will drive an improved understanding of the regulatory underpinnings of model organism biology and how these relate to human biology, development and disease.


Subject(s)
Caenorhabditis elegans/genetics; Drosophila melanogaster/genetics; Evolution, Molecular; Gene Expression Regulation/genetics; Gene Regulatory Networks/genetics; Transcription Factors/metabolism; Animals; Binding Sites; Caenorhabditis elegans/growth & development; Chromatin Immunoprecipitation; Conserved Sequence/genetics; Drosophila melanogaster/growth & development; Gene Expression Regulation, Developmental/genetics; Genome/genetics; Humans; Molecular Sequence Annotation; Nucleotide Motifs/genetics; Organ Specificity/genetics; Transcription Factors/genetics
9.
Hum Mutat ; 40(9): 1197-1201, 2019 09.
Article in English | MEDLINE | ID: mdl-31334884

ABSTRACT

Interpretation of genomic variation plays an essential role in the analysis of cancer and monogenic disease, and increasingly also in complex trait disease, with applications ranging from basic research to clinical decisions. Many computational impact prediction methods have been developed, yet the field lacks a clear consensus on their appropriate use and interpretation. The Critical Assessment of Genome Interpretation (CAGI, /'ka-je/) is a community experiment to objectively assess computational methods for predicting the phenotypic impacts of genomic variation. CAGI participants are provided genetic variants and make blind predictions of resulting phenotype. Independent assessors evaluate the predictions by comparing with experimental and clinical data. CAGI has completed five editions with the goals of establishing the state of the art in genome interpretation and of encouraging new methodological developments. This special issue (https://onlinelibrary.wiley.com/toc/10981004/2019/40/9) comprises reports from CAGI, focusing on the fifth edition that culminated in a conference that took place 5 to 7 July 2018. CAGI5 comprised 14 challenges and engaged hundreds of participants from a dozen countries. This edition had a notable increase in splicing and expression regulatory variant challenges, while also continuing challenges on clinical genomics, as well as complex disease datasets and missense variants in diseases ranging from cancer to Pompe disease to schizophrenia. Full information about CAGI is at https://genomeinterpretation.org.


Subject(s)
Computational Biology/methods; Genome, Human; Algorithms; Congresses as Topic; Data Interpretation, Statistical; Genomics; Humans; Precision Medicine
10.
Hum Mutat ; 40(9): 1202-1214, 2019 09.
Article in English | MEDLINE | ID: mdl-31283070

ABSTRACT

Genome sequencing identifies a vast number of genetic variants. Predicting these variants' molecular and clinical effects is one of the preeminent challenges in human genetics. Accurate prediction of the impact of genetic variants improves our understanding of how genetic information is conveyed to molecular and cellular functions, and is an essential step towards precision medicine. Over one hundred tools/resources have been developed specifically for this purpose. We summarize these tools, as well as their characteristics, in the genetic Variant Impact Predictor Database (VIPdb). This database will help researchers and clinicians explore appropriate tools, and inform the development of improved methods. VIPdb can be browsed and downloaded at https://genomeinterpretation.org/vipdb.


Subject(s)
Databases, Genetic; Genetic Variation; Proteins/chemistry; Proteins/genetics; Computational Biology; Genetic Predisposition to Disease; Genome, Human; Humans; Phenotype; Precision Medicine; Protein Structure, Secondary; User-Computer Interface
11.
Hum Mutat ; 40(9): 1330-1345, 2019 09.
Article in English | MEDLINE | ID: mdl-31144778

ABSTRACT

The Critical Assessment of Genome Interpretation-5 intellectual disability challenge asked participants to use computational methods to predict patient clinical phenotypes and the causal variant(s) based on an analysis of their gene panel sequence data. Sequence data for 74 genes associated with intellectual disability (ID) and/or autism spectrum disorders (ASD) from a cohort of 150 patients with a range of neurodevelopmental manifestations (i.e., ID, autism, epilepsy, microcephaly, macrocephaly, hypotonia, ataxia) have been made available for this challenge. For each patient, predictors had to report the causative variants and which of the seven phenotypes were present. Since neurodevelopmental disorders are characterized by strong comorbidity, tested individuals often present more than one pathological condition. Considering the overall clinical manifestation of each patient, the correct phenotype has been predicted by at least one group for 93 individuals (62%). ID and ASD were the best predicted among the seven phenotypic traits. Also, causative or potentially pathogenic variants were predicted correctly by at least one group. However, the prediction of the correct causative variant seems to be insufficient to predict the correct phenotype. In some cases, the correct prediction has been supported by rare or common variants in genes different from the causative one.


Subject(s)
Autism Spectrum Disorder/genetics; Computational Biology/methods; Intellectual Disability/genetics; Sequence Analysis, DNA/methods; Female; Genetic Predisposition to Disease; Humans; Male; Phenotype; Quantitative Trait Loci
12.
Hum Mutat ; 40(9): 1474-1485, 2019 09.
Article in English | MEDLINE | ID: mdl-31260570

ABSTRACT

The CAGI-5 pericentriolar material 1 (PCM1) challenge aimed to predict the effect of 38 transgenic human missense mutations in the PCM1 protein implicated in schizophrenia. Participants were provided with 16 benign variants (negative controls), 10 hypomorphic, and 12 loss of function variants. Six groups participated and were asked to predict the probability of effect and standard deviation associated with each mutation. Here, we present the challenge assessment. Prediction performance was evaluated using different measures to conclude in a final ranking which highlights the strengths and weaknesses of each group. The results show a great variety of predictions where some methods performed significantly better than others. Benign variants played an important role as negative controls, highlighting predictors biased to identify disease phenotypes. The best predictor, Bromberg lab, used a neural-network-based method able to discriminate between neutral and non-neutral single nucleotide polymorphisms. The CAGI-5 PCM1 challenge allowed us to evaluate the state-of-the-art techniques for interpreting the effect of novel variants for a difficult target protein.


Subject(s)
Autoantigens/genetics; Cell Cycle Proteins/genetics; Computational Biology/methods; Mutation, Missense; Schizophrenia/genetics; Databases, Genetic; Genetic Predisposition to Disease; Humans; Neural Networks, Computer; Phenotype; Polymorphism, Single Nucleotide
13.
Hum Mutat ; 40(9): 1314-1320, 2019 09.
Article in English | MEDLINE | ID: mdl-31140652

ABSTRACT

Genetics plays a key role in venous thromboembolism (VTE) risk; however, established risk factors in European populations do not translate to individuals of African descent because of the differences in allele frequencies between populations. As part of the fifth iteration of the Critical Assessment of Genome Interpretation, participants were asked to predict VTE status in exome data from African American subjects. Participants were provided with 103 unlabeled exomes from patients treated with warfarin for non-VTE causes or VTE and asked to predict which disease each subject had been treated for. Given the lack of training data, many participants opted to use unsupervised machine learning methods, clustering the exomes by variation in genes known to be associated with VTE. The best performing method using only VTE-related genes achieved an area under the ROC curve of 0.65. Here, we discuss the range of methods used in the prediction of VTE from sequence data and explore some of the difficulties of conducting a challenge with known confounders. In addition, we show that an existing genetic risk score for VTE that was developed in European subjects works well in African Americans.
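
For readers unfamiliar with the metric quoted in this abstract, the area under the ROC curve can be computed as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The Python sketch below uses that pairwise definition; the function name `roc_auc` and the toy data are assumptions, and this is not the assessors' evaluation code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk, higher = more likely a case (assumed).
    labels: 1 = case, 0 = control.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Hypothetical predictions: 0.5 means chance level, 1.0 perfect ranking.
print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
```

The double loop is quadratic in the number of subjects, which is fine for a cohort of about a hundred exomes; a rank-based formula would be preferable for large data sets.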


Subject(s)
Exome Sequencing/methods; Venous Thromboembolism/genetics; Warfarin/administration & dosage; Cluster Analysis; Computational Biology/methods; Congresses as Topic; Female; Genetic Predisposition to Disease; Humans; Male; ROC Curve; Unsupervised Machine Learning; Venous Thromboembolism/drug therapy; Warfarin/therapeutic use
14.
Hum Mutat ; 40(9): 1392-1399, 2019 09.
Article in English | MEDLINE | ID: mdl-31209948

ABSTRACT

Frataxin (FXN) is a highly conserved protein found in prokaryotes and eukaryotes that is required for efficient regulation of cellular iron homeostasis. Experimental evidence associates amino acid substitutions in FXN with Friedreich ataxia, a neurodegenerative disorder. Recently, new thermodynamic experiments have been performed to study the impact of somatic variations identified in cancer tissues on protein stability. The Critical Assessment of Genome Interpretation (CAGI) data provider at the University of Rome measured the unfolding free energy of a set of variants (FXN challenge data set) with far-UV circular dichroism and intrinsic fluorescence spectra. These values have been used to calculate the change in unfolding free energy between the variant and wild-type proteins at zero concentration of denaturant (ΔΔGH2O). The FXN challenge data set, composed of eight amino acid substitutions, was used to evaluate the performance of the current computational methods for predicting the ΔΔGH2O value associated with the variants and to classify them as destabilizing and not destabilizing. For the fifth edition of CAGI, six independent research groups from Asia, Australia, Europe, and North America submitted 12 sets of predictions from different approaches. In this paper, we report the results of our assessment and discuss the limitations of the tested algorithms.


Subject(s)
Amino Acid Substitution; Iron-Binding Proteins/chemistry; Iron-Binding Proteins/genetics; Algorithms; Circular Dichroism; Humans; Models, Molecular; Protein Conformation; Protein Folding; Protein Stability; Frataxin
15.
Hum Mutat ; 40(9): 1373-1391, 2019 09.
Article in English | MEDLINE | ID: mdl-31322791

ABSTRACT

Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes. CAGI4 and CAGI5 participants were provided with WGS data and clinical descriptions of 25 and 24 undiagnosed patients from the SickKids Genome Clinic Project, respectively. Predictors were asked to identify primary and secondary causal variants. In addition, for CAGI5, groups had to match each genome to one of three disorder categories (neurologic, ophthalmologic, and connective), and separately to each patient. The performance of matching genomes to categories was no better than random but two groups performed significantly better than chance in matching genomes to patients. Two of the ten variants proposed by two groups in CAGI4 were deemed to be diagnostic, and several proposed pathogenic variants in CAGI5 are good candidates for phenotype expansion. We discuss implications for improving in silico assessment of genomic variants and identifying new disease genes.


Subject(s)
Computational Biology/methods; Genetic Variation; Undiagnosed Diseases/diagnosis; Adolescent; Child; Child, Preschool; Computer Simulation; Databases, Genetic; Female; Genetic Predisposition to Disease; Humans; Male; Phenotype; Undiagnosed Diseases/genetics; Whole Genome Sequencing
16.
Hum Mutat ; 40(9): 1530-1545, 2019 09.
Article in English | MEDLINE | ID: mdl-31301157

ABSTRACT

Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine. Computational predictions can lead to a better understanding of the mechanisms underlying genetic diseases, including cancer, but their adoption requires thorough and unbiased assessment. Cystathionine-beta-synthase (CBS) is an enzyme that catalyzes the first step of the transsulfuration pathway, from homocysteine to cystathionine, and in which variations are associated with human hyperhomocysteinemia and homocystinuria. We have created a computational challenge under the CAGI framework to evaluate how well different methods can predict the phenotypic effect(s) of CBS single amino acid substitutions using a blinded experimental data set. CAGI participants were asked to predict yeast growth based on the identity of the mutations. The performance of the methods was evaluated using several metrics. The CBS challenge highlighted the difficulty of predicting the phenotype of an ex vivo system in a model organism when classification models were trained on human disease data. We also discuss the variations in difficulty of prediction for known benign and deleterious variants, as well as identify methodological and experimental constraints with lessons to be learned for future challenges.


Subject(s)
Amino Acid Substitution; Computational Biology/methods; Cystathionine beta-Synthase/genetics; Cystathionine/metabolism; Cystathionine beta-Synthase/metabolism; Homocysteine/metabolism; Humans; Phenotype; Precision Medicine
17.
Hum Mutat ; 40(9): 1519-1529, 2019 09.
Article in English | MEDLINE | ID: mdl-31342580

ABSTRACT

The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016 invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α-N-acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS type IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson's correlation coefficients of up to 0.61. We also observed that top methods were significantly more correlated with each other than they were with observed enzymatic activity values, which we believe speaks to the importance of sequence conservation across the different methods. Improved functional predictions on the VUS will help population-scale analysis of disease epidemiology and rare variant association analysis.
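
The Pearson correlation coefficient used as the headline metric in this abstract can be computed from paired predicted and measured values as in the following Python sketch. The function name `pearson_r` and the example numbers are assumptions for illustration; they are not challenge data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical predicted vs. measured relative enzyme activity.
predicted = [1, 2, 3, 4, 5]
measured = [2, 1, 4, 3, 5]
print(pearson_r(predicted, measured))
```

The same function applied to two methods' predictions, instead of predictions and measurements, gives the method-to-method correlation that the abstract compares against.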


Subject(s)
Acetylglucosaminidase/metabolism; Computational Biology/methods; Mutation, Missense; Acetylglucosaminidase/genetics; Humans; Models, Genetic; Regression Analysis
18.
Hum Mutat ; 40(9): 1612-1622, 2019 09.
Article in English | MEDLINE | ID: mdl-31241222

ABSTRACT

The availability of disease-specific genomic data is critical for developing new computational methods that predict the pathogenicity of human variants and advance the field of precision medicine. However, the lack of gold standards to properly train and benchmark such methods is one of the greatest challenges in the field. In response to this challenge, the scientific community is invited to participate in the Critical Assessment of Genome Interpretation (CAGI), where unpublished disease variants are available for classification by in silico methods. As part of the CAGI-5 challenge, we evaluated the performance of 18 submissions and three additional methods in predicting the pathogenicity of single nucleotide variants (SNVs) in checkpoint kinase 2 (CHEK2) for cases of breast cancer in Hispanic females. As part of the assessment, the efficacy of the analysis method and the setup of the challenge were also considered. The results indicated that though the challenge could benefit from additional participant data, the combined generalized linear model analysis and odds of pathogenicity analysis provided a framework to evaluate the methods submitted for SNV pathogenicity identification and for comparison to other available methods. The outcome of this challenge and the approaches used can help guide further advancements in identifying SNV-disease relationships.


Subject(s)
Breast Neoplasms/genetics; Checkpoint Kinase 2/genetics; Computational Biology/methods; Hispanic or Latino/genetics; Polymorphism, Single Nucleotide; Adult; Aged; Breast Neoplasms/ethnology; Case-Control Studies; Computer Simulation; Female; Genetic Predisposition to Disease; Humans; Linear Models; Middle Aged; United States/ethnology; Exome Sequencing
19.
N Engl J Med ; 375(22): 2165-2176, 2016 12 01.
Article in English | MEDLINE | ID: mdl-27959755

ABSTRACT

BACKGROUND: Severe combined immunodeficiency (SCID) is characterized by arrested T-lymphocyte production and by B-lymphocyte dysfunction, which result in life-threatening infections. Early diagnosis of SCID through population-based screening of newborns can aid clinical management and help improve outcomes; it also permits the identification of previously unknown factors that are essential for lymphocyte development in humans. METHODS: SCID was detected in a newborn before the onset of infections by means of screening of T-cell-receptor excision circles, a biomarker for thymic output. On confirmation of the condition, the affected infant was treated with allogeneic hematopoietic stem-cell transplantation. Exome sequencing in the patient and parents was followed by functional analysis of a prioritized candidate gene with the use of human hematopoietic stem cells and zebrafish embryos. RESULTS: The infant had "leaky" SCID (i.e., a form of SCID in which a minimal degree of immune function is preserved), as well as craniofacial and dermal abnormalities and the absence of a corpus callosum; his immune deficit was fully corrected by hematopoietic stem-cell transplantation. Exome sequencing revealed a heterozygous de novo missense mutation, p.N441K, in BCL11B. The resulting BCL11B protein had dominant negative activity, which abrogated the ability of wild-type BCL11B to bind DNA, thereby arresting development of the T-cell lineage and disrupting hematopoietic stem-cell migration; this revealed a previously unknown function of BCL11B. The patient's abnormalities, when recapitulated in bcl11ba-deficient zebrafish, were reversed by ectopic expression of functionally intact human BCL11B but not mutant human BCL11B. CONCLUSIONS: Newborn screening facilitated the identification and treatment of a previously unknown cause of human SCID. Coupling exome sequencing with an evaluation of candidate genes in human hematopoietic stem cells and in zebrafish revealed that a constitutional BCL11B mutation caused human multisystem anomalies with SCID and also revealed a prethymic role for BCL11B in hematopoietic progenitors. (Funded by the National Institutes of Health and others.)


Subject(s)
Abnormalities, Multiple/genetics; Hematopoietic Stem Cells/physiology; Mutation, Missense; Repressor Proteins/genetics; Severe Combined Immunodeficiency/genetics; Tumor Suppressor Proteins/genetics; Animals; Brain/diagnostic imaging; Cell Movement; Disease Models, Animal; Gene Expression Regulation; Hematopoietic Stem Cell Transplantation; Hematopoietic Stem Cells/metabolism; Humans; In Vitro Techniques; Infant, Newborn; Magnetic Resonance Imaging; Male; Neonatal Screening/methods; Receptors, Antigen, T-Cell; Repressor Proteins/deficiency; Repressor Proteins/metabolism; Tumor Suppressor Proteins/deficiency; Tumor Suppressor Proteins/metabolism; Zebrafish/growth & development
20.
PLoS Comput Biol ; 14(11): e1006494, 2018 11.
Article in English | MEDLINE | ID: mdl-30408027

ABSTRACT

Research in computational biology has given rise to a vast number of methods developed to solve scientific problems. For areas in which many approaches exist, researchers have a hard time deciding which tool to select to address a scientific challenge, as essentially all publications introducing a new method will claim better performance than all others. Not all of these claims can be correct. Equally, for this same reason, developers struggle to demonstrate convincingly that they created a new and superior algorithm or implementation. Moreover, the developer community often has difficulty discerning which new approaches constitute true scientific advances for the field. The obvious answer to this conundrum is to develop benchmarks (meaning standard points of reference that facilitate evaluating the performance of different tools), allowing both users and developers to compare multiple tools in an unbiased fashion.


Subject(s)
Computational Biology/methods; Algorithms; Area Under Curve; Publications