Search | VHL Search Portal

1.

MultifacetedProtDB: a database of human proteins with multiple functions.

Bertolini, Elisa; Babbi, Giulia; Savojardo, Castrense; Martelli, Pier Luigi; Casadio, Rita.

Nucleic Acids Res ; 52(D1): D494-D501, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37791887

ABSTRACT

MultifacetedProtDB is a database of multifunctional human proteins that derives its information from other databases, including UniProt, GeneCards, Human Protein Atlas (HPA), Human Phenotype Ontology (HPO) and MONDO. Under the label 'multifaceted' it collects multitasking proteins described in the literature as pleiotropic, multidomain, promiscuous (referring to enzymes that catalyse reactions on multiple substrates) or moonlighting (with two or more molecular functions), which are difficult to retrieve with a direct search in existing non-specific databases. The study of multifunctional proteins is an expanding research area aiming to elucidate the complexities of biological processes, particularly in humans, where multifunctional proteins play roles in various processes, including signal transduction, metabolism, gene regulation and cellular communication, and are often involved in disease onset and progression. The web server allows searching by gene, protein and any associated structural and functional information, such as available structures from PDB, structural models and interactors, using multiple filters. Protein entries are supplemented with comprehensive annotations including EC number, GO terms (biological processes, molecular functions, and cellular components), pathways from Reactome, subcellular localization from UniProt, tissue and cell type expression from HPA, and associated diseases following MONDO, Orphanet and OMIM classification. MultifacetedProtDB is freely available as a web server at: https://multifacetedprotdb.biocomp.unibo.it/.

Subjects

Databases, Protein; Proteins; Humans; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; Databases as Topic

2.

Critical assessment of variant prioritization methods for rare disease diagnosis within the rare genomes project.

Stenton, Sarah L; O'Leary, Melanie C; Lemire, Gabrielle; VanNoy, Grace E; DiTroia, Stephanie; Ganesh, Vijay S; Groopman, Emily; O'Heir, Emily; Mangilog, Brian; Osei-Owusu, Ikeoluwa; Pais, Lynn S; Serrano, Jillian; Singer-Berk, Moriel; Weisburd, Ben; Wilson, Michael W; Austin-Tse, Christina; Abdelhakim, Marwa; Althagafi, Azza; Babbi, Giulia; Bellazzi, Riccardo; Bovo, Samuele; Carta, Maria Giulia; Casadio, Rita; Coenen, Pieter-Jan; De Paoli, Federica; Floris, Matteo; Gajapathy, Manavalan; Hoehndorf, Robert; Jacobsen, Julius O B; Joseph, Thomas; Kamandula, Akash; Katsonis, Panagiotis; Kint, Cyrielle; Lichtarge, Olivier; Limongelli, Ivan; Lu, Yulan; Magni, Paolo; Mamidi, Tarun Karthik Kumar; Martelli, Pier Luigi; Mulargia, Marta; Nicora, Giovanna; Nykamp, Keith; Pejaver, Vikas; Peng, Yisu; Pham, Thi Hong Cam; Podda, Maurizio S; Rao, Aditya; Rizzo, Ettore; Saipradeep, Vangala G; Savojardo, Castrense.

Hum Genomics ; 18(1): 44, 2024 Apr 29.

Article in English | MEDLINE | ID: mdl-38685113

ABSTRACT

BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Knowing which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. 
Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance were highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria are needed.
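
The maximum F-measure mentioned in the METHODS can be illustrated with a minimal Python sketch. It assumes that every variant scoring at or above an EPCR threshold counts as a predicted causal variant and that the F-measure is the harmonic mean of precision and recall. The function name `max_f_measure`, the variant identifiers and the scores are hypothetical, and the challenge's actual scoring procedure may differ in detail.

```python
def max_f_measure(predictions, causal):
    """Return the best F-measure over all score thresholds.

    predictions: list of (variant_id, epcr) pairs (hypothetical format).
    causal: set of variant ids known to be causal.
    """
    best = 0.0
    # Try every distinct EPCR value as a decision threshold.
    for threshold in sorted({score for _, score in predictions}):
        called = {vid for vid, score in predictions if score >= threshold}
        true_pos = len(called & causal)
        if true_pos == 0:
            continue  # precision and recall are both zero here
        precision = true_pos / len(called)
        recall = true_pos / len(causal)
        f_measure = 2 * precision * recall / (precision + recall)
        best = max(best, f_measure)
    return best


# Hypothetical submission: four variants with EPCR values, two of them causal.
example_predictions = [("v1", 0.9), ("v2", 0.8), ("v3", 0.3), ("v4", 0.1)]
example_causal = {"v1", "v3"}
print(max_f_measure(example_predictions, example_causal))
```

Scanning only the distinct EPCR values is sufficient in this sketch, because the set of called variants changes only at those values.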

Subjects

Rare Diseases; Humans; Rare Diseases/genetics; Rare Diseases/diagnosis; Genome, Human/genetics; Genetic Variation/genetics; Computational Biology/methods; Phenotype

3.

Huntingtin: A Protein with a Peculiar Solvent Accessible Surface.

Babbi, Giulia; Savojardo, Castrense; Martelli, Pier Luigi; Casadio, Rita.

Int J Mol Sci ; 22(6), 2021 Mar 12.

Article in English | MEDLINE | ID: mdl-33809039

ABSTRACT

Taking advantage of the latest cryogenic electron microscopy structure of human huntingtin, we explored its physicochemical properties with computational methods, focusing on the solvent accessible surface of the protein and highlighting a mix of hydrophobic and hydrophilic patterns in which the latter prevail. We then evaluated the probability that exposed residues are in contact with other proteins, discovering that they tend to cluster in specific regions of the protein. We also found that the remaining portions of the protein surface can contain calcium-binding sites, which we propose here as putative mediators of the protein's interaction with membranes. Our findings are discussed in relation to the present knowledge of huntingtin functional annotation.

Subjects

Calcium/metabolism; Computational Biology; Huntingtin Protein/chemistry; Proteins/genetics; Binding Sites/genetics; Humans; Huntingtin Protein/genetics; Huntingtin Protein/ultrastructure; Hydrophobic and Hydrophilic Interactions; Models, Molecular; Protein Binding/genetics; Solvents/chemistry; Surface Properties

4.

A Glance into MTHFR Deficiency at a Molecular Level.

Savojardo, Castrense; Babbi, Giulia; Baldazzi, Davide; Martelli, Pier Luigi; Casadio, Rita.

Int J Mol Sci ; 23(1), 2021 Dec 23.

Article in English | MEDLINE | ID: mdl-35008593

ABSTRACT

MTHFR deficiency still deserves an investigation to associate the phenotype to protein structure variations. To this aim, considering the MTHFR wild type protein structure, with a catalytic and a regulatory domain and taking advantage of state-of-the-art computational tools, we explore the properties of 72 missense variations known to be disease associated. By computing the thermodynamic ΔΔG change according to a consensus method that we recently introduced, we find that 61% of the disease-related variations destabilize the protein, are present both in the catalytic and regulatory domain and correspond to known biochemical deficiencies. The propensity of solvent accessible residues to be involved in protein-protein interaction sites indicates that most of the interacting residues are located in the regulatory domain, and that only three of them, located at the interface of the functional protein homodimer, are both disease-related and destabilizing. Finally, we compute the protein architecture with Hidden Markov Models, one from Pfam for the catalytic domain and the second computed in house for the regulatory domain. We show that patterns of disease-associated, physicochemical variation types, both in the catalytic and regulatory domains, are unique for the MTHFR deficiency when mapped into the protein architecture.

Subjects

Homocystinuria/genetics; Methylenetetrahydrofolate Reductase (NADPH2)/deficiency; Muscle Spasticity/genetics; Catalytic Domain/genetics; Humans; Methylenetetrahydrofolate Reductase (NADPH2)/genetics; Protein Interaction Maps/genetics; Psychotic Disorders/genetics

5.

Are machine learning based methods suited to address complex biological problems? Lessons from CAGI-5 challenges.

Savojardo, Castrense; Babbi, Giulia; Bovo, Samuele; Capriotti, Emidio; Martelli, Pier Luigi; Casadio, Rita.

Hum Mutat ; 40(9): 1455-1462, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31066146

ABSTRACT

In silico approaches are routinely adopted to predict the effects of genetic variants and their relation to diseases. The critical assessment of genome interpretation (CAGI) has established a common framework for the assessment of available predictors of variant effects on specific problems and our group has been an active participant of CAGI since its first edition. In this paper, we summarize our experience and lessons learned from the last edition of the experiment (CAGI-5). In particular, we analyze prediction performances of our tools on five CAGI-5 selected challenges grouped into three different categories: prediction of variant effects on protein stability, prediction of variant pathogenicity, and prediction of complex functional effects. For each challenge, we analyze in detail the performance of our tools, highlighting their potentialities and drawbacks. The aim is to better define the application boundaries of each tool.

Subjects

Computational Biology/methods; Genetic Variation; Proteins/chemistry; Proteins/genetics; Algorithms; Computer Simulation; Databases, Genetic; Genetic Predisposition to Disease; Humans; Machine Learning; Phenotype; Protein Stability

6.

Assessing predictions on fitness effects of missense variants in calmodulin.

Zhang, Jing; Kinch, Lisa N; Cong, Qian; Katsonis, Panagiotis; Lichtarge, Olivier; Savojardo, Castrense; Babbi, Giulia; Martelli, Pier Luigi; Capriotti, Emidio; Casadio, Rita; Garg, Aditi; Pal, Debnath; Weile, Jochen; Sun, Song; Verby, Marta; Roth, Frederick P; Grishin, Nick V.

Hum Mutat ; 40(9): 1463-1473, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31283071

ABSTRACT

This paper reports the evaluation of predictions for the "CALM1" challenge in the fifth round of the Critical Assessment of Genome Interpretation held in 2018. In the challenge, the participants were asked to predict effects on yeast growth caused by missense variants of human calmodulin, a highly conserved protein in eukaryotic cells sensing calcium concentration. The performance of predictors implementing different algorithms and methods is similar. Most predictors are able to identify the deleterious or tolerated variants with modest accuracy, with a baseline predictor based purely on sequence conservation slightly outperforming the submitted predictions. Nevertheless, we think that the accuracy of predictions remains far from satisfactory, and the field awaits substantial improvements. The most poorly predicted variants in this round surround functional CALM1 sites that bind calcium or peptide, which suggests that better incorporation of structural analysis may help improve predictions.

Subjects

Calmodulin/chemistry; Calmodulin/genetics; Computational Biology/methods; Mutation, Missense; Yeasts/growth & development; Algorithms; Binding Sites; Calcium/metabolism; Calmodulin/metabolism; Evolution, Molecular; Fungal Proteins/chemistry; Fungal Proteins/genetics; Fungal Proteins/metabolism; Genetic Fitness; Humans; Models, Genetic; Models, Molecular; Protein Conformation; Protein Engineering; Yeasts/genetics

7.

Assessment of methods for predicting the effects of PTEN and TPMT protein variants.

Pejaver, Vikas; Babbi, Giulia; Casadio, Rita; Folkman, Lukas; Katsonis, Panagiotis; Kundu, Kunal; Lichtarge, Olivier; Martelli, Pier Luigi; Miller, Maximilian; Moult, John; Pal, Lipika R; Savojardo, Castrense; Yin, Yizhou; Zhou, Yaoqi; Radivojac, Predrag; Bromberg, Yana.

Hum Mutat ; 40(9): 1495-1506, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31184403

ABSTRACT

Thermodynamic stability is a fundamental property shared by all proteins. Changes in stability due to mutation are a widespread molecular mechanism in genetic diseases. Methods for the prediction of mutation-induced stability change have typically been developed and evaluated on incomplete and/or biased data sets. As part of the Critical Assessment of Genome Interpretation, we explored the utility of high-throughput variant stability profiling (VSP) assay data as an alternative for the assessment of computational methods and evaluated state-of-the-art predictors against over 7,000 nonsynonymous variants from two proteins. We found that predictions were modestly correlated with actual experimental values. Predictors fared better when evaluated as classifiers of extreme stability effects. Because different methods emerged as top performers depending on the metric, it is nontrivial to draw conclusions on their adoption or improvement. Our analyses revealed that only 16% of all variants in VSP assays could be confidently defined as stability-affecting. Furthermore, it is unclear to what extent VSP abundance scores were reasonable proxies for the stability-related quantities that participating methods were designed to predict. Overall, our observations underscore the need for clearly defined objectives when developing and using both computational and experimental methods in the context of measuring variant impact.

Subjects

Computational Biology/methods; Methyltransferases/chemistry; Mutation; PTEN Phosphohydrolase/chemistry; High-Throughput Nucleotide Sequencing; Humans; Methyltransferases/genetics; PTEN Phosphohydrolase/genetics; Protein Stability

8.

Performance of computational methods for the evaluation of pericentriolar material 1 missense variants in CAGI-5.

Monzon, Alexander Miguel; Carraro, Marco; Chiricosta, Luigi; Reggiani, Francesco; Han, James; Ozturk, Kivilcim; Wang, Yanran; Miller, Maximilian; Bromberg, Yana; Capriotti, Emidio; Savojardo, Castrense; Babbi, Giulia; Martelli, Pier L; Casadio, Rita; Katsonis, Panagiotis; Lichtarge, Olivier; Carter, Hannah; Kousi, Maria; Katsanis, Nicholas; Andreoletti, Gaia; Moult, John; Brenner, Steven E; Ferrari, Carlo; Leonardi, Emanuela; Tosatto, Silvio C E.

Hum Mutat ; 40(9): 1474-1485, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31260570

ABSTRACT

The CAGI-5 pericentriolar material 1 (PCM1) challenge aimed to predict the effect of 38 transgenic human missense mutations in the PCM1 protein implicated in schizophrenia. Participants were provided with 16 benign variants (negative controls), 10 hypomorphic, and 12 loss of function variants. Six groups participated and were asked to predict the probability of effect and standard deviation associated to each mutation. Here, we present the challenge assessment. Prediction performance was evaluated using different measures to conclude in a final ranking which highlights the strengths and weaknesses of each group. The results show a great variety of predictions where some methods performed significantly better than others. Benign variants played an important role as negative controls, highlighting predictors biased to identify disease phenotypes. The best predictor, Bromberg lab, used a neural-network-based method able to discriminate between neutral and non-neutral single nucleotide polymorphisms. The CAGI-5 PCM1 challenge allowed us to evaluate the state of the art techniques for interpreting the effect of novel variants for a difficult target protein.

Subjects

Autoantigens/genetics; Cell Cycle Proteins/genetics; Computational Biology/methods; Mutation, Missense; Schizophrenia/genetics; Databases, Genetic; Genetic Predisposition to Disease; Humans; Neural Networks, Computer; Phenotype; Polymorphism, Single Nucleotide

9.

Evaluating the predictions of the protein stability change upon single amino acid substitutions for the FXN CAGI5 challenge.

Savojardo, Castrense; Petrosino, Maria; Babbi, Giulia; Bovo, Samuele; Corbi-Verge, Carles; Casadio, Rita; Fariselli, Piero; Folkman, Lukas; Garg, Aditi; Karimi, Mostafa; Katsonis, Panagiotis; Kim, Philip M; Lichtarge, Olivier; Martelli, Pier Luigi; Pasquo, Alessandra; Pal, Debnath; Shen, Yang; Strokach, Alexey V; Turina, Paola; Zhou, Yaoqi; Andreoletti, Gaia; Brenner, Steven E; Chiaraluce, Roberta; Consalvi, Valerio; Capriotti, Emidio.

Hum Mutat ; 40(9): 1392-1399, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31209948

ABSTRACT

Frataxin (FXN) is a highly conserved protein found in prokaryotes and eukaryotes that is required for efficient regulation of cellular iron homeostasis. Experimental evidence associates amino acid substitutions in FXN with Friedreich ataxia, a neurodegenerative disorder. Recently, new thermodynamic experiments have been performed to study the impact of somatic variations identified in cancer tissues on protein stability. The Critical Assessment of Genome Interpretation (CAGI) data provider at the University of Rome measured the unfolding free energy of a set of variants (FXN challenge data set) with far-UV circular dichroism and intrinsic fluorescence spectra. These values have been used to calculate the change in unfolding free energy between the variant and wild-type proteins at zero concentration of denaturant (ΔΔGH2O). The FXN challenge data set, composed of eight amino acid substitutions, was used to evaluate the performance of the current computational methods for predicting the ΔΔGH2O value associated with the variants and for classifying them as destabilizing or not destabilizing. For the fifth edition of CAGI, six independent research groups from Asia, Australia, Europe, and North America submitted 12 sets of predictions from different approaches. In this paper, we report the results of our assessment and discuss the limitations of the tested algorithms.
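
As a rough illustration of the quantity discussed in this abstract, the following Python sketch computes the change in unfolding free energy between a variant and the wild type and assigns one of the two classes. The sign convention (variant minus wild type, so negative values mean lower stability), the -1.0 kcal/mol cutoff, the function names and the numerical values are all assumptions made for the example; the abstract does not state the convention or threshold used in the challenge.

```python
def ddg_h2o(dg_variant, dg_wild_type):
    """Change in unfolding free energy (kcal/mol) at zero denaturant.

    Assumed convention: variant minus wild type, so a negative
    value means the variant unfolds more easily.
    """
    return dg_variant - dg_wild_type


def classify(ddg, threshold=-1.0):
    """Label a variant using a hypothetical cutoff in kcal/mol."""
    return "destabilizing" if ddg <= threshold else "not destabilizing"


# Hypothetical measurements, not data from the FXN challenge.
wild_type_dg = 9.5
variant_dgs = {"variant_A": 7.0, "variant_B": 9.25}

for name, dg in variant_dgs.items():
    change = ddg_h2o(dg, wild_type_dg)
    print(name, change, classify(change))
```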

Subjects

Amino Acid Substitution; Iron-Binding Proteins/chemistry; Iron-Binding Proteins/genetics; Algorithms; Circular Dichroism; Humans; Models, Molecular; Protein Conformation; Protein Folding; Protein Stability; Frataxin

10.

CAGI SickKids challenges: Assessment of phenotype and variant predictions derived from clinical and genomic data of children with undiagnosed diseases.

Kasak, Laura; Hunter, Jesse M; Udani, Rupa; Bakolitsa, Constantina; Hu, Zhiqiang; Adhikari, Aashish N; Babbi, Giulia; Casadio, Rita; Gough, Julian; Guerrero, Rafael F; Jiang, Yuxiang; Joseph, Thomas; Katsonis, Panagiotis; Kotte, Sujatha; Kundu, Kunal; Lichtarge, Olivier; Martelli, Pier Luigi; Mooney, Sean D; Moult, John; Pal, Lipika R; Poitras, Jennifer; Radivojac, Predrag; Rao, Aditya; Sivadasan, Naveen; Sunderam, Uma; Saipradeep, V G; Yin, Yizhou; Zaucha, Jan; Brenner, Steven E; Meyn, M Stephen.

Hum Mutat ; 40(9): 1373-1391, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31322791

ABSTRACT

Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes. CAGI4 and CAGI5 participants were provided with WGS data and clinical descriptions of 25 and 24 undiagnosed patients from the SickKids Genome Clinic Project, respectively. Predictors were asked to identify primary and secondary causal variants. In addition, for CAGI5, groups had to match each genome to one of three disorder categories (neurologic, ophthalmologic, and connective), and separately to each patient. The performance of matching genomes to categories was no better than random but two groups performed significantly better than chance in matching genomes to patients. Two of the ten variants proposed by two groups in CAGI4 were deemed to be diagnostic, and several proposed pathogenic variants in CAGI5 are good candidates for phenotype expansion. We discuss implications for improving in silico assessment of genomic variants and identifying new disease genes.

Subjects

Computational Biology/methods; Genetic Variation; Undiagnosed Diseases/diagnosis; Adolescent; Child; Child, Preschool; Computer Simulation; Databases, Genetic; Female; Genetic Predisposition to Disease; Humans; Male; Phenotype; Undiagnosed Diseases/genetics; Whole Genome Sequencing

11.

Assessment of blind predictions of the clinical significance of BRCA1 and BRCA2 variants.

Cline, Melissa S; Babbi, Giulia; Bonache, Sandra; Cao, Yue; Casadio, Rita; de la Cruz, Xavier; Díez, Orland; Gutiérrez-Enríquez, Sara; Katsonis, Panagiotis; Lai, Carmen; Lichtarge, Olivier; Martelli, Pier L; Mishne, Gilad; Moles-Fernández, Alejandro; Montalban, Gemma; Mooney, Sean D; O'Conner, Robert; Ootes, Lars; Özkan, Selen; Padilla, Natalia; Pagel, Kymberleigh A; Pejaver, Vikas; Radivojac, Predrag; Riera, Casandra; Savojardo, Castrense; Shen, Yang; Sun, Yuanfei; Topper, Scott; Parsons, Michael T; Spurdle, Amanda B; Goldgar, David E.

Hum Mutat ; 40(9): 1546-1556, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31294896

ABSTRACT

Testing for variation in BRCA1 and BRCA2 (commonly referred to as BRCA1/2), has emerged as a standard clinical practice and is helping countless women better understand and manage their heritable risk of breast and ovarian cancer. Yet the increased rate of BRCA1/2 testing has led to an increasing number of Variants of Uncertain Significance (VUS), and the rate of VUS discovery currently outpaces the rate of clinical variant interpretation. Computational prediction is a key component of the variant interpretation pipeline. In the CAGI5 ENIGMA Challenge, six prediction teams submitted predictions on 326 newly-interpreted variants from the ENIGMA Consortium. By evaluating these predictions against the new interpretations, we have gained a number of insights on the state of the art of variant prediction and specific steps to further advance this state of the art.

Subjects

BRCA1 Protein/genetics; BRCA2 Protein/genetics; Breast Neoplasms/diagnosis; Computational Biology/methods; Ovarian Neoplasms/diagnosis; Breast Neoplasms/genetics; Early Detection of Cancer; Female; Genetic Predisposition to Disease; Genetic Testing; Genetic Variation; Humans; Models, Genetic; Ovarian Neoplasms/genetics

12.

Assessment of predicted enzymatic activity of α-N-acetylglucosaminidase variants of unknown significance for CAGI 2016.

Clark, Wyatt T; Kasak, Laura; Bakolitsa, Constantina; Hu, Zhiqiang; Andreoletti, Gaia; Babbi, Giulia; Bromberg, Yana; Casadio, Rita; Dunbrack, Roland; Folkman, Lukas; Ford, Colby T; Jones, David; Katsonis, Panagiotis; Kundu, Kunal; Lichtarge, Olivier; Martelli, Pier L; Mooney, Sean D; Nodzak, Conor; Pal, Lipika R; Radivojac, Predrag; Savojardo, Castrense; Shi, Xinghua; Zhou, Yaoqi; Uppal, Aneeta; Xu, Qifang; Yin, Yizhou; Pejaver, Vikas; Wang, Meng; Wei, Liping; Moult, John; Yu, Guoying Karen; Brenner, Steven E; LeBowitz, Jonathan H.

Hum Mutat ; 40(9): 1519-1529, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31342580

ABSTRACT

The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016, invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α-N-acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS type IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson's correlation coefficients of up to .61. We also observed that top methods were significantly more correlated with each other than they were with observed enzymatic activity values, which we believe speaks to the importance of sequence conservation across the different methods. Improved functional predictions on the VUS will help population-scale analysis of disease epidemiology and rare variant association analysis.
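
The Pearson correlation used to compare predicted and observed enzymatic activity can be sketched in plain Python as follows. The function name `pearson` and the activity values are hypothetical and serve only to show the calculation; they are not data from the NAGLU challenge.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


# Hypothetical relative activities (fraction of wild-type activity).
predicted = [0.10, 0.45, 0.80, 0.30, 0.95]
observed = [0.05, 0.60, 0.70, 0.20, 1.00]
print(round(pearson(predicted, observed), 3))
```

The same function can be applied to the predictions of two different methods, which is how an agreement between methods that exceeds their agreement with the experiment would show up.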

Subjects

Acetylglucosaminidase/metabolism; Computational Biology/methods; Mutation, Missense; Acetylglucosaminidase/genetics; Humans; Models, Genetic; Regression Analysis

13.

Assessing the performance of in silico methods for predicting the pathogenicity of variants in the gene CHEK2, among Hispanic females with breast cancer.

Voskanian, Alin; Katsonis, Panagiotis; Lichtarge, Olivier; Pejaver, Vikas; Radivojac, Predrag; Mooney, Sean D; Capriotti, Emidio; Bromberg, Yana; Wang, Yanran; Miller, Max; Martelli, Pier Luigi; Savojardo, Castrense; Babbi, Giulia; Casadio, Rita; Cao, Yue; Sun, Yuanfei; Shen, Yang; Garg, Aditi; Pal, Debnath; Yu, Yao; Huff, Chad D; Tavtigian, Sean V; Young, Erin; Neuhausen, Susan L; Ziv, Elad; Pal, Lipika R; Andreoletti, Gaia; Brenner, Steven E; Kann, Maricel G.

Hum Mutat ; 40(9): 1612-1622, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31241222

ABSTRACT

The availability of disease-specific genomic data is critical for developing new computational methods that predict the pathogenicity of human variants and advance the field of precision medicine. However, the lack of gold standards to properly train and benchmark such methods is one of the greatest challenges in the field. In response to this challenge, the scientific community is invited to participate in the Critical Assessment for Genome Interpretation (CAGI), where unpublished disease variants are available for classification by in silico methods. As part of the CAGI-5 challenge, we evaluated the performance of 18 submissions and three additional methods in predicting the pathogenicity of single nucleotide variants (SNVs) in checkpoint kinase 2 (CHEK2) for cases of breast cancer in Hispanic females. As part of the assessment, the efficacy of the analysis method and the setup of the challenge were also considered. The results indicated that though the challenge could benefit from additional participant data, the combined generalized linear model analysis and odds of pathogenicity analysis provided a framework to evaluate the methods submitted for SNV pathogenicity identification and for comparison to other available methods. The outcome of this challenge and the approaches used can help guide further advancements in identifying SNV-disease relationships.

Subjects

Breast Neoplasms/genetics; Checkpoint Kinase 2/genetics; Computational Biology/methods; Hispanic or Latino/genetics; Polymorphism, Single Nucleotide; Adult; Aged; Breast Neoplasms/ethnology; Case-Control Studies; Computer Simulation; Female; Genetic Predisposition to Disease; Humans; Linear Models; Middle Aged; United States/ethnology; Exome Sequencing

14.

PhenPath: a tool for characterizing biological functions underlying different phenotypes.

Babbi, Giulia; Martelli, Pier Luigi; Casadio, Rita.

BMC Genomics ; 20(Suppl 8): 548, 2019 Jul 16.

Article in English | MEDLINE | ID: mdl-31307376

ABSTRACT

BACKGROUND: Many diseases are associated with complex patterns of symptoms and phenotypic manifestations. Parsimonious explanations aim at reconciling the multiplicity of phenotypic traits with the perturbation of one or few biological functions. For this, it is necessary to characterize human phenotypes at the molecular and functional levels, by exploiting gene annotations and known relations among genes, diseases and phenotypes. This characterization makes it possible to implement tools for retrieving functions shared among phenotypes, co-occurring in the same patient and facilitating the formulation of hypotheses about the molecular causes of the disease. RESULTS: We introduce PhenPath, a new resource consisting of two parts: PhenPathDB and PhenPathTOOL. The former is a database collecting the human genes associated with the phenotypes described in Human Phenotype Ontology (HPO) and OMIM Clinical Synopses. Phenotypes are then associated with biological functions and pathways by means of NET-GE, a network-based method for functional enrichment of sets of genes. The present version considers only phenotypes related to diseases. PhenPathDB collects information for 18 OMIM Clinical synopses and 7137 HPO phenotypes, related to 4292 diseases and 3446 genes. Enrichment of Gene Ontology annotations endows some 87.7, 86.9 and 73.6% of HPO phenotypes with Biological Process, Molecular Function and Cellular Component terms, respectively. Furthermore, 58.8 and 77.8% of HPO phenotypes are also enriched for KEGG and Reactome pathways, respectively. Based on PhenPathDB, PhenPathTOOL analyzes user-defined sets of phenotypes retrieving diseases, genes and functional terms which they share. This information can provide clues for interpreting the co-occurrence of phenotypes in a patient. 
CONCLUSIONS: The resource allows finding molecular features useful to investigate diseases characterized by multiple phenotypes, and by this, it can help researchers and physicians in identifying molecular mechanisms and biological functions underlying the concomitant manifestation of phenotypes. The resource is freely available at http://phenpath.biocomp.unibo.it .

Subjects

Biological Ontologies; Computational Biology/methods; Databases, Genetic; Phenotype; Disease/genetics; Humans

15.

Functional and Structural Features of Disease-Related Protein Variants.

Savojardo, Castrense; Babbi, Giulia; Martelli, Pier Luigi; Casadio, Rita.

Int J Mol Sci ; 20(7), 2019 Mar 27.

Article in English | MEDLINE | ID: mdl-30934684

ABSTRACT

Modern sequencing technologies provide an unprecedented amount of data of single-nucleotide variations occurring in coding regions and leading to changes in the expressed protein sequences. A significant fraction of these single-residue variations is linked to disease onset and collected in public databases. In recent years, many scientific studies have been focusing on the dissection of salient features of disease-related variations from different perspectives. In this work, we complement previous analyses by updating a dataset of disease-related variations occurring in proteins with 3D structure. Within this dataset, we describe functional and structural features that can be of interest for characterizing disease-related variations, including major chemico-physical properties, the strength of association to disease of variation types, their effect on protein stability, their location on the protein structure, and their distribution in Pfam structural/functional protein models. Our results support previous findings obtained in different data sets and introduce Pfam models as possible fingerprints of patterns of disease related single-nucleotide variations.

Subjects

Disease/genetics; Mutant Proteins/chemistry; Mutant Proteins/metabolism; Mutation/genetics; Databases, Protein; Humans; Protein Domains; Solvents

16.

Mutant MYO1F alters the mitochondrial network and induces tumor proliferation in thyroid cancer.

Diquigiovanni, Chiara; Bergamini, Christian; Evangelisti, Cecilia; Isidori, Federica; Vettori, Andrea; Tiso, Natascia; Argenton, Francesco; Costanzini, Anna; Iommarini, Luisa; Anbunathan, Hima; Pagotto, Uberto; Repaci, Andrea; Babbi, Giulia; Casadio, Rita; Lenaz, Giorgio; Rhoden, Kerry J; Porcelli, Anna Maria; Fato, Romana; Bowcock, Anne; Seri, Marco; Romeo, Giovanni; Bonora, Elena.

Int J Cancer ; 143(7): 1706-1719, 2018 Oct 01.

Article in English | MEDLINE | ID: mdl-29672841

ABSTRACT

Familial aggregation is a significant risk factor for the development of thyroid cancer, and familial non-medullary thyroid cancer (FNMTC) accounts for 5-7% of all NMTC. Whole exome sequencing analysis in the family affected by FNMTC with oncocytic features, where our group previously identified a predisposing locus on chromosome 19p13.2, revealed a novel heterozygous mutation (c.400G>A, NM_012335; p.Gly134Ser) in exon 5 of MYO1F, mapping to the linkage locus. In the thyroid FRTL-5 cell model stably expressing the mutant MYO1F p.Gly134Ser protein, we observed an altered mitochondrial network, with increased mitochondrial mass and a significant increase in both intracellular and extracellular reactive oxygen species, compared to cells expressing the wild-type (wt) protein or carrying the empty vector. The mutation conferred a significant advantage in colony formation, invasion and anchorage-independent growth. These data were corroborated by in vivo studies in zebrafish, since we demonstrated that the mutant MYO1F p.Gly134Ser, when overexpressed, can induce proliferation in whole vertebrate embryos, compared to the wt one. MYO1F screening in an additional 192 FNMTC families identified another variant in exon 7, which leads to exon skipping and is predicted to alter the ATP-binding domain in MYO1F. Our study identified for the first time a role for MYO1F in NMTC.

Subjects

Cell Proliferation; Embryo, Nonmammalian/pathology; Mitochondria/pathology; Mutation; Myosin Type I/genetics; Thyroid Cancer, Papillary/pathology; Thyroid Neoplasms/pathology; Adolescent; Adult; Aged; Aged, 80 and over; Animals; Apoptosis; Cells, Cultured; Child; Chromosomes, Human, Pair 19; Embryo, Nonmammalian/metabolism; Female; Genetic Predisposition to Disease; Genotype; Humans; Male; Middle Aged; Mitochondria/genetics; Mitochondria/metabolism; Myosin Type I/chemistry; Myosin Type I/metabolism; Oxygen Consumption; Pedigree; Protein Conformation; Thyroid Cancer, Papillary/genetics; Thyroid Cancer, Papillary/metabolism; Thyroid Neoplasms/genetics; Thyroid Neoplasms/metabolism; Young Adult; Zebrafish

17.

Benchmarking predictions of allostery in liver pyruvate kinase in CAGI4.

Xu, Qifang; Tang, Qingling; Katsonis, Panagiotis; Lichtarge, Olivier; Jones, David; Bovo, Samuele; Babbi, Giulia; Martelli, Pier L; Casadio, Rita; Lee, Gyu Rie; Seok, Chaok; Fenton, Aron W; Dunbrack, Roland L.

Hum Mutat ; 38(9): 1123-1131, 2017 Sep.

Article in English | MEDLINE | ID: mdl-28370845

ABSTRACT

The Critical Assessment of Genome Interpretation (CAGI) is a global community experiment to objectively assess computational methods for predicting phenotypic impacts of genomic variation. One of the 2015-2016 competitions focused on predicting the influence of mutations on the allosteric regulation of human liver pyruvate kinase. More than 30 different researchers accessed the challenge data. However, only four groups accepted the challenge. Features used for predictions ranged from evolutionary constraints and mutant site locations relative to active and effector binding sites to computational docking outputs. Despite the range of expertise and strategies used by predictors, the best predictions were only marginally better than random for modified allostery resulting from mutations. In contrast, several groups successfully predicted which mutations severely reduced enzymatic activity. Nonetheless, the poor prediction of allostery stands in stark contrast to the impression left by more than 700 PubMed entries identified using the identifiers "computational + allosteric." This contrast highlights a specialized need for new computational tools and utilization of benchmarks that focus on allosteric regulation.

Subjects

Benchmarking/methods; Pyruvate Kinase/chemistry; Pyruvate Kinase/genetics; Allosteric Regulation; Allosteric Site; Computational Biology/methods; Databases, Genetic; Fructosediphosphates/metabolism; Humans; Models, Molecular; Mutation; Pyruvate Kinase/metabolism

18.

Working toward precision medicine: Predicting phenotypes from exomes in the Critical Assessment of Genome Interpretation (CAGI) challenges.

Daneshjou, Roxana; Wang, Yanran; Bromberg, Yana; Bovo, Samuele; Martelli, Pier L; Babbi, Giulia; Lena, Pietro Di; Casadio, Rita; Edwards, Matthew; Gifford, David; Jones, David T; Sundaram, Laksshman; Bhat, Rajendra Rana; Li, Xiaolin; Pal, Lipika R; Kundu, Kunal; Yin, Yizhou; Moult, John; Jiang, Yuxiang; Pejaver, Vikas; Pagel, Kymberleigh A; Li, Biao; Mooney, Sean D; Radivojac, Predrag; Shah, Sohela; Carraro, Marco; Gasparini, Alessandra; Leonardi, Emanuela; Giollo, Manuel; Ferrari, Carlo; Tosatto, Silvio C E; Bachar, Eran; Azaria, Johnathan R; Ofran, Yanay; Unger, Ron; Niroula, Abhishek; Vihinen, Mauno; Chang, Billy; Wang, Maggie H; Franke, Andre; Petersen, Britt-Sabina; Pirooznia, Mehdi; Zandi, Peter; McCombie, Richard; Potash, James B; Altman, Russ B; Klein, Teri E; Hoskins, Roger A; Repo, Susanna; Brenner, Steven E.

Hum Mutat ; 38(9): 1182-1192, 2017 Sep.

Article in English | MEDLINE | ID: mdl-28634997

ABSTRACT

Precision medicine aims to predict a patient's disease risk and best therapeutic options by using that individual's genetic sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. For CAGI 4, three challenges involved using exome-sequencing data: Crohn's disease, bipolar disorder, and warfarin dosing. Previous CAGI challenges included prior versions of the Crohn's disease challenge. Here, we discuss the range of techniques used for phenotype prediction as well as the methods used for assessing predictive models. Additionally, we outline some of the difficulties associated with making predictions and evaluating them. The lessons learned from the exome challenges can be applied to both research and clinical efforts to improve phenotype prediction from genotype. In addition, these challenges serve as a vehicle for sharing clinical and research exome data in a secure manner with scientists who have a broad range of expertise, contributing to a collaborative effort to advance our understanding of genotype-phenotype relationships.

Subjects

Bipolar Disorder/genetics; Crohn Disease/genetics; Exome Sequencing/methods; Precision Medicine/methods; Warfarin/therapeutic use; Computational Biology/methods; Databases, Genetic; Genetic Predisposition to Disease; Humans; Information Dissemination; Pharmacogenomic Variants; Phenotype; Warfarin/pharmacology

19.

eDGAR: a database of Disease-Gene Associations with annotated Relationships among genes.

Babbi, Giulia; Martelli, Pier Luigi; Profiti, Giuseppe; Bovo, Samuele; Savojardo, Castrense; Casadio, Rita.

BMC Genomics ; 18(Suppl 5): 554, 2017 Aug 11.

Article in English | MEDLINE | ID: mdl-28812536

ABSTRACT

BACKGROUND: Genetic investigations, boosted by modern sequencing techniques, allow dissecting the genetic component of different phenotypic traits. These efforts result in the compilation of lists of genes related to diseases and show that an increasing number of diseases are associated with multiple genes. Investigating functional relations among genes associated with the same disease contributes to highlighting molecular mechanisms of the pathogenesis. RESULTS: We present eDGAR, a database collecting and organizing the data on gene/disease associations as derived from OMIM, Humsavar and ClinVar. For each disease-associated gene, eDGAR collects information on its annotation. Specifically, for lists of genes, eDGAR provides information on: i) interactions retrieved from PDB, BIOGRID and STRING; ii) co-occurrence in stable and functional structural complexes; iii) shared Gene Ontology annotations; iv) shared KEGG and REACTOME pathways; v) enriched functional annotations computed with NET-GE; vi) regulatory interactions derived from TRRUST; vii) localization on chromosomes and/or co-localization in neighboring loci. The present release of eDGAR includes 2672 diseases, related to 3658 different genes, for a total number of 5729 gene-disease associations. Overall, 71% of the genes are linked to 621 multigenic diseases, and eDGAR highlights their common GO terms, KEGG/REACTOME pathways, and physical and regulatory interactions. eDGAR includes a network-based enrichment method for detecting statistically significant functional terms associated with groups of genes. CONCLUSIONS: eDGAR offers a resource to analyze disease-gene associations. In multigenic diseases, genes can share physical interactions and/or co-occurrence in the same functional processes. eDGAR is freely available at: edgar.biocomp.unibo.it.

Subjects

Databases, Genetic; Genetic Diseases, Inborn/genetics; Genomics/methods; Protein Interaction Maps; Genetic Diseases, Inborn/metabolism; Humans; Metabolic Networks and Pathways; Molecular Sequence Annotation

20.

Large scale analysis of protein stability in OMIM disease related human protein variants.

Martelli, Pier Luigi; Fariselli, Piero; Savojardo, Castrense; Babbi, Giulia; Aggazio, Francesco; Casadio, Rita.

BMC Genomics ; 17(Suppl 2): 397, 2016 Jun 23.

Article in English | MEDLINE | ID: mdl-27356511

ABSTRACT

BACKGROUND: Modern genomic techniques make it possible to associate several Mendelian human diseases with single residue variations in different proteins. Molecular mechanisms explaining the relationship between genotype and phenotype are still under debate. Change of protein stability upon variation appears to assume a particular relevance in annotating whether a single residue substitution can or cannot be associated with a given disease. Thermodynamic properties of human proteins and of their disease-related variants are lacking. In the present work, we take advantage of the available three-dimensional structures of human proteins for predicting the role of disease-related variations on the perturbation of protein stability. RESULTS: We develop INPS3D, a new predictor based on protein structure for computing the effect of single residue variations on protein stability (ΔΔG), scoring at the state of the art (Pearson's correlation value of the regression is equal to 0.72 with mean standard error of 1.15 kcal/mol on a blind test set comprising 351 variations in 60 proteins). We then filter 368 OMIM disease-related proteins known with atomic resolution (where the three-dimensional structure covers at least 70% of the sequence) with 4717 disease-related single residue variations and 685 polymorphisms without clinical consequence. We find that the effect on protein stability of disease-related variations is larger than the effect of polymorphisms: in particular, by setting the threshold between perturbing and non-perturbing variations of the protein stability to an absolute value of 1 kcal/mol, about 44% of disease-related variations and 20% of polymorphisms are predicted with |ΔΔG| > 1 kcal/mol. A consistent fraction of OMIM disease-related variations is however predicted to promote |ΔΔG| ≤ 1 kcal/mol, and we focus here on detecting features that can be associated with the thermodynamic property of the protein variant. Our analysis reveals that some 47% of disease-related variations promoting |ΔΔG| ≤ 1 kcal/mol are located in solvent-exposed sites of the protein structure. We also find that the increase of the fraction of variations that are predicted with |ΔΔG| ≤ 1 kcal/mol in proteins partially relates to the increasing number of the protein interacting partners, corroborating the notion that disease-related, non-perturbing variations are likely to impair protein-protein interaction (70% of the disease-causing variations with high accessible surface are indeed predicted in interacting sites). The set of OMIM surface-accessible variations with |ΔΔG| ≤ 1 kcal/mol and located in interaction sites is 23% of the total in 161 proteins. Among these, 43 proteins with some 327 disease-causing variations are involved in signalling, structural biological processes, development and differentiation.
CONCLUSIONS: We compute the effect of disease-causing variations on protein stability with INPS3D, a new state-of-the-art tool for predicting the change in ΔΔG value associated with single residue substitution in protein structures. The analysis indicates that OMIM disease-related variations in proteins promote a much larger effect on protein stability than polymorphisms not associated with diseases. Disease-related variations with a slight effect on protein stability (|ΔΔG| < 1 kcal/mol) frequently occur at the protein accessible surface, suggesting that they are located in protein-protein interaction patches in putative human biological functional networks. The hypothesis is corroborated by proving that proteins with many disease-related variations that slightly perturb protein stability are on average more connected in the human physical interactome (IntAct) than proteins with variations predicted with |ΔΔG| > 1 kcal/mol.
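
The thresholding rule stated in the abstract above (a variation counts as perturbing when |ΔΔG| > 1 kcal/mol) can be sketched as follows. This is only an illustration of the rule, not INPS3D itself: the function names and the ΔΔG values in the example are made up, and no predictor is involved.

```python
# Threshold taken from the abstract: variations with |ΔΔG| > 1 kcal/mol
# are treated as perturbing protein stability.
THRESHOLD_KCAL_MOL = 1.0


def is_perturbing(ddg, threshold=THRESHOLD_KCAL_MOL):
    """Return True when the absolute stability change exceeds the threshold."""
    return abs(ddg) > threshold


def fraction_perturbing(ddg_values, threshold=THRESHOLD_KCAL_MOL):
    """Fraction of variations classified as perturbing (0.0 for an empty list)."""
    values = list(ddg_values)
    if not values:
        return 0.0
    return sum(is_perturbing(v, threshold) for v in values) / len(values)


if __name__ == "__main__":
    # Made-up ΔΔG values in kcal/mol, for illustration only.
    example_ddg = [-2.0, 0.5, 1.0, 1.5]
    print(fraction_perturbing(example_ddg))
```

Note that a value of exactly 1 kcal/mol falls in the non-perturbing class, matching the |ΔΔG| ≤ 1 kcal/mol wording used in the results.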

Subjects

Databases, Genetic; Proteins/chemistry; Proteins/genetics; Genetic Variation; Humans; Protein Conformation; Protein Folding; Protein Stability; Thermodynamics
