1.
Insect Mol Biol ; 32(2): 118-131, 2023 04.
Article in English | MEDLINE | ID: mdl-36366787

ABSTRACT

Termites (Insecta, Blattodea, Termitoidae) are a widespread and diverse group of eusocial insects known for their ability to digest wood matter. Herein, we report the draft genome of the subterranean termite Reticulitermes lucifugus, an economically important species and among the most studied taxa with respect to eusocial organization and mating system. The final assembly (~813 Mb) covered up to 88% of the estimated genome size and, in agreement with the Asexual Queen Succession Mating System, it was found to be completely homozygous. We predicted 16,349 highly supported gene models and 42% of repetitive DNA content. Transposable elements of R. lucifugus show evolutionary dynamics similar to those of other termites, with two main peaks of activity localized at 25% and 8% of Kimura divergence driven by DNA, LINE and SINE elements. Gene family turnover analyses identified multiple instances of gene duplication associated with R. lucifugus diversification, with significant lineage-specific gene family expansions related to development, perception and nutrient metabolism pathways. Finally, we analysed P450 and odourant receptor gene repertoires in detail, highlighting the large diversity and dynamic evolutionary history of these proteins in the R. lucifugus genome. This newly assembled genome will provide a valuable resource for further understanding the molecular basis of termite biology as well as for pest control.


Subjects
Cockroaches , Isoptera , Animals , Isoptera/genetics , Wood , Biological Evolution , Reproduction
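The "Kimura divergence" quoted in this abstract is commonly a Kimura two-parameter (K2P) distance between a repeat copy and its consensus sequence. As a minimal sketch, assuming a pairwise alignment of equal length and ignoring any CpG adjustment that repeat-annotation pipelines may apply, the distance could be computed as follows. The function name and the example sequences are illustrative and are not taken from the paper.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
NUCLEOTIDES = PURINES | PYRIMIDINES


def kimura_2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in NUCLEOTIDES or b not in NUCLEOTIDES:
            continue  # gap or ambiguity code: not comparable
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable positions")

    p = transitions / compared      # transition proportion
    q = transversions / compared    # transversion proportion
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Hypothetical 10-bp consensus and a copy with one transition (A -> G).
    print(round(kimura_2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A histogram of such distances over all copies of a repeat family is what produces the activity "peaks" described above; a distance of 0.25 corresponds to the 25% peak.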
2.
Int J Mol Sci ; 23(1), 2021 Dec 23.
Article in English | MEDLINE | ID: mdl-35008593

ABSTRACT

MTHFR deficiency still deserves investigation to associate the phenotype with protein structure variations. To this aim, considering the MTHFR wild-type protein structure, with a catalytic and a regulatory domain, and taking advantage of state-of-the-art computational tools, we explore the properties of 72 missense variations known to be disease associated. By computing the thermodynamic ΔΔG change according to a consensus method that we recently introduced, we find that 61% of the disease-related variations destabilize the protein, are present in both the catalytic and regulatory domains, and correspond to known biochemical deficiencies. The propensity of solvent accessible residues to be involved in protein-protein interaction sites indicates that most of the interacting residues are located in the regulatory domain, and that only three of them, located at the interface of the functional protein homodimer, are both disease-related and destabilizing. Finally, we compute the protein architecture with Hidden Markov Models, one from Pfam for the catalytic domain and the second computed in house for the regulatory domain. We show that patterns of disease-associated, physicochemical variation types, in both the catalytic and regulatory domains, are unique to MTHFR deficiency when mapped onto the protein architecture.


Subjects
Homocystinuria/genetics , Methylenetetrahydrofolate Reductase (NADPH2)/deficiency , Muscle Spasticity/genetics , Catalytic Domain/genetics , Humans , Methylenetetrahydrofolate Reductase (NADPH2)/genetics , Protein Interaction Maps/genetics , Psychotic Disorders/genetics
3.
Biomed Res Int ; 2020: 7465242, 2020.
Article in English | MEDLINE | ID: mdl-32258141

ABSTRACT

Recent comparisons between plant and animal viruses reveal many common principles that underlie how all viruses express their genetic material, amplify their genomes, and link virion assembly with replication. Cauliflower mosaic virus (CaMV) is not infectious for human beings. Here, we show that CaMV transactivator/viroplasmin protein (TAV) shares sequence similarity with and behaves like the human ribonuclease H1 (RNase H1) in reducing DNA/RNA hybrids detected with S9.6 antibody in HEK293T cells. We showed that TAV is clearly expressed in the cytosol and in the nuclei of transiently transfected human cells, similar to its distribution in plants. TAV also showed remarkable cytotoxic effects in U251 human glioma cells in vitro. These characteristics pave the way for future analysis of the use of the plant virus protein TAV as an alternative to human RNase H1 during gene therapy in human cells.


Subjects
Caulimovirus/enzymology , Glioma/drug therapy , Ribonuclease H , Viral Proteins , Cell Line, Tumor , Cytotoxins/chemistry , Cytotoxins/pharmacology , Glioma/metabolism , Glioma/pathology , HEK293 Cells , Humans , Ribonuclease H/chemistry , Ribonuclease H/pharmacology , Viral Proteins/chemistry , Viral Proteins/pharmacology
4.
Front Mol Biosci ; 7: 626363, 2020.
Article in English | MEDLINE | ID: mdl-33490109

ABSTRACT

Solvent accessibility (SASA) is a key feature of proteins for determining their folding and stability. SASA is computed from protein structures with different algorithms, and from protein sequences with machine-learning based approaches trained on solved structures. Here we ask to what extent the solvent exposure of residues can be associated with the pathogenicity of a variation. In this way, the SASA of the wild-type residue acquires a role in the functional annotation of protein single-residue variations (SRVs). By mapping variations on a curated database of human protein structures, we found that residues targeted by disease-related SRVs are less accessible to solvent than residues involved in polymorphisms. The disease association is not evenly distributed among the different residue types: SRVs targeting glycine, tryptophan, tyrosine, and cysteine are more frequently disease associated than others. For all residues, the proportion of disease-related SRVs largely increases when the wild-type residue is buried and decreases when it is exposed. The extent of the increase depends on the residue type. With the aid of a predictor developed in house, based on a deep learning procedure and performing at the state of the art, we are able to confirm the above tendency by analyzing a large data set of residues subjected to variations and occurring in some 12,494 human protein sequences still lacking three-dimensional structure (derived from HUMSAVAR). Our data support the notion that solvent accessible surface area is a distinguishing property of residues that undergo variation and that pathogenicity is more frequently associated with the buried state than with the exposed one.

5.
Hum Mutat ; 40(9): 1530-1545, 2019 09.
Article in English | MEDLINE | ID: mdl-31301157

ABSTRACT

Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine. Computational predictions can lead to a better understanding of the mechanisms underlying genetic diseases, including cancer, but their adoption requires thorough and unbiased assessment. Cystathionine-beta-synthase (CBS) is an enzyme that catalyzes the first step of the transsulfuration pathway, from homocysteine to cystathionine, and in which variations are associated with human hyperhomocysteinemia and homocystinuria. We have created a computational challenge under the CAGI framework to evaluate how well different methods can predict the phenotypic effect(s) of CBS single amino acid substitutions using a blinded experimental data set. CAGI participants were asked to predict yeast growth based on the identity of the mutations. The performance of the methods was evaluated using several metrics. The CBS challenge highlighted the difficulty of predicting the phenotype of an ex vivo system in a model organism when classification models were trained on human disease data. We also discuss the variations in difficulty of prediction for known benign and deleterious variants, as well as identify methodological and experimental constraints with lessons to be learned for future challenges.


Subjects
Amino Acid Substitution , Computational Biology/methods , Cystathionine beta-Synthase/genetics , Cystathionine/metabolism , Cystathionine beta-Synthase/metabolism , Homocysteine/metabolism , Humans , Phenotype , Precision Medicine
6.
Hum Mutat ; 40(9): 1612-1622, 2019 09.
Article in English | MEDLINE | ID: mdl-31241222

ABSTRACT

The availability of disease-specific genomic data is critical for developing new computational methods that predict the pathogenicity of human variants and advance the field of precision medicine. However, the lack of gold standards to properly train and benchmark such methods is one of the greatest challenges in the field. In response to this challenge, the scientific community is invited to participate in the Critical Assessment of Genome Interpretation (CAGI), where unpublished disease variants are available for classification by in silico methods. As part of the CAGI-5 challenge, we evaluated the performance of 18 submissions and three additional methods in predicting the pathogenicity of single nucleotide variants (SNVs) in checkpoint kinase 2 (CHEK2) for cases of breast cancer in Hispanic females. As part of the assessment, the efficacy of the analysis method and the setup of the challenge were also considered. The results indicated that, although the challenge could benefit from additional participant data, the combined generalized linear model analysis and odds of pathogenicity analysis provided a framework to evaluate the methods submitted for SNV pathogenicity identification and for comparison to other available methods. The outcome of this challenge and the approaches used can help guide further advancements in identifying SNV-disease relationships.


Subjects
Breast Neoplasms/genetics , Checkpoint Kinase 2/genetics , Computational Biology/methods , Hispanic or Latino/genetics , Polymorphism, Single Nucleotide , Adult , Aged , Breast Neoplasms/ethnology , Case-Control Studies , Computer Simulation , Female , Genetic Predisposition to Disease , Humans , Linear Models , Middle Aged , United States/ethnology , Exome Sequencing
7.
Hum Mutat ; 40(9): 1392-1399, 2019 09.
Article in English | MEDLINE | ID: mdl-31209948

ABSTRACT

Frataxin (FXN) is a highly conserved protein found in prokaryotes and eukaryotes that is required for efficient regulation of cellular iron homeostasis. Experimental evidence associates amino acid substitutions in FXN with Friedreich ataxia, a neurodegenerative disorder. Recently, new thermodynamic experiments have been performed to study the impact of somatic variations identified in cancer tissues on protein stability. The Critical Assessment of Genome Interpretation (CAGI) data provider at the University of Rome measured the unfolding free energy of a set of variants (FXN challenge data set) with far-UV circular dichroism and intrinsic fluorescence spectra. These values have been used to calculate the change in unfolding free energy between the variant and wild-type proteins at zero concentration of denaturant (ΔΔGH2O). The FXN challenge data set, composed of eight amino acid substitutions, was used to evaluate the performance of current computational methods in predicting the ΔΔGH2O value associated with the variants and in classifying them as destabilizing or not destabilizing. For the fifth edition of CAGI, six independent research groups from Asia, Australia, Europe, and North America submitted 12 sets of predictions from different approaches. In this paper, we report the results of our assessment and discuss the limitations of the tested algorithms.


Subjects
Amino Acid Substitution , Iron-Binding Proteins/chemistry , Iron-Binding Proteins/genetics , Algorithms , Circular Dichroism , Humans , Models, Molecular , Protein Conformation , Protein Folding , Protein Stability , Frataxin
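The quantity assessed in this challenge is the difference in unfolding free energy between a variant and the wild type, followed by a two-class labelling. A minimal sketch of that arithmetic is given below. The sign convention (negative means destabilizing), the 1.0 kcal/mol cut-off, the class and function names, and every numeric value are assumptions made for illustration; none of them comes from the challenge data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnfoldingMeasurement:
    """Unfolding free energy extrapolated to zero denaturant (kcal/mol)."""
    label: str
    dg_unfold: float


def ddg(variant: UnfoldingMeasurement, wild_type: UnfoldingMeasurement) -> float:
    """Change in unfolding free energy: variant minus wild type."""
    return variant.dg_unfold - wild_type.dg_unfold


def classify(ddg_value: float, threshold: float = 1.0) -> str:
    """Label a variant; the threshold is an illustrative assumption."""
    if ddg_value <= -threshold:
        return "destabilizing"
    return "not destabilizing"


if __name__ == "__main__":
    # Hypothetical measurements, not values from the FXN data set.
    wild_type = UnfoldingMeasurement("WT", 9.0)
    for variant in (UnfoldingMeasurement("variant_1", 7.5),
                    UnfoldingMeasurement("variant_2", 8.6)):
        value = ddg(variant, wild_type)
        print(variant.label, round(value, 2), classify(value))
```

Some tools report the opposite sign (folding rather than unfolding free energy), so predictions from different methods need to be brought to one convention before they are compared with measurements.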
8.
Hum Mutat ; 40(9): 1495-1506, 2019 09.
Article in English | MEDLINE | ID: mdl-31184403

ABSTRACT

Thermodynamic stability is a fundamental property shared by all proteins. Changes in stability due to mutation are a widespread molecular mechanism in genetic diseases. Methods for the prediction of mutation-induced stability change have typically been developed and evaluated on incomplete and/or biased data sets. As part of the Critical Assessment of Genome Interpretation, we explored the utility of high-throughput variant stability profiling (VSP) assay data as an alternative for the assessment of computational methods and evaluated state-of-the-art predictors against over 7,000 nonsynonymous variants from two proteins. We found that predictions were modestly correlated with actual experimental values. Predictors fared better when evaluated as classifiers of extreme stability effects. While different methods emerged as top performers depending on the metric, it is nontrivial to draw conclusions on their adoption or improvement. Our analyses revealed that only 16% of all variants in VSP assays could be confidently defined as stability-affecting. Furthermore, it is unclear to what extent VSP abundance scores were reasonable proxies for the stability-related quantities that participating methods were designed to predict. Overall, our observations underscore the need for clearly defined objectives when developing and using both computational and experimental methods in the context of measuring variant impact.


Subjects
Computational Biology/methods , Methyltransferases/chemistry , Mutation , PTEN Phosphohydrolase/chemistry , High-Throughput Nucleotide Sequencing , Humans , Methyltransferases/genetics , PTEN Phosphohydrolase/genetics , Protein Stability
9.
Sci Rep ; 9(1): 2172, 2019 02 18.
Article in English | MEDLINE | ID: mdl-30778083

ABSTRACT

Platelet-Derived Growth Factor Receptor Alpha (PDGFRA) mutations occur in approximately 5-7% of gastrointestinal stromal tumours (GIST). Over half of all PDGFRA mutations are represented by the substitution at position 842 in the A-loop of an aspartic acid (D) with a valine (V), recognized as D842V, conferring primary resistance to imatinib in vitro and in clinical observations due to the conformation of the kinase domain, which negatively affects imatinib binding. The lack of interaction between imatinib and the D842V PDGFRA mutated model has been established and widely confirmed in vivo. However, for the other PDGFRA mutations, the correlation between pre-clinical and clinical data is still unclear. An in silico evaluation of the p.His845_Asn848delinsPro mutation involving exon 18 of PDGFRA in a metastatic GIST patient responding to first-line imatinib has been provided. Docking analyses were performed, and the ligand-receptor interactions were evaluated with the jCE algorithm for structural alignment. The docking simulation and structural superimposition analysis show that PDGFRA p.His845_Asn848delinsPro stabilizes the imatinib binding site with the residues that are conserved in KIT. The in vivo evidence that PDGFRA p.His845_Asn848delinsPro is sensitive to imatinib was confirmed by the molecular modelling, which may represent a reliable tool for the prediction of clinical outcomes and treatment selection in GIST, especially for rare mutations.


Subjects
Gastrointestinal Neoplasms/enzymology , Gastrointestinal Neoplasms/genetics , Gastrointestinal Stromal Tumors/enzymology , Gastrointestinal Stromal Tumors/genetics , INDEL Mutation , Mutant Proteins/chemistry , Mutant Proteins/genetics , Receptor, Platelet-Derived Growth Factor alpha/chemistry , Receptor, Platelet-Derived Growth Factor alpha/genetics , Aged , Antineoplastic Agents/pharmacokinetics , Antineoplastic Agents/therapeutic use , Binding Sites , Computer Simulation , Exons , Female , Gastrointestinal Neoplasms/drug therapy , Gastrointestinal Stromal Tumors/drug therapy , Humans , Imatinib Mesylate/pharmacokinetics , Imatinib Mesylate/therapeutic use , Models, Molecular , Molecular Docking Simulation , Mutant Proteins/metabolism , Protein Conformation , Receptor, Platelet-Derived Growth Factor alpha/metabolism
10.
Hum Mutat ; 38(9): 1042-1050, 2017 09.
Article in English | MEDLINE | ID: mdl-28440912

ABSTRACT

Correct phenotypic interpretation of variants of unknown significance for cancer-associated genes is a diagnostic challenge as genetic screenings gain in popularity in the next-generation sequencing era. The Critical Assessment of Genome Interpretation (CAGI) experiment aims to test and define the state of the art of genotype-phenotype interpretation. Here, we present the assessment of the CAGI p16INK4a challenge. Participants were asked to predict the effect on cellular proliferation of 10 variants for the p16INK4a tumor suppressor, a cyclin-dependent kinase inhibitor encoded by the CDKN2A gene. Twenty-two pathogenicity predictors were assessed with a variety of accuracy measures for reliability in a medical context. Different assessment measures were combined in an overall ranking to provide more robust results. The R scripts used for assessment are publicly available from a GitHub repository for future use in similar assessment exercises. Despite a limited test-set size, our findings show a variety of results, with some methods performing significantly better. Methods combining different strategies frequently outperform simpler approaches. The best predictor, Yang&Zhou lab, uses a machine learning method combining an empirical energy function measuring protein stability with an evolutionary conservation term. The p16INK4a challenge highlights how subtle structural effects can neutralize otherwise deleterious variants.


Subjects
Computational Biology/methods , Cyclin-Dependent Kinase Inhibitor p18/genetics , Genetic Variation , Cell Line, Tumor , Cell Proliferation , Computer Simulation , Cyclin-Dependent Kinase Inhibitor p16 , Cyclin-Dependent Kinase Inhibitor p18/chemistry , Databases, Genetic , Genetic Predisposition to Disease , Humans , Machine Learning , Protein Stability
11.
Oncotarget ; 8(14): 22640-22648, 2017 Apr 04.
Article in English | MEDLINE | ID: mdl-28186987

ABSTRACT

We have investigated the clinical significance of the BRCA1 variant p.His1673del in 14 families from the Emilia-Romagna region of Italy, including 20 breast and 23 ovarian cancer cases; four families displayed site-specific ovarian cancer. The variant, absent in human variation databases, has been reported three times in BRCA1 specific databases; all probands shared the same rare haplotype at the BRCA1 locus, consistent with a common ancestor. The multifactorial likelihood method by Goldgar, used to estimate the probability of the variant being causative, gave a ratio of 2,263,474:1 in favor of causality. Moreover, in silico modeling suggested that His1673-lacking BRCA1 protein may have a decreased ability to bind BARD1 and other related proteins. All six ovarian carcinomas and two out of four breast carcinomas available showed a loss of the BRCA1 wild-type allele, which in three out of four ovarian carcinomas analyzed by FISH was associated with duplication of the chromosome 17 containing the variant. Although the pathogenicity of the allele is strongly supported by the multifactorial ratio, we cannot exclude that p.His1673del is not itself deleterious, but is linked to another undetected mutation on the same ancestral allele.


Subjects
BRCA1 Protein/genetics , Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Ovarian Neoplasms/genetics , Sequence Deletion/genetics , Adult , Aged , Breast Neoplasms/pathology , Female , Genetic Predisposition to Disease , Genetic Testing , Haplotypes/genetics , Humans , Loss of Heterozygosity , Middle Aged , Neoplasm Staging , Ovarian Neoplasms/pathology , Pedigree , Phenotype , Prognosis
12.
Hum Mutat ; 38(9): 1064-1071, 2017 09.
Article in English | MEDLINE | ID: mdl-28102005

ABSTRACT

SNPs&GO is a machine learning method for predicting the association of single amino acid variations (SAVs) to disease, considering protein functional annotation. The method is a binary classifier that implements a support vector machine algorithm to discriminate between disease-related and neutral SAVs. SNPs&GO combines information from protein sequence with functional annotation encoded by gene ontology (GO) terms. Tested in sequence mode on more than 38,000 SAVs from the SwissVar dataset, our method reached 81% overall accuracy and an area under the receiver operating characteristic curve of 0.88 with a low false-positive rate. In almost all the editions of the Critical Assessment of Genome Interpretation (CAGI) experiments, SNPs&GO ranked among the most accurate algorithms for predicting the effect of SAVs. In this paper, we summarize the best results obtained by SNPs&GO on disease-related variations of four CAGI challenges relative to the following genes: CHEK2 (CAGI 2010), RAD50 (CAGI 2011), p16-INK (CAGI 2013), and NAGLU (CAGI 2016). Result evaluation provides insights about the accuracy of our algorithm and the relevance of GO terms in annotating the effect of the variants. It also helps to define good practices for the detection of deleterious SAVs.


Subjects
Amino Acid Substitution , Checkpoint Kinase 2/genetics , Computational Biology/methods , Cyclin-Dependent Kinase Inhibitor p16/genetics , DNA Repair Enzymes/genetics , DNA-Binding Proteins/genetics , alpha-N-Acetylgalactosaminidase/genetics , Acid Anhydride Hydrolases , Algorithms , Gene Ontology , Genetic Predisposition to Disease , Humans , Molecular Sequence Annotation , ROC Curve , Support Vector Machine
13.
Bioinformatics ; 31(20): 3269-75, 2015 Oct 15.
Article in English | MEDLINE | ID: mdl-26079349

ABSTRACT

MOTIVATION: Molecular recognition of N-terminal targeting peptides is the most common mechanism controlling the import of nuclear-encoded proteins into mitochondria and chloroplasts. When experimental information is lacking, computational methods can annotate targeting peptides, and determine their cleavage sites for characterizing protein localization, function, and mature protein sequences. The problem of discriminating mitochondrial from chloroplastic propeptides is particularly relevant when annotating proteomes of photosynthetic Eukaryotes, endowed with both types of sequences. RESULTS: Here, we introduce TPpred3, a computational method that, given any Eukaryotic protein sequence, performs three different tasks: (i) the detection of targeting peptides; (ii) their classification as mitochondrial or chloroplastic and (iii) the precise localization of the cleavage sites in an organelle-specific framework. Our implementation builds on our previously introduced TPpred. Here, we integrate a new N-to-1 Extreme Learning Machine specifically designed for the classification task (ii). For the last task, we introduce an organelle-specific Support Vector Machine that exploits sequence motifs retrieved with an extensive motif-discovery analysis of a large set of mitochondrial and chloroplastic proteins. We show that TPpred3 outperforms the state-of-the-art methods in all three tasks. AVAILABILITY AND IMPLEMENTATION: The method server and datasets are available at http://tppred3.biocomp.unibo.it. CONTACT: gigi@biocomp.unibo.it SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chloroplast Proteins/chemistry , Machine Learning , Mitochondrial Proteins/chemistry , Sequence Analysis, Protein/methods , Chloroplast Proteins/metabolism , Chloroplasts/metabolism , Eukaryota/metabolism , Mitochondria/metabolism , Mitochondrial Proteins/metabolism , Peptides/chemistry , Protein Transport , Software , Support Vector Machine
14.
Bioinformatics ; 31(17): 2816-21, 2015 Sep 01.
Article in English | MEDLINE | ID: mdl-25957347

ABSTRACT

MOTIVATION: A tool for reliably predicting the impact of variations on protein stability is extremely important both for protein engineering and for understanding the effects of Mendelian and somatic mutations in the genome. Next Generation Sequencing studies are constantly increasing the number of protein sequences. Given the huge disproportion between protein sequences and structures, there is a need for tools suited to annotate the effect of mutations starting from protein sequence without relying on the structure. Here, we describe INPS, a novel approach for annotating the effect of non-synonymous mutations on protein stability from the sequence alone. INPS is based on SVM regression and it is trained to predict the thermodynamic free energy change upon single-point variations in protein sequences. RESULTS: We show that INPS performs similarly to the state-of-the-art methods based on protein structure when tested in cross-validation on a non-redundant dataset. INPS also performs very well on a newly generated dataset consisting of a number of variations occurring in the tumor suppressor protein p53. Our results suggest that INPS is a tool suited for computing the effect of non-synonymous polymorphisms on protein stability when the protein structure is not available. We also show that INPS predictions are complementary to those of the state-of-the-art, structure-based method mCSM. When the two methods are combined, the overall prediction on the p53 set scores significantly higher than those of the single methods. AVAILABILITY AND IMPLEMENTATION: The presented method is available as a web server at http://inps.biocomp.unibo.it. CONTACT: piero.fariselli@unibo.it SUPPLEMENTARY INFORMATION: Supplementary Materials are available at Bioinformatics online.


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Mutation/genetics , Protein Stability , Proteins/chemistry , Software , Tumor Suppressor Protein p53/chemistry , Algorithms , Humans , Machine Learning , Protein Engineering , Proteins/genetics , Thermodynamics , Tumor Suppressor Protein p53/genetics
15.
Methods Mol Biol ; 1264: 305-20, 2015.
Article in English | MEDLINE | ID: mdl-25631024

ABSTRACT

Computational methods are invaluable when protein sequences, directly derived from genomic data, need functional and structural annotation. Subcellular localization is necessary for understanding a protein's role and the compartment where the mature protein is active, and it is very difficult to characterize experimentally. Mitochondrial proteins synthesized on cytosolic ribosomes carry specific patterns in the precursor sequence, from which it is possible to recognize a peptide targeting the protein to its final destination. Here we discuss to what extent it is feasible to develop computational methods for detecting mitochondrial targeting peptides in the precursor sequences, and we benchmark our own and other methods on the human mitochondrial proteins endowed with experimentally characterized targeting peptides. Furthermore, we illustrate our newly implemented web server and its usage on the whole human proteome in order to infer mitochondrial targeting peptides, their cleavage sites, and whether the targeting peptide regions contain arginine-rich recurrent motifs. In this way, we add about 2,800 further human proteins to the 124 already experimentally annotated with a mitochondrial targeting peptide.


Subjects
Artificial Intelligence , Computational Biology/methods , Mitochondria/metabolism , Mitochondrial Proteins/chemistry , Mitochondrial Proteins/metabolism , Models, Biological , Peptides/chemistry , Peptides/metabolism , Datasets as Topic , Genomics/methods , Humans , Internet , Protein Transport
16.
Protein Pept Lett ; 21(8): 840-6, 2014.
Article in English | MEDLINE | ID: mdl-23855670

ABSTRACT

Background. Obesity is a major health problem in both developed and emerging countries. Obesity is a complex disease whose etiology involves genetic factors in strong interplay with environmental determinants and lifestyle. The discovery of genetic factors and biological pathways underlying human obesity is hampered by the difficulty in controlling the genetic background of human cohorts. Animal models are therefore necessary to further dissect the genetics of obesity. The pig has emerged as one of the most attractive models because of its similarity with humans in the mechanisms regulating fat deposition. Results. We collected the genes related to obesity in humans and to fat deposition traits in pig. We localized them on both human and pig genomes, building a map useful to interpret comparative studies on obesity. We characterized the collected genes structurally and functionally with BAR+ and mapped them on KEGG pathways and on the STRING protein interaction network. Conclusions. The collected set consists of 361 obesity-related genes in human and pig genomes. All genes were mapped on the human genome, and 54 could not be localized on the pig genome (release 2012). Only 3 human genes have no counterpart in pig, confirming that this animal is a good model for human obesity studies. Obesity-related genes are mostly involved in regulation and signaling processes/pathways, and relevant connections emerge between obesity-related genes and diseases such as cancer and infectious diseases.


Subjects
Chromosome Mapping , Genome, Human/genetics , Molecular Sequence Annotation , Obesity/genetics , Swine/genetics , Animals , Humans , Protein Interaction Mapping
17.
BMC Bioinformatics ; 14 Suppl 1: S10, 2013.
Article in English | MEDLINE | ID: mdl-23368835

ABSTRACT

BACKGROUND: Recently, information derived from correlated mutations in proteins has regained relevance for predicting protein contacts. This is due to new forms of mutual information analysis that have been proven to be more suitable to highlight direct coupling between pairs of residues in protein structures and to the large number of protein chains that are currently available for statistical validation. It was previously discussed that disulfide bond topology in proteins is also constrained by correlated mutations. RESULTS: In this paper we exploit information derived from a corrected mutual information analysis and from the inverse of the covariance matrix to address the problem of the prediction of the topology of disulfide bonds in Eukaryotes. Recently, we have shown that Support Vector Regression (SVR) can improve the prediction of disulfide connectivity patterns. Here we show that the inclusion of the correlated mutation information increases the SVR performance by 5 percentage points (from 54% to 59%). When this approach is used in combination with a method previously developed by us and scoring at the state of the art in predicting both location and topology of disulfide bonds in Eukaryotes (DisLocate), the per-protein accuracy is 38%, 2 percentage points higher than that previously obtained. CONCLUSIONS: In this paper we show that the inclusion of information derived from correlated mutations can improve the performance of the state-of-the-art methods for predicting disulfide connectivity patterns in Eukaryotic proteins. Our analysis also provides support to the notion that improving methods to extract evolutionary information from multiple sequence alignments greatly contributes to the scoring performance of predictors suited to detect relevant features from protein chains.


Subjects
Artificial Intelligence , Disulfides/chemistry , Mutation , Proteins/chemistry , Proteins/genetics , Algorithms , Amino Acid Sequence , Cysteine/chemistry , Sequence Alignment
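Correlated-mutation analysis of the kind summarized in this abstract starts from the mutual information between pairs of columns in a multiple sequence alignment. As a minimal sketch, the plain (uncorrected) mutual information can be computed as below. This is not the corrected measure or the inverse-covariance approach the paper uses, and the function name and toy columns are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def column_mutual_information(col_i: Sequence[str], col_j: Sequence[str]) -> float:
    """Plain mutual information, in bits, between two alignment columns.

    Each column holds one residue symbol per aligned sequence.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must come from the same alignment")
    n = len(col_i)
    if n == 0:
        raise ValueError("columns are empty")

    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = n_ab * n / (count_i[a] * count_j[b])
        mi += p_ab * math.log2(ratio)
    return mi


if __name__ == "__main__":
    # Toy columns: residues that always change together versus independently.
    print(column_mutual_information("CCAA", "CCAA"))  # fully coupled
    print(column_mutual_information("CCAA", "CACA"))  # independent
```

Plain mutual information is inflated by shared ancestry and by small alignments, which is the motivation for the corrected variants mentioned in the abstract.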
18.
Bioinformatics ; 29(8): 981-8, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23428638

ABSTRACT

MOTIVATION: Targeting peptides are the most important signal controlling the import of nuclear-encoded proteins into mitochondria and plastids. In the absence of experimental information, their prediction is an essential step in proteome annotation for inferring both the localization and the sequence of mature proteins. RESULTS: We developed TPpred, a new predictor of organelle-targeting peptides based on Grammatical-Restrained Hidden Conditional Random Fields. TPpred is trained on a non-redundant dataset of 297 protein sequences in which the presence of a targeting peptide was experimentally validated. When tested on the 297 positive examples and a further 8010 negative examples, TPpred outperformed available methods in both accuracy and Matthews correlation index (96% and 0.58, respectively). Given its very low false-positive rate (3.0%), TPpred is well suited for large-scale analyses at the proteome level. We predicted that from ∼4 to 9% of the sequences in the human, Arabidopsis thaliana and yeast proteomes contain targeting peptides and are, therefore, likely to be localized in mitochondria and plastids. TPpred predictions correlate to a good extent with the experimental annotation of subcellular localization, when available. TPpred was also trained and tested to predict the cleavage site of the organelle-targeting peptide: on this task, the average error of TPpred on mitochondrial and plastidic proteins is 7 and 15 residues, respectively. These values are lower than the errors reported by other currently available methods. AVAILABILITY: The TPpred datasets are available at http://biocomp.unibo.it/valentina/TPpred/. TPpred is available on request from the authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
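The cleavage-site figures quoted above (7 and 15 residues) are average errors. The abstract does not define the error measure, so the sketch below assumes it is the mean absolute difference, in residues, between predicted and experimentally observed cleavage positions. The function name `mean_cleavage_site_error` and the positions used are hypothetical and are not taken from the TPpred datasets.

```python
def mean_cleavage_site_error(predicted, observed):
    """Return the mean absolute distance (in residues) between predicted
    and observed cleavage positions, one pair per protein.

    Assumption: "average error" means mean absolute difference; the
    source abstract does not state the exact definition.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one protein is required")
    total = sum(abs(p - o) for p, o in zip(predicted, observed))
    return total / len(predicted)


# Hypothetical cleavage positions (residue index of the last targeting-peptide residue).
toy_predicted = [25, 40, 31]
toy_observed = [22, 47, 31]

toy_error = mean_cleavage_site_error(toy_predicted, toy_observed)
print(f"mean cleavage-site error: {toy_error:.2f} residues")
```

Computing the measure separately for mitochondrial and plastidic proteins, as the abstract reports it, only requires calling the function once per subset.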


Subjects
Chloroplast Proteins/chemistry , Mitochondrial Proteins/chemistry , Protein Sorting Signals , Sequence Analysis, Protein/methods , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/metabolism , Chloroplast Proteins/metabolism , Eukaryota , Humans , Mitochondria/metabolism , Mitochondrial Proteins/metabolism , Peptides/chemistry , Plastids/metabolism , Position-Specific Scoring Matrices , Proteome/chemistry , Proteome/metabolism , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/metabolism , Software
19.
J Alzheimers Dis ; 34(2): 439-47, 2013.
Article in English | MEDLINE | ID: mdl-23241556

ABSTRACT

Our previous work showed that single nucleotide polymorphisms (SNPs) in genes that regulate the inflammatory response and cholesterol metabolism were associated with Alzheimer's disease (AD) risk. The list comprises SNPs located in the promoters of alpha 1 antichymotrypsin (rs1884082), hydroxymethylglutaryl coenzyme A reductase (rs376140), tumor necrosis factor alpha (rs1800629), and interleukin 10 (rs1800869). Here we investigated the effect of these SNPs on the binding of transcription factors. We computationally detected putative transcription factor binding sites located in the SNP regions. To this aim, we used the TESS program to scan the promoter sequences against the binding-site models available in the TRANSFAC and JASPAR databases. All the analyzed SNPs appeared to affect the binding of myeloid zinc finger protein 1 (MZF-1) to the promoter sequences of the genes listed above. Therefore, 16 SNPs in the MZF-1 gene were tested in 120 AD cases and 88 controls to assess a possible association between MZF-1 and AD. Fourteen SNPs showed no variability in the AD and control populations, while two SNPs, rs4756 and rs2228162, showed all three genotypes. Genotype distributions and allele frequencies of these two SNPs were comparable between AD cases and controls. On the other hand, the haplotype distribution of rs4756 and rs2228162 differed between AD cases and controls, with the AG haplotype associated with a decreased AD risk. In conclusion, selected SNPs in the MZF-1 gene exert a minor effect on AD risk.


Subjects
Alzheimer Disease/genetics , Exons/genetics , Haplotypes/genetics , Kruppel-Like Transcription Factors/genetics , Polymorphism, Single Nucleotide/genetics , Aged , Aged, 80 and over , Alzheimer Disease/diagnosis , Alzheimer Disease/metabolism , Amino Acid Sequence , Binding Sites/physiology , Female , Humans , Kruppel-Like Transcription Factors/metabolism , Male , Molecular Sequence Data , Risk Factors
20.
BMC Genomics ; 13 Suppl 4: S8, 2012 Jun 18.
Article in English | MEDLINE | ID: mdl-22759656

ABSTRACT

BACKGROUND: Various computational methods are presently available to classify whether a protein variation is disease-associated or not. However, data derived from recent technological advancements make it feasible to extend the annotation of disease-associated variations to include specific phenotypes. Here we tackle the problem of distinguishing between genetic variations associated with cancer and variations associated with other genetic diseases. RESULTS: We implement a new method based on Support Vector Machines that takes as input the protein variant and the protein function, as described by its associated Gene Ontology terms. Our approach succeeds in discriminating germline variants that are likely to be cancer-associated from those that are related to other genetic disorders. The method achieves 90% accuracy and a Matthews correlation coefficient of 0.61 on a set comprising 6478 germline variations (16% of which are cancer-associated) in 592 proteins. The sensitivity and the specificity on the cancer class are 69% and 66%, respectively. Furthermore, the method correctly excludes some 96% of 3392 somatic cancer-associated variations in 1983 proteins not included in the training/testing set. CONCLUSIONS: Here we show that a large set of cancer-associated germline protein variations can be successfully discriminated from those associated with other genetic disorders. This is a further step in the process of protein variant annotation. Scoring improves considerably when protein function, as encoded by Gene Ontology terms, is considered, corroborating the role of protein function as a key feature for the correct annotation of protein variations.
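The performance figures in the abstract above (accuracy, Matthews correlation coefficient, sensitivity, specificity) all derive from a binary confusion matrix. The sketch below uses the standard textbook definitions, with the cancer-associated class taken as positive. This is an assumption: the abstract does not spell out its definitions, and some papers use "specificity" for what is elsewhere called precision. The function name `binary_metrics` and the counts are hypothetical and are not the study's data.

```python
from math import sqrt


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Return standard binary classification metrics from confusion counts.

    tp/fn: positive examples predicted correctly/incorrectly.
    tn/fp: negative examples predicted correctly/incorrectly.
    """
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),  # true-positive rate
        "specificity": _ratio(tn, tn + fp),  # true-negative rate
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical confusion counts, chosen only to make the arithmetic easy to follow.
toy_metrics = binary_metrics(tp=8, fp=2, tn=18, fn=2)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.3f}")
```

On unbalanced sets such as the one described (16% cancer-associated variations), accuracy alone can look high even when the minority class is poorly predicted, which is why the Matthews correlation coefficient is reported alongside it.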


Subjects
Computational Biology/methods , Neoplasms/genetics , Proteins/genetics , Algorithms , Genetic Predisposition to Disease , Germ-Line Mutation/genetics , Humans , Support Vector Machine