1.
BMC Bioinformatics ; 21(1): 242, 2020 Jun 12.
Article in English | MEDLINE | ID: mdl-32532211

ABSTRACT

BACKGROUND: This study is motivated by the following three considerations: a) the physico-chemical properties of transmembrane (TM) proteins are distinctly different from those of globular proteins, necessitating the development of specialized structure prediction techniques, b) for many structural features no specialized predictors for TM proteins are available at all, and c) deep learning algorithms make it possible to automate the feature engineering process and thus facilitate the development of multi-target methods for predicting several protein properties at once. RESULTS: We present AllesTM, an integrated tool to predict almost all structural features of transmembrane proteins that can be extracted from atomic coordinate data. It blends several machine learning algorithms: random forests and gradient boosting machines, convolutional neural networks in their original form as well as those enhanced by dilated convolutions and residual connections, and, finally, long short-term memory architectures. AllesTM outperforms other available methods in predicting residue depth in the membrane, flexibility, topology, and relative solvent accessibility in the bound state, while in torsion angle, secondary structure, and monomer relative solvent accessibility prediction it lags only slightly behind the currently leading technique SPOT-1D. High accuracy on a multitude of prediction targets and easy installation make AllesTM a one-stop shop for many typical problems in the structural bioinformatics of transmembrane proteins. CONCLUSIONS: In addition to presenting a highly accurate prediction method and eliminating the need to install and maintain many different software tools, we also provide a comprehensive overview of the impact of different machine learning algorithms and parameter choices on the prediction performance. AllesTM is freely available at https://github.com/phngs/allestm.


Subjects
Alleles , Computational Biology/methods , Membrane Proteins/chemistry
2.
J Struct Biol ; 206(2): 156-169, 2019 May 01.
Article in English | MEDLINE | ID: mdl-30836197

ABSTRACT

Many integral membrane proteins, just like their globular counterparts, form either transient or permanent multi-subunit complexes to fulfill specific cellular roles. Although numerous interactions between these proteins have been experimentally determined, the structural coverage of the complexes is very low. Therefore, the computational identification of the amino acid residues involved in the interaction interfaces is a crucial step towards the functional annotation of all membrane proteins. Here, we present MBPred, a sequence-based method for predicting the interface residues in transmembrane proteins. A unique feature of our method is that it contains separate random forest models for two different use cases: (a) when the location of transmembrane regions is precisely known from a crystal structure, and (b) when it is predicted from sequence. In stark contrast to the aqueous-exposed protein segments, we found that the interaction sites located in the membrane are not enriched for evolutionary conservation, most likely due to their restricted amino acid composition or their random distribution among buried and exposed residues. On the other hand, residue co-evolution proved to be a very informative feature that has not so far been used for predicting interaction sites in individual proteins. MBPred reaches AUC, precision and recall values of 0.79/0.73, 0.69/0.51 and 0.55/0.48 on the cross-validation and independent test dataset, respectively, thus outperforming the previously published method of Bordner as well as all methods trained on globular proteins. Moreover, we show that for the majority of complete interface patches, the method captures more than 50% of the involved residues.


Subjects
Biological Evolution , Membrane Proteins/metabolism , Algorithms , Binding Sites , Computational Biology , Protein Databases , Protein Binding
3.
Genome Biol Evol ; 10(3): 928-938, 2018 Mar 01.
Article in English | MEDLINE | ID: mdl-29608732

ABSTRACT

Can orthologous proteins differ in terms of their ability to be secreted? To answer this question, we investigated the distribution of signal peptides within the orthologous groups of Enterobacterales. Parsimony analysis and sequence comparisons revealed a large number of signal peptide gain and loss events, in which signal peptides emerge or disappear in the course of evolution. Signal peptide losses prevail over gains, an effect which is especially pronounced in the transition from the free-living or commensal to the endosymbiotic lifestyle. The disproportionate decline in the number of signal peptide-containing proteins in endosymbionts cannot be explained by the overall reduction of their genomes. Signal peptides can be gained and lost either by acquisition/elimination of the corresponding N-terminal regions or by gradual accumulation of mutations. The evolutionary dynamics of signal peptides in bacterial proteins represents a powerful mechanism of functional diversification.


Subjects
Molecular Evolution , Phylogeny , Protein Sorting Signals/genetics , Symbiosis/genetics , Enterobacteriaceae/genetics , Bacterial Genome/genetics
4.
Bioinformatics ; 34(13): 2325-2326, 2018 Jul 01.
Article in English | MEDLINE | ID: mdl-29401218

ABSTRACT

Motivation: Existing sources of experimental mutation data neither consider the structural environment of amino acid substitutions nor distinguish between soluble and membrane proteins. They also suffer from a number of further limitations, including data redundancy, lack of disease classification, incompatible information content, and ambiguous annotations (e.g. the same mutation being annotated as both disease-associated and benign). Results: We have developed a novel database, MutHTP, which contains information on 183 395 disease-associated and 17 827 neutral mutations in human transmembrane proteins. For each mutation site MutHTP provides a description of its location with respect to the membrane protein topology, structural environment (if available) and functional features. Comprehensive visualization, search, display and download options are available. Availability and implementation: The database is publicly available at http://www.iitm.ac.in/bioinfo/MutHTP/. The website is implemented using HTML, PHP and JavaScript and supports recent versions of all major browsers, such as Firefox, Chrome and Opera. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Membrane Proteins/genetics , Mutation , Software , Factual Databases , Humans
5.
J Struct Biol ; 194(1): 112-23, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26851352

ABSTRACT

Accurate prediction of intra-molecular interactions from amino acid sequence is an important prerequisite for obtaining high-quality protein models. In recent years, remarkable progress in this area has been achieved through the application of novel co-variation algorithms, which eliminate transitive evolutionary connections between residues. In this work we present a new contact prediction method for α-helical transmembrane proteins, MemConP, in which evolutionary couplings are combined with a machine learning approach. MemConP achieves a substantially improved accuracy (precision: 56.0%, recall: 17.5%, MCC: 0.288) compared to the use of either machine learning or co-evolution methods alone. The method also achieves 91.4% precision, 42.1% recall and an MCC of 0.490 in predicting helix-helix interactions based on predicted contacts. The approach was trained and rigorously benchmarked by cross-validation and independent testing on up-to-date non-redundant datasets of 90 and 30 experimental three-dimensional structures, respectively. MemConP is a standalone tool that can be downloaded together with the associated training data from http://webclu.bio.wzw.tum.de/MemConP.
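
The precision, recall and MCC (Matthews correlation coefficient) figures quoted in this abstract are standard confusion-matrix statistics over residue pairs. The sketch below shows how such values could be computed for a contact prediction; it is not taken from MemConP. The function names, the 0-based residue indexing, and the toy contact sets are all assumptions made for illustration.

```python
import math
from itertools import combinations


def confusion_counts(predicted, observed, n_residues):
    """Count TP, FP, FN, TN over all residue pairs (i, j) with i < j.

    `predicted` and `observed` are sets of (i, j) tuples with i < j.
    Hypothetical helper: a real benchmark would usually also apply a
    minimum sequence separation, which is omitted here for brevity.
    """
    tp = fp = fn = tn = 0
    for pair in combinations(range(n_residues), 2):
        is_predicted = pair in predicted
        is_observed = pair in observed
        if is_predicted and is_observed:
            tp += 1
        elif is_predicted:
            fp += 1
        elif is_observed:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def contact_metrics(tp, fp, fn, tn):
    """Return (precision, recall, mcc); undefined ratios are reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


# Toy data (invented): a 5-residue chain with three predicted contacts.
toy_predicted = {(0, 3), (0, 4), (1, 4)}
toy_observed = {(0, 3), (1, 3), (1, 4), (2, 4)}
toy_counts = confusion_counts(toy_predicted, toy_observed, n_residues=5)
toy_precision, toy_recall, toy_mcc = contact_metrics(*toy_counts)
```

Because true contacts are a small minority of all residue pairs, MCC is a more informative single number than accuracy for this kind of task, which may explain why the abstract reports it alongside precision and recall.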


Subjects
Algorithms , Computational Biology/methods , Membrane Proteins/chemistry , Protein Domains , Protein Secondary Structure , Amino Acids/chemistry , Amino Acids/metabolism , Binding Sites , Internet , Membrane Proteins/metabolism , Molecular Models , Reproducibility of Results
6.
Nucleic Acids Res ; 42(Web Server issue): W337-43, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24799431

ABSTRACT

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.


Subjects
Protein Conformation , Software , Amino Acid Substitution , Binding Sites , Gene Ontology , Internet , Intrinsically Disordered Proteins/chemistry , Membrane Proteins/chemistry , Mutation , Protein Interaction Mapping , Proteins/analysis , Proteins/genetics , Proteins/metabolism , Sequence Alignment , Protein Sequence Analysis , Amino Acid Sequence Homology
7.
Nucleic Acids Res ; 41(Web Server issue): W459-64, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23729472

ABSTRACT

Regulated intramembrane proteolysis (RIP) is a critical mechanism for intercellular communication and regulates the function of membrane proteins through sequential proteolysis. RIP typically starts with ectodomain shedding of membrane proteins by extracellular membrane-bound proteases, followed by intramembrane proteolysis of the resulting membrane-tethered fragment. However, for the majority of RIP proteases the corresponding substrates, and thus their functions, remain unknown. Proteome-wide identification of RIP protease substrates is possible by mass spectrometry-based quantitative comparison of RIP substrates or their cleavage products between different biological states. However, this requires quantification of peptides from only the ectodomain or cytoplasmic domain. Current analysis software does not allow matching peptides to either domain. Here we present the QARIP (Quantitative Analysis of Regulated Intramembrane Proteolysis) web server, which matches identified peptides to the protein transmembrane topology. QARIP allows determination of quantitative ratios separately for the topological domains (cytoplasmic, ectodomain) of a given protein and is thus a powerful tool for quality control, improvement of quantitative ratios and identification of novel substrates in proteomic RIP datasets. To our knowledge, the QARIP web server is the first tool directly addressing the phenomenon of RIP. The web server is available at http://webclu.bio.wzw.tum.de/qarip/. This website is free and open to all users and there is no login requirement.
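
The core idea described in this abstract, matching identified peptides to topological domains and summarizing quantitative ratios per domain, can be illustrated with a short sketch. This is not QARIP's implementation: the function names, the 1-based `(start, end, label)` topology format, the use of the median as the per-domain summary, and the toy sequence and ratios are all assumptions introduced here.

```python
from statistics import median


def locate_peptide(protein_seq, peptide):
    """Return the 1-based (start, end) of the first exact match, or None."""
    idx = protein_seq.find(peptide)
    if idx < 0:
        return None
    return idx + 1, idx + len(peptide)


def assign_domain(span, topology):
    """Return the label of the segment that fully contains `span`.

    `topology` is a list of (start, end, label) tuples, 1-based inclusive.
    Peptides that cross a segment boundary are left unassigned (None).
    """
    start, end = span
    for seg_start, seg_end, label in topology:
        if seg_start <= start and end <= seg_end:
            return label
    return None


def domain_ratios(protein_seq, topology, peptide_ratios):
    """Group peptide ratios by topological domain and take the median of each."""
    by_domain = {}
    for peptide, ratio in peptide_ratios:
        span = locate_peptide(protein_seq, peptide)
        if span is None:
            continue  # peptide does not map to this protein
        label = assign_domain(span, topology)
        if label is None:
            continue  # peptide spans two domains; skipped in this sketch
        by_domain.setdefault(label, []).append(ratio)
    return {label: median(values) for label, values in by_domain.items()}


# Toy single-pass protein (invented sequence, topology and ratios).
toy_seq = "MKTAYIAKQR" + "LLLLLLLLLL" + "GSHEDNWPQR"
toy_topology = [
    (1, 10, "ectodomain"),
    (11, 20, "transmembrane"),
    (21, 30, "cytoplasmic"),
]
toy_peptides = [
    ("KTAYIAK", 0.5),
    ("AYIAKQR", 0.7),
    ("GSHEDNWPQR", 1.0),
    ("QRLLL", 2.0),  # crosses the ectodomain/transmembrane boundary
]
toy_result = domain_ratios(toy_seq, toy_topology, toy_peptides)
```

Reporting the ectodomain and cytoplasmic ratios separately is what lets a shedding event show up as a shift in one domain but not the other, which is the use case the abstract describes.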


Subjects
Membrane Proteins/metabolism , Software , Aspartic Acid Endopeptidases/metabolism , HEK293 Cells , Humans , Internet , Mass Spectrometry , Membrane Proteins/chemistry , Peptides/analysis , Protein Tertiary Structure , Proteolysis , Proteomics
8.
BMC Bioinformatics ; 14 Suppl 3: S7, 2013.
Article in English | MEDLINE | ID: mdl-23514582

ABSTRACT

BACKGROUND: Any method that de novo predicts protein function should do better than random. More challenging, it also ought to outperform simple homology-based inference. METHODS: Here, we describe a few methods that predict protein function exclusively through homology. Together, they set the bar or lower limit for future improvements. RESULTS AND CONCLUSIONS: During the development of these methods, we faced two surprises. Firstly, our most successful implementation for the baseline ranked very high at CAFA1. In fact, our best combination of homology-based methods fared only slightly worse than the top-of-the-line prediction method from the Jones group. Secondly, although the concept of homology-based inference is simple, this work revealed that the precise details of the implementation are crucial: not only did the methods span from top to bottom performers at CAFA, but also the reasons for these differences were unexpected. In this work, we also propose a new rigorous measure to compare predicted and experimental annotations. It puts more emphasis on the details of protein function than the other measures employed by CAFA and may best reflect the expectations of users. Clearly, the definition of proper goals remains one major objective for CAFA.


Subjects
Proteins/physiology , Amino Acid Sequence Homology , Algorithms , Proteins/genetics