Search | BVS - MINISTÉRIO DA SAÚDE

Clustering of multi-domain protein sequences.

Mehrotra, Prachi; Ami, Vimla Kany G; Srinivasan, Narayanaswamy.

Proteins ; 86(7): 759-776, 2018 Jul.

Article in English | MEDLINE | ID: mdl-29675880

ABSTRACT

The overall function of a multi-domain protein is determined by the functional and structural interplay of its constituent domains. Traditional sequence alignment-based methods commonly use domain-level information and provide classification only at the level of domains. Such methods cannot take into account the contributions of the other domains in a protein or of the domain-linker regions, and so cannot classify multi-domain proteins as a whole. An alignment-free protein sequence comparison tool, CLAP (CLAssification of Proteins), was previously developed in our laboratory specifically to handle multi-domain protein sequences without requiring domain boundaries or the sequential order of domains to be defined. Through this method we aim to achieve a biologically meaningful classification scheme for multi-domain protein sequences. In this article, CLAP-based classification is explored on 5 datasets of multi-domain proteins, and we present a detailed analysis for proteins containing (1) a tyrosine phosphatase domain and (2) an SH3 domain. At the domain level, the CLAP-based classification scheme resulted in a clustering similar to that obtained from an alignment-based method. CLAP-based clusters obtained for full-length datasets were shown to comprise proteins with similar functions and domain architectures. Our study demonstrates that multi-domain proteins can be classified effectively by considering full-length sequences, without requiring identification of domains in the sequence.

Subjects

Protein Tyrosine Phosphatases/chemistry; src Homology Domains; Cluster Analysis; Protein Conformation; Sequence Analysis, Protein

Comparison of Leptospira interrogans and Leptospira biflexa genomes: analysis of potential leptospiral-host interactions.

Mehrotra, Prachi; Ramakrishnan, Gayatri; Dhandapani, Gunasekaran; Srinivasan, Narayanaswamy; Madanan, Madathiparambil G.

Mol Biosyst ; 13(5): 883-891, 2017 May 02.

Article in English | MEDLINE | ID: mdl-28294222

ABSTRACT

Leptospirosis, a potentially life-threatening disease caused by pathogenic species of Leptospira, remains the most widespread zoonosis. The pathogenic spirochaete Leptospira interrogans is characterized by its ability to permeate human host tissues rapidly and colonize multiple organs in the host. Despite efforts to understand the pathophysiology of the pathogen and the heterogeneity posed by L. interrogans, current knowledge of the mechanism of pathogenesis is modest. To help address this gap, we demonstrate the use of an established structure-based protocol, coupled with information on the subcellular localization of proteins and their tissue specificity, in recognizing a set of 49 biologically feasible interactions potentially mediated by proteins of L. interrogans in humans. We also present means to assess the physicochemical viability of the predicted host-pathogen interactions, for selected cases, in terms of interaction energies and geometric shape complementarity of the interacting proteins. Comparative analyses of proteins of L. interrogans and the saprophytic spirochaete Leptospira biflexa, and of their predicted involvement in interactions with human hosts, helped establish the functional relevance of leptospiral-host protein-protein interactions specific to L. interrogans as well as those specific to L. biflexa. Our study presents characteristics of the pathogenic L. interrogans that are predicted to facilitate its ability to persist in human hosts.

Subjects

Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Leptospira/physiology; Leptospirosis/metabolism; Computational Biology/methods; Genome, Bacterial; Host-Pathogen Interactions; Humans; Leptospira/classification; Leptospira/genetics; Leptospira/metabolism; Leptospirosis/microbiology; Models, Molecular; Organ Specificity; Protein Binding; Protein Conformation; Protein Interaction Mapping

CLAP: a web-server for automatic classification of proteins with special reference to multi-domain proteins.

Gnanavel, Mutharasu; Mehrotra, Prachi; Rakshambikai, Ramaswamy; Martin, Juliette; Srinivasan, Narayanaswamy; Bhaskara, Ramachandra M.

BMC Bioinformatics ; 15: 343, 2014 Oct 04.

Article in English | MEDLINE | ID: mdl-25282152

ABSTRACT

BACKGROUND: The function of a protein can be deciphered with higher accuracy from its structure than from its amino acid sequence. Because of the huge gap between the available protein sequence space and structural space, tools that can generate functionally homogeneous clusters using only sequence information are of great importance. For this purpose, traditional alignment-based tools work well in most cases, and clustering is performed on the basis of sequence similarity. In the case of multi-domain proteins, however, alignment quality may be poor because of varied protein lengths, domain shuffling or circular permutations. Multi-domain proteins are ubiquitous in nature, hence alignment-free tools, which overcome the shortcomings of alignment-based protein comparison methods, are required. Further, existing tools classify proteins using only domain-level information and hence miss the information encoded in the tethered regions or accessory domains. Our method, on the other hand, takes into account the full-length sequence of a protein, consolidating the complete sequence information to understand a given protein better.
RESULTS: Our web-server, CLAP (Classification of Proteins), is one such alignment-free tool for automatic classification of protein sequences. It uses a pattern-matching algorithm that assigns local matching scores (LMS) to residues that are part of the matched patterns between the two sequences being compared. CLAP works on full-length sequences and does not require prior domain definitions. Pilot studies undertaken previously on protein kinases and immunoglobulins have shown that CLAP yields clusters that have high functional and domain architectural similarity. Moreover, parsing at a statistically determined cut-off resulted in clusters that agreed with the sub-family level classification of that particular domain family.
CONCLUSIONS: CLAP is a useful protein-clustering tool, independent of domain assignment, domain order, sequence length and domain diversity. Our method can be used for any set of protein sequences, yielding functionally relevant clusters with high domain architectural homogeneity. The CLAP web server is freely available for academic use at http://nslab.mbu.iisc.ernet.in/clap/.

Subjects

Computational Biology/methods; Internet; Proteins/chemistry; Proteins/classification; Software; Algorithms; Amino Acid Sequence; Automation; Cluster Analysis; Humans; Protein Structure, Tertiary

The relationship between classification of multi-domain proteins using an alignment-free approach and their functions: a case study with immunoglobulins.

Bhaskara, Ramachandra M; Mehrotra, Prachi; Rakshambikai, Ramaswamy; Gnanavel, Mutharasu; Martin, Juliette; Srinivasan, Narayanaswamy.

Mol Biosyst ; 10(5): 1082-93, 2014 May.

Article in English | MEDLINE | ID: mdl-24572770

ABSTRACT

Establishing functional relationships between multi-domain protein sequences is a non-trivial task. Traditionally, delineating functional assignments and relationships of proteins requires domain assignments as a prerequisite. This process is sensitive to alignment quality and domain definitions. In multi-domain proteins, alignment quality is often poor for multiple reasons. We report the correspondence between the classification of proteins represented as full-length gene products and their functions. Our approach differs fundamentally from traditional methods in not performing the classification at the level of domains. Our method is based on an alignment-free computation of local matching scores (LMS) at the amino-acid sequence level, followed by hierarchical clustering. As there are no gold standards for full-length protein sequence classification, we resorted to Gene Ontology and domain-architecture based similarity measures to assess our classification. The final clusters obtained using LMS show high functional and domain architectural similarities. Comparison of the current method with alignment-based approaches at both the domain and full-length protein levels showed the superiority of LMS. Using this method we have recreated objective relationships among different protein kinase sub-families and have also classified immunoglobulin-containing proteins, for which sub-family definitions do not currently exist. This method can be applied to any set of protein sequences and hence will be instrumental in the analysis of large numbers of full-length protein sequences.

Subjects

Immunoglobulins/chemistry; Sequence Alignment/methods; Animals; Cluster Analysis; Databases, Protein; Humans; Protein Kinases/chemistry; Protein Structure, Tertiary
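
The abstract above describes a two-step workflow: an alignment-free pairwise score computed on full-length sequences, followed by hierarchical clustering. The sketch below illustrates that general idea only. It does not reproduce the LMS computation or CLAP itself, whose details are not given in this record. As a stand-in score it uses the Jaccard similarity of 3-residue k-mer sets, and it merges clusters by average linkage until no pair reaches a cutoff. The function names (`kmer_set`, `similarity`, `cluster`), the k-mer length, the cutoff of 0.3, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def kmer_set(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of k-mer sets (a stand-in score, NOT the LMS score)."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def cluster(seqs, cutoff=0.3, k=3):
    """Average-linkage agglomerative clustering of a {name: sequence} dict.

    Merging stops once the best average similarity between any two
    clusters falls below `cutoff` (an assumed, arbitrary threshold).
    """
    names = list(seqs)
    # Precompute all pairwise similarities once.
    sim = {
        frozenset((x, y)): similarity(seqs[x], seqs[y], k)
        for x, y in combinations(names, 2)
    }

    def avg(c1, c2):
        total = sum(sim[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [[n] for n in names]
    while len(clusters) > 1:
        # Find the pair of clusters with the highest average similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg(clusters[p[0]], clusters[p[1]]),
        )
        if avg(clusters[i], clusters[j]) < cutoff:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy, made-up sequences: p1 and p2 carry the same two segments in
# opposite order; p3 and p4 are unrelated to them but similar to each other.
example = {
    "p1": "MKTAYIAKQR" + "GSHWLLPEDV",
    "p2": "GSHWLLPEDV" + "MKTAYIAKQR",
    "p3": "CCNNQQTTYYFF",
    "p4": "CCNNQQTTYYFW",
}
print(cluster(example))
```

Because the score compares unordered k-mer sets, swapping the order of the two segments in `p1` and `p2` changes only the k-mers at the junction, so the two sequences still group together. This is the property of alignment-free comparison that the abstract highlights for proteins with shuffled domain order.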
