Search | BVS - Ministry of Health

Granular support vector machine to identify unknown structural classes of protein.

Hassan, Rohayanti; Othman, Razib M; Shah, Zuraini A.

Int J Data Min Bioinform ; 12(4): 451-67, 2015.

Article in English | MEDLINE | ID: mdl-26510297

ABSTRACT

Classification of protein structural class from local protein structure, rather than from the whole structure, has been gaining widespread attention. Structural class is reflected in the local composition or arrangement of secondary structure, but threshold-based classification methods apply restrictive rules when assigning these classes. As a consequence, some structures are left with an unknown class. To determine these unknown structural classes, we propose a fusion algorithm, abbreviated as GSVM-SigLpsSCPred (Granular Support Vector Machine with Significant Local Protein Structure for Structural Class Prediction), which consists of two major components: an optimal local protein structure to represent the feature vector, and a granular support vector machine to predict the unknown structural classes. The results highlight the performance of GSVM-SigLpsSCPred as an alternative computational method for low-identity sequences.
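The gap left by threshold-based rules can be illustrated with a minimal sketch. The cut-off values, class names and secondary-structure strings below are illustrative assumptions, not the rules or data used by the authors; the point is only that a protein whose helix and strand content falls between the fixed rules ends up with no class.

```python
def fractions(ss):
    """Return (helix, strand) fractions of a secondary-structure string (H/E/C)."""
    n = len(ss)
    if n == 0:
        raise ValueError("empty secondary-structure string")
    return ss.count("H") / n, ss.count("E") / n


def threshold_class(ss, hi=0.40, lo=0.05, mixed=0.15):
    """Assign a structural class by fixed cut-offs (assumed values, for illustration only)."""
    helix, strand = fractions(ss)
    if helix >= hi and strand <= lo:
        return "all-alpha"
    if strand >= hi and helix <= lo:
        return "all-beta"
    if helix >= mixed and strand >= mixed:
        return "mixed"
    return "unknown"  # no rule fires: the composition falls between the thresholds


# Hypothetical secondary-structure strings (H = helix, E = strand, C = coil).
examples = {
    "helical": "HHHHHHHHCC",  # 80% helix, 0% strand
    "sheet": "EEEEEECCCC",    # 0% helix, 60% strand
    "both": "HHHEEECCCC",     # 30% helix, 30% strand
    "gap": "HHHECCCCCC",      # 30% helix, 10% strand
}
for name, ss in examples.items():
    print(name, threshold_class(ss))
```

The last example is the kind of case the abstract refers to: it is neither helix-rich enough nor strand-rich enough for any rule, so a learned classifier would be needed to place it.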

Subjects

Algorithms , Protein Databases , Proteins/classification , Proteins/genetics , Protein Sequence Analysis/methods , Support Vector Machine , Protein Secondary Structure

Utilizing shared interacting domain patterns and Gene Ontology information to improve protein-protein interaction prediction.

Roslan, Rosfuzah; Othman, Razib M; Shah, Zuraini A; Kasim, Shahreen; Asmuni, Hishammuddin; Taliba, Jumail; Hassan, Rohayanti; Zakaria, Zalmiyah.

Comput Biol Med ; 40(6): 555-64, 2010 Jun.

Article in English | MEDLINE | ID: mdl-20417930

ABSTRACT

Protein-protein interactions (PPIs) play a significant role in many crucial cellular operations such as metabolism, signaling and regulation. Computational methods for predicting PPIs have grown tremendously in recent years, but problems such as high false-positive rates have contributed to the lack of solid PPI information. We aimed to enhance the overlap between computational predictions and experimental results in an effort to partially remove falsely predicted PPIs. This study introduces a protein function predictor named PFP(), which is based on shared interacting domain patterns, to complement the Gene Ontology Annotations (GOA). We used GOA and PFP() as agents in a filtering process to reduce false-positive pairs in computationally predicted PPI datasets. The functions predicted by PFP() were extracted from cross-species PPI data in order to assign novel functional annotations to uncharacterized proteins and additional functions to proteins already characterized by the Gene Ontology (GO). Implementing PFP() increased the chances of finding a matching function annotation for the first rule of the filtration process by as much as 20%. To assess the capability of the proposed framework in filtering false PPIs, we applied it to the available S. cerevisiae PPIs and measured performance in two respects: the improvement made, indicated as Signal-to-Noise Ratio (SNR), and the strength of improvement. The proposed filtering framework achieved significantly better performance on both metrics than the predictions without it.
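The filtering idea can be sketched as follows. The abstract states only that the first filtration rule looks for a matching function annotation; the "share at least one term" rule, the function names and the toy data here are assumptions made for illustration, and the sketch does not reproduce PFP() or the SNR metric.

```python
def merge_annotations(goa, predicted):
    """Union curated and predicted function terms per protein (inputs are not modified)."""
    merged = {}
    for source in (goa, predicted):
        for protein, terms in source.items():
            merged.setdefault(protein, set()).update(terms)
    return merged


def filter_pairs(pairs, annotations):
    """Keep predicted pairs whose two proteins share at least one function term."""
    kept = []
    for a, b in pairs:
        if annotations.get(a, set()) & annotations.get(b, set()):
            kept.append((a, b))
    return kept


# Hypothetical toy data: protein identifiers and terms are made up.
goa = {"P1": {"GO:A"}, "P2": {"GO:A", "GO:B"}, "P3": {"GO:C"}}
predicted = {"P4": {"GO:C"}}  # P4 has no curated annotation
pairs = [("P1", "P2"), ("P1", "P3"), ("P3", "P4")]

print(filter_pairs(pairs, goa))                                # curated terms only
print(filter_pairs(pairs, merge_annotations(goa, predicted)))  # curated + predicted
```

With curated terms alone, the pair involving the unannotated protein cannot be matched; adding predicted terms gives it a chance to pass the rule, which is the role the abstract assigns to the predictor.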

Subjects

Computational Biology/methods , Statistical Models , Protein Interaction Domains and Motifs , Protein Interaction Mapping/methods , Proteins/chemistry , Proteins/physiology , Algorithms , Animals , Caenorhabditis elegans Proteins , Cluster Analysis , Genetic Databases , Drosophila Proteins , Humans , Saccharomyces cerevisiae Proteins , Terminology as Topic

SPlitSSI-SVM: an algorithm to reduce the misleading and increase the strength of domain signal.

Kalsum, Hassan U; Shah, Zuraini A; Othman, Razib M; Hassan, Rohayanti; Rahim, Shafry M; Asmuni, Hishammudin; Taliba, Jumail; Zakaria, Zalmiyah.

Comput Biol Med ; 39(11): 1013-9, 2009 Nov.

Article in English | MEDLINE | ID: mdl-19720371

ABSTRACT

Protein domains carry information useful for predicting protein structure, function, evolution and design, since a protein sequence may contain several domains, which may be distinct or repeated copies of the same domain. In this study, we propose an algorithm named SplitSSI-SVM that works in the following steps. First, training and testing datasets are generated to test SplitSSI-SVM. Second, the protein sequence is split into subsequences based on ordered and disordered regions. Protein sequences longer than 600 residues are split into subsequences to investigate the effectiveness of subsequence-based protein domain prediction. Third, multiple sequence alignment is performed to predict the secondary structure using bidirectional recurrent neural networks (BRNN), which consider the interactions between amino acids. The information about protein secondary structure is used to strengthen the protein domain boundary signal. Lastly, support vector machines (SVM) are used to classify the protein domain as single-domain, two-domain or multiple-domain. SplitSSI-SVM is designed to reduce the misleading signal and the weak protein domain signal that arise from the primary structure of the protein sequence, and to provide accurate classification of the protein domain. The performance of SplitSSI-SVM is evaluated using sensitivity and specificity on the single-domain, two-domain and multiple-domain classes. The evaluation shows that SplitSSI-SVM achieved better results than other protein domain predictors such as DOMpro, GlobPlot, Dompred-DPS, Mateo, Biozon, Armadillo, KemaDom, SBASE, HMMPfam and HMMSMART, especially on two-domain and multiple-domain proteins.
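The evaluation step can be illustrated with a short sketch of per-class sensitivity and specificity, computed one-vs-rest over the three domain classes. The function name, label strings and toy predictions are assumptions for illustration; the abstract does not say how the authors aggregated these measures.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive):
    """One-vs-rest sensitivity and specificity for the class `positive`."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fn = tn = fp = 0
    for t, p in zip(true_labels, predicted_labels):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical labels for six proteins (not data from the study).
true = ["single", "single", "two", "two", "multi", "multi"]
pred = ["single", "two", "two", "two", "multi", "single"]

for cls in ("single", "two", "multi"):
    sens, spec = sensitivity_specificity(true, pred, cls)
    print(f"{cls}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

Reporting both measures per class shows whether a predictor trades missed multi-domain proteins for false alarms, which a single accuracy figure would hide.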

Subjects

Algorithms , Theoretical Models , Sequence Alignment
