1.
J Theor Biol ; 437: 239-250, 2018 01 21.
Article in English | MEDLINE | ID: mdl-29100918

ABSTRACT

Predicting protein subcellular location with support vector machines (SVMs) has recently become a popular research area because of the rapid growth of biological information. Although substantial progress has been made, few researchers have addressed the problem of data imbalance before classification, which leads to low accuracy for some categories. In this work, we combine an oversampling method with SVM to handle protein subcellular localization on unbalanced data sets. To capture the valuable information in a protein, a PseAAC (Pseudo Amino Acid Composition) is extracted from the PSSM (Position-Specific Scoring Matrix) as a feature vector, and features are then selected by principal component analysis (PCA). Next, the samples, oversampled to eliminate the imbalance in sample numbers between classes, are fed into a support vector machine to predict the protein subcellular location. To evaluate the performance of the proposed method, Jackknife tests are performed on three benchmark datasets (ZD98, CL317 and ZW225). Results of SVM experiments with and without oversampling, obtained by Jackknife tests, show that oversampling successfully reduces the imbalance of the data sets, and the prediction accuracy of each class in each dataset is higher than 88.9%. Compared with other protein subcellular localization methods, the method in this work achieves the best performance. The overall accuracies on ZD98, CL317 and ZW225 are 93.2%, 96.00% and 92.15% respectively, which are 2.4%, 8.0% and 8.2% higher than the best methods in the comparison. The high overall accuracy of the proposed method indicates that the feature representation and selection capture useful information from the protein sequence, and that oversampling successfully addresses the imbalance of sample numbers in SVM classification.
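
The evaluation loop described in this abstract (balance the classes by oversampling, classify, score each class with a Jackknife test) can be sketched roughly as follows. This is a minimal standard-library illustration, not the authors' implementation: the function names are hypothetical, a nearest-centroid rule stands in for the SVM, the PseAAC/PSSM feature extraction and PCA steps are omitted, and resampling inside each Jackknife fold is an assumption the abstract does not confirm.

```python
import random
from collections import Counter, defaultdict


def random_oversample(X, y, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    target = max(len(rows) for rows in by_class.values())
    X_bal, y_bal = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_bal.append(features)
            y_bal.append(label)
    return X_bal, y_bal


def nearest_centroid_predict(X_train, y_train, x):
    """Stand-in classifier (NOT an SVM): assign x to the class whose
    mean feature vector is closest in squared Euclidean distance."""
    counts = Counter(y_train)
    sums = {}
    for features, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value

    def dist2(label):
        return sum((s / counts[label] - v) ** 2
                   for s, v in zip(sums[label], x))

    return min(sums, key=dist2)


def jackknife_per_class_accuracy(X, y):
    """Leave-one-out (Jackknife) test: hold out each sample in turn,
    oversample the remaining samples, and record per-class accuracy."""
    correct = Counter()
    total = Counter(y)
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        # Assumption: resample the training fold only, so the held-out
        # sample is never duplicated into its own training set.
        X_bal, y_bal = random_oversample(X_train, y_train, seed=i)
        if nearest_centroid_predict(X_bal, y_bal, X[i]) == y[i]:
            correct[y[i]] += 1
    return {label: correct[label] / total[label] for label in total}
```

Reporting accuracy per class rather than only overall is what makes the effect of imbalance visible, since a classifier can score well overall while failing on a rare location class.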


Subjects
Computational Biology/methods , Protein Databases , Proteins/metabolism , Support Vector Machine , Algorithms , Amino Acids/chemistry , Amino Acids/metabolism , Position-Specific Scoring Matrices , Principal Component Analysis , Protein Transport , Proteins/chemistry
2.
Mol Biol Rep ; 45(6): 2295-2306, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30238411

ABSTRACT

Membrane proteins (MPs) are considered crucial for many biological functions. For this reason, they are attractive targets for many pharmaceutical agents. It is therefore important that MPs be predicted accurately using effective and efficient computational models (CMs). Annotating MPs with in vitro analytical techniques is time-consuming and expensive, and in some cases it can prove intractable. Consequently, automated prediction and annotation of MPs through CM-based techniques has proven useful. An MP prediction framework is proposed, based on computational intelligence and a feature set built from statistical moments. Furthermore, the previously used dataset has been enhanced by incorporating new MPs from the latest release of UniProtKB. Rigorous experimentation shows that combining statistical moments with a multilayer neural network trained by back-propagation yields more thorough prediction results.
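
As a rough illustration of what a feature set based on statistical moments might look like, the sketch below turns a protein sequence into a short vector of raw and central moments. The abstract does not specify the residue encoding or which moments are used, so the integer encoding, the moment orders, and all function names here are assumptions made for illustration only; the neural network that would consume these features is not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical encoding: each standard residue maps to an integer 1..20.
INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def raw_moment(values, k):
    """k-th raw moment: the mean of v**k over the sequence."""
    return sum(v ** k for v in values) / len(values)


def central_moment(values, k):
    """k-th central moment: the mean of (v - mean)**k over the sequence."""
    mean = raw_moment(values, 1)
    return sum((v - mean) ** k for v in values) / len(values)


def moment_features(sequence, max_order=3):
    """Return raw moments of order 1..max_order followed by central
    moments of order 2..max_order for the encoded sequence."""
    values = [INDEX[aa] for aa in sequence if aa in INDEX]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    raws = [raw_moment(values, k) for k in range(1, max_order + 1)]
    centrals = [central_moment(values, k) for k in range(2, max_order + 1)]
    return raws + centrals
```

A fixed-length vector like this is convenient because sequences of any length map to the same number of inputs for a classifier.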


Subjects
Membrane Proteins/chemistry , Membrane Proteins/metabolism , Protein Sequence Analysis/methods , Algorithms , Amino Acids , Animals , Computational Biology/methods , Computer Simulation , Protein Databases , Humans , Membrane Proteins/physiology