Feature extraction by statistical contact potentials and wavelet transform for predicting subcellular localizations in gram negative bacterial proteins.
J Theor Biol
; 364: 121-30, 2015 Jan 07.
Article
en En
| MEDLINE
| ID: mdl-25219623
Predicting the localization of a protein has become a useful practice for inferring its function. Most of the reported methods to predict subcellular localizations in Gram-negative bacterial proteins make use of standard protein representations that generally do not take into account the distribution of the amino acids and the structural information of the proteins. Here, we propose a protein representation based on the structural information contained in the pairwise statistical contact potentials. The wavelet transform decodes the information contained in the primary structure of the proteins, allowing the identification of patterns along the proteins, which are used to characterize the subcellular localizations. Then, a support vector machine classifier is trained to categorize them. Cellular compartments like periplasm and extracellular medium are difficult to predict, having a high false negative rate. The wavelet-based method achieves an overall high performance while maintaining a low false negative rate, particularly, on "periplasm" and "extracellular medium". Our results suggest the proposed protein characterization is a useful alternative to representing and predicting protein sequences over the classical and cutting edge protein depictions.
Palabras clave
Texto completo:
1
Colección:
01-internacional
Base de datos:
MEDLINE
Asunto principal:
Proteínas Bacterianas
/
Algoritmos
/
Estadística como Asunto
/
Análisis de Ondículas
/
Bacterias Gramnegativas
Tipo de estudio:
Prognostic_studies
/
Risk_factors_studies
Idioma:
En
Revista:
J Theor Biol
Año:
2015
Tipo del documento:
Article
Pais de publicación:
Reino Unido