Prediction of lipases types by different scale pseudo-amino acid composition / 生物工程学报
Chinese Journal of Biotechnology
; (12): 1968-1974, 2008.
Article in Zh | WPRIM | ID: wpr-302883
Responsible library: WPRO
ABSTRACT
Lipases are widely used enzymes in biotechnology. Although they catalyze the same reaction, their sequences vary. A fast and reliable method is therefore highly desirable for identifying the type of a lipase from its sequence, or even just for confirming whether a protein is a lipase at all. Two scale-based pseudo amino acid composition approaches were proposed to extract sequence features, and a predictor based on the k-nearest neighbor algorithm was built on them to address both problems. The overall success rates obtained in 10-fold cross-validation tests were as follows: for discriminating lipases from non-lipases, 92.8%, 91.4% and 91.3%; for predicting lipase types, 92.3%, 90.3% and 89.7%. Among the approaches, the Z-scale-based pseudo amino acid composition performed best and the T-scale-based one second best. They significantly outperformed 6 other frequently used sequence feature extraction methods. The high success rates obtained on such a stringent dataset indicate that predicting lipase types is feasible, and that pseudo amino acid composition based on different scales may be a useful tool for extracting features from protein sequences, or can at least play a complementary role to many existing approaches.
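The pipeline described in the abstract (scale-based pseudo amino acid composition features, a k-nearest neighbor classifier, 10-fold cross-validation) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The abstract does not give the Z or T scale values, so the Kyte-Doolittle hydropathy scale is swapped in as a single stand-in scale. The function names, the correlation depth `lam`, the weight `w`, and the neighbor count `k` are all hypothetical choices, and the feature construction follows the commonly used general form of pseudo amino acid composition (20 composition terms plus `lam` sequence-order correlation terms).

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Stand-in scale (assumption): Kyte-Doolittle hydropathy values.
# The paper's Z and T scales are not listed in the abstract.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def standardize(scale):
    """Rescale a property scale to zero mean and unit variance over the 20 residues."""
    mean = sum(scale.values()) / len(scale)
    sd = math.sqrt(sum((v - mean) ** 2 for v in scale.values()) / len(scale))
    return {aa: (v - mean) / sd for aa, v in scale.items()}


def pseaac(seq, scale, lam=3, w=0.05):
    """Return 20 + lam features for seq.

    Assumes len(seq) > lam and that seq contains only the 20 standard residues.
    """
    z = standardize(scale)
    counts = Counter(seq)
    freqs = [counts[aa] / len(seq) for aa in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared property difference between residues k positions apart.
        pairs = zip(seq, seq[k:])
        thetas.append(sum((z[a] - z[b]) ** 2 for a, b in pairs) / (len(seq) - k))
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def knn_predict(train_x, train_y, x, k=1):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def cross_validate(xs, ys, folds=10, k=1):
    """Fraction of samples predicted correctly under a simple interleaved k-fold split."""
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(xs), folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        correct += sum(knn_predict(train_x, train_y, xs[i], k) == ys[i]
                       for i in test_idx)
    return correct / len(xs)
```

In use, each labelled sequence would be mapped through `pseaac` and the resulting vectors passed to `cross_validate` together with their labels (lipase versus non-lipase, or the lipase type). A multi-component scale such as the Z scales would presumably contribute one set of correlation terms per component, which this single-scale sketch does not attempt.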
Index: WPRIM
Main subject: Chemistry / Classification / Computational Biology / Sequence Analysis, Protein / Amino Acids / Lipase / Methods / Models, Chemical
Type of study: Prognostic_studies
Language: Zh
Journal: Chinese Journal of Biotechnology
Year: 2008
Type: Article