Predicting S-nitrosylation proteins and sites by fusing multiple features.
Qiu, Wang-Ren; Wang, Qian-Kun; Guan, Meng-Yue; Jia, Jian-Hua; Xiao, Xuan.
Affiliation
  • Qiu WR; School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China.
  • Wang QK; School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China.
  • Guan MY; School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China.
  • Jia JH; School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China.
  • Xiao X; School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China.
Math Biosci Eng; 18(6): 9132-9147, 2021-10-25.
Article in English | MEDLINE | ID: mdl-34814339
ABSTRACT
Protein S-nitrosylation is one of the most important post-translational modifications (PTMs), and a well-grounded understanding of it is valuable because it plays a key role in a variety of biological processes. For an uncharacterized protein sequence, it is meaningful for both basic research and drug development to first identify whether it is an S-nitrosylation protein and then predict the specific S-nitrosylation site(s). This work proposes two models, one for identifying S-nitrosylation proteins and one for identifying their PTM sites. First, three kinds of features are extracted from the protein sequence: KNN scoring of functional domain annotation, PseAAC, and bag-of-words based on the physical and chemical properties of amino acids. Second, the synthetic minority oversampling technique is used to balance the data sets, and several state-of-the-art classifiers and feature fusion strategies are applied to the balanced data sets. In five-fold cross-validation for predicting S-nitrosylation proteins, the accuracy (ACC), Matthews correlation coefficient (MCC), and area under the ROC curve (AUC) are 81.84%, 0.5178, and 0.8635, respectively. Finally, a model for predicting S-nitrosylation sites is constructed on the basis of tripeptide composition (TPC) and the composition of k-spaced amino acid pairs (CKSAAP). To eliminate redundant information and improve efficiency, elastic nets are employed for feature selection. Five-fold cross-validation tests indicate promising success rates for the proposed model. For the convenience of researchers, a web server named "RF-SNOPS" has been established at http://www.jci-bioinfo.cn/RF-SNOPS.
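As an illustration of the two site-level descriptors named in the abstract, the following is a minimal Python sketch of CKSAAP and TPC feature extraction. It is not the authors' implementation: the abstract does not give the peptide window length, the range of spacings k, or the normalization, so the function names `cksaap` and `tpc`, the default `k_max=2`, and the choice to normalize by the number of valid pairs or tripeptides are all assumptions made for the example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def cksaap(seq, k_max=2):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    Returns 400 frequencies per spacing k. k_max=2 is an assumed
    default, not a value taken from the paper.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        # Pair residue i with the residue k positions further on.
        for i in range(len(seq) - k - 1):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:  # skip pairs with non-standard residues
                counts[pair] += 1
                total += 1
        # Assumption: normalize by the number of valid pairs counted.
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


def tpc(seq):
    """Tripeptide composition: frequency of each of the 8000 tripeptides."""
    tripeptides = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]
    counts = dict.fromkeys(tripeptides, 0)
    total = 0
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:  # skip tripeptides with non-standard residues
            counts[tri] += 1
            total += 1
    return [counts[t] / total if total else 0.0 for t in tripeptides]
```

Concatenating the two vectors for a peptide centered on a cysteine gives a high-dimensional, sparse representation (8000 TPC values plus 400 per spacing), which is consistent with the abstract's use of a feature selection step before classification.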

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Proteins / Amino Acids Type of study: Prognostic_studies / Risk_factors_studies Language: En Year of publication: 2021 Document type: Article
