AlphaFold 2-based stacking model for protein solubility prediction and its transferability on seed storage proteins.
Kwon, Hyukjin; Du, Zhenjiao; Li, Yonghui.
Affiliation
  • Kwon H; Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA.
  • Du Z; Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA.
  • Li Y; Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA. Electronic address: yonghui@ksu.edu.
Int J Biol Macromol ; 278(Pt 1): 134601, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39137857
ABSTRACT
Accurate protein solubility prediction is crucial for screening suitable candidates for food applications. Existing models often rely only on sequences, overlooking important structural details. In this study, a regression model for protein solubility was developed using both the sequences and the predicted structures of 2983 E. coli proteins. Sequence-level and structure-level properties of the proteins were extracted bioinformatically and fed into a multilayer perceptron (MLP). In addition, residue-level features and contact maps were used to construct a graph convolutional network (GCN). The out-of-fold predictions of the two models were combined and fed into multiple meta-regressors to create a stacking model. The stacking model with a support vector regressor (SVR) achieved R² of 0.502 and 0.468 on the test and external validation datasets, respectively, outperforming existing regression models. Based on its improved performance compared to its base models, the stacking model effectively captured the strengths of its base models as well as the significance of the different features used. Furthermore, the model's transferability was indirectly validated on a dataset of seed storage proteins using the Osborne definition and in a case study using molecular dynamics simulation, showing potential for application beyond microbial proteins to food- and agriculture-related ones.
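The stacking step described in the abstract (out-of-fold predictions from two base models fed into a meta-regressor) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the data are synthetic, two one-feature linear fits stand in for the MLP and the GCN, an ordinary least-squares combiner stands in for the SVR, and all function names (`fit_line`, `out_of_fold`, `fit_meta`, `r2`) are hypothetical.

```python
import random
from statistics import mean

def fit_line(x, y):
    """Least-squares fit y ~ slope*x + intercept (stand-in for a base model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return lambda v: slope * v + intercept

def out_of_fold(x, y, k=5):
    """Predict every sample with a model trained on the other k-1 folds."""
    oof = [0.0] * len(x)
    for fold in range(k):
        train = [i for i in range(len(x)) if i % k != fold]
        model = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in range(fold, len(x), k):  # indices belonging to this fold
            oof[i] = model(x[i])
    return oof

def fit_meta(p1, p2, y):
    """Linear meta-regressor on two prediction columns (stand-in for SVR)."""
    m1, m2, my = mean(p1), mean(p2), mean(y)
    s11 = sum((a - m1) ** 2 for a in p1)
    s22 = sum((b - m2) ** 2 for b in p2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(p1, p2))
    s1y = sum((a - m1) * (t - my) for a, t in zip(p1, y))
    s2y = sum((b - m2) * (t - my) for b, t in zip(p2, y))
    det = s11 * s22 - s12 ** 2  # assumed non-zero (columns not collinear)
    w1 = (s1y * s22 - s2y * s12) / det
    w2 = (s2y * s11 - s1y * s12) / det
    bias = my - w1 * m1 - w2 * m2
    return lambda a, b: w1 * a + w2 * b + bias

def r2(y_true, y_pred):
    """Coefficient of determination."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Synthetic data: each base model sees only one of the two features.
random.seed(0)
f1 = [random.random() for _ in range(300)]
f2 = [random.random() for _ in range(300)]
y = [0.6 * a + 0.4 * b + random.gauss(0, 0.05) for a, b in zip(f1, f2)]

oof_a = out_of_fold(f1, y)          # base model A, out-of-fold predictions
oof_b = out_of_fold(f2, y)          # base model B, out-of-fold predictions
meta = fit_meta(oof_a, oof_b, y)    # meta-regressor trained on OOF columns
stacked = [meta(a, b) for a, b in zip(oof_a, oof_b)]

r2_a, r2_b, r2_stack = r2(y, oof_a), r2(y, oof_b), r2(y, stacked)
print(f"base A: {r2_a:.3f}  base B: {r2_b:.3f}  stacked: {r2_stack:.3f}")
```

Training the meta-regressor on out-of-fold predictions, rather than on predictions the base models made for their own training samples, keeps it from learning to trust overfit base outputs. In a full pipeline, the base models would typically be refit on the whole training set before scoring unseen proteins, and the reported R² would come from held-out test and external validation sets rather than from the training-set fit shown here.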

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Solubility / Seed Storage Proteins Language: English Year of publication: 2024 Document type: Article