Comprehensive Study on Enhancing Low-Quality Position-Specific Scoring Matrix with Deep Learning for Accurate Protein Structure Property Prediction: Using Bagging Multiple Sequence Alignment Learning.
Guo, Yuzhi; Wu, Jiaxiang; Ma, Hehuan; Wang, Sheng; Huang, Junzhou.
Affiliations
  • Guo Y; Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas, USA.
  • Wu J; Tencent AI Lab, Shenzhen, China.
  • Ma H; Tencent AI Lab, Shenzhen, China.
  • Wang S; Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas, USA.
  • Huang J; Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas, USA.
J Comput Biol; 28(4): 346-361, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33617347
ABSTRACT
Accurate predictions of protein structure properties, such as secondary structure and solvent accessibility, are essential for analyzing the structure and function of a protein. Position-specific scoring matrix (PSSM) features are widely used in structure property prediction. However, some proteins have low-quality PSSM features because too few homologous sequences are available, which limits prediction accuracy. To address this limitation, we propose an enhancing scheme for PSSM features. We introduce the "Bagging MSA" (multiple sequence alignment) method to calculate the PSSM features used to train our model, adopt a convolutional network to capture local context features and a bidirectional long short-term memory network to capture long-term dependencies, and integrate them under an unsupervised framework. Structure property prediction models are then built upon the enhanced PSSM features for more accurate predictions. Moreover, we develop two frameworks to evaluate the effectiveness of the enhanced PSSM features, which also bring the proposed method into real-world scenarios. Empirical evaluation on the CB513, CASP11, and CASP12 data sets indicates that our unsupervised enhancing scheme generates more informative PSSM features for structure property prediction.
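The abstract does not spell out how "Bagging MSA" works, so the following is only a minimal sketch of one plausible reading: randomly subsample the homologous sequences of an alignment and compute a PSSM from the reduced alignment, which imitates a protein with few homologs. The function names `pssm_from_msa` and `bagging_msa`, the uniform background frequency, the pseudocount, and the toy alignment are all assumptions made for illustration, not the authors' implementation. The scoring is a simplified log-odds calculation rather than the PSI-BLAST procedure normally used to produce PSSMs.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_from_msa(msa, pseudocount=1.0):
    """Return a simplified PSSM: one list of 20 log-odds scores per column."""
    if not msa:
        raise ValueError("MSA must contain at least one sequence")
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for col in range(length):
        # Start every residue at the pseudocount so no frequency is zero.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in msa:
            residue = seq[col]
            if residue in counts:  # gaps and unknown symbols are skipped
                counts[residue] += 1.0
        total = sum(counts.values())
        pssm.append(
            [math.log2(counts[aa] / total / background) for aa in AMINO_ACIDS]
        )
    return pssm


def bagging_msa(msa, n_keep, rng):
    """Keep the query (first row) plus a random subset of the other rows."""
    query, homologs = msa[0], msa[1:]
    k = min(max(n_keep - 1, 0), len(homologs))
    return [query] + rng.sample(homologs, k)


# Toy alignment (hypothetical data); the first row is the query sequence.
msa = ["MKV", "MKI", "MRV", "M-V"]
rng = random.Random(0)

high_quality = pssm_from_msa(msa)                   # full alignment
sampled = bagging_msa(msa, 2, rng)                  # query + 1 homolog
low_quality = pssm_from_msa(sampled)                # degraded PSSM
```

Under this reading, pairs of `low_quality` and `high_quality` matrices for the same query could serve as input and target for an enhancement model, with no structure labels required. How the paper actually builds its training pairs should be checked against the full text.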

Full text: 1 | Database: MEDLINE | Main subject: Protein Conformation / Proteins / Computational Biology / Deep Learning | Study type: Prognostic_studies / Risk_factors_studies | Language: En | Publication year: 2021 | Document type: Article
