Learning Proteome Domain Folding Using LSTMs in an Empirical Kernel Space.
Kuang, Da; Issakova, Dina; Kim, Junhyong.
Affiliation
  • Kuang D; Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA. Electronic address: kuangda@seas.upenn.edu.
  • Issakova D; Department of Biology, University of Pennsylvania, Philadelphia, USA. Electronic address: dissakov@sas.upenn.edu.
  • Kim J; Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA; Department of Biology, University of Pennsylvania, Philadelphia, USA. Electronic address: junhyong@sas.upenn.edu.
J Mol Biol; 434(15): 167686, 2022 Aug 15.
Article in En | MEDLINE | ID: mdl-35716781
ABSTRACT
Recognizing protein structural folds is the starting point for inferring protein function and for many structure prediction tools. We previously introduced PESS (Protein Empirical Structure Space), a data-augmented feature space built from empirical comparisons, as a novel approach to protein structure prediction. Here, we extend that approach by generating the PESS feature space over fixed-length subsequences of query peptides and applying a sequential neural network model consisting of one long short-term memory (LSTM) cell layer followed by a fully connected layer. Using this approach, we show that only a small set of domains is needed as a training set to achieve near state-of-the-art accuracy on fold recognition. Our method improves on the previous approach by reducing the training set required and improving the model's ability to generalize across species, which will help fold prediction for newly discovered proteins.
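The pipeline described in the abstract (fixed-length subsequences, one feature vector per subsequence, a single LSTM layer, then a fully connected layer) can be illustrated with a minimal standard-library sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the window length and stride, the hidden size, the number of fold classes, the random untrained weights, and above all `toy_features`, a hypothetical stand-in that is not the PESS feature mapping.

```python
import math
import random

def windows(seq, length, stride):
    """Split a sequence into fixed-length, possibly overlapping subsequences."""
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, stride)]

def toy_features(sub):
    """Hypothetical placeholder features (NOT PESS): residue-class fractions."""
    groups = ("AVILMFWY", "DEKRH", "GP")  # hydrophobic, charged, Gly/Pro
    return [sum(ch in g for ch in sub) / len(sub) for g in groups]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class TinyLSTMClassifier:
    """One LSTM layer followed by one fully connected (softmax) layer."""

    def __init__(self, n_in, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to the concatenation [x_t, h_{t-1}].
        # Gates: i = input, f = forget, o = output, g = candidate cell state.
        self.W = {k: rand_matrix(n_hidden, n_in + n_hidden, rng) for k in "ifog"}
        self.b = {k: [0.0] * n_hidden for k in "ifog"}
        self.W_out = rand_matrix(n_classes, n_hidden, rng)
        self.b_out = [0.0] * n_classes

    def forward(self, features):
        h = [0.0] * self.n_hidden  # hidden state
        c = [0.0] * self.n_hidden  # cell state
        for x in features:  # one time step per subsequence
            z = list(x) + h
            pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
                   for k in "ifog"}
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["g"]]
            c = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
        # Fully connected layer on the final hidden state, then softmax.
        logits = [a + b for a, b in zip(matvec(self.W_out, h), self.b_out)]
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]

# Usage: an arbitrary example peptide, windowed and scored by an untrained model.
query = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
subs = windows(query, length=10, stride=5)
model = TinyLSTMClassifier(n_in=3, n_hidden=4, n_classes=5)
probs = model.forward([toy_features(s) for s in subs])
```

Because the weights are random and the features are placeholders, `probs` carries no biological meaning; the sketch only shows how a variable-length peptide becomes a sequence of fixed-size inputs that a recurrent layer can consume before a single classification step.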

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Neural Networks, Computer / Protein Folding / Proteome | Type of study: Prognostic_studies | Language: En | Year of publication: 2022 | Document type: Article
