Learning Proteome Domain Folding Using LSTMs in an Empirical Kernel Space.
J Mol Biol; 434(15): 167686, 2022 Aug 15.
Article | En | MEDLINE | ID: mdl-35716781
The recognition of protein structural folds is the starting point for protein function inference and for many structural prediction tools. We previously introduced the idea of using empirical comparisons to create a data-augmented feature space, called PESS (Protein Empirical Structure Space), as a novel approach for protein structure prediction. Here, we extend that approach by generating the PESS feature space over fixed-length subsequences of query peptides and applying a sequential neural network model with one long short-term memory (LSTM) cell layer followed by a fully connected layer. Using this approach, we show that a training set consisting of only a small group of domains is sufficient to achieve near state-of-the-art accuracy on fold recognition. Our method improves on the previous approach by reducing the required training set and improving the model's ability to generalize across species, which will help fold prediction for newly discovered proteins.
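The pipeline described in the abstract (fixed-length subsequences, one LSTM layer, one fully connected layer) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the window length, stride, hidden size, random untrained weights, and all function names are assumptions made for illustration, and `toy_features` is a placeholder that stands in for the PESS features, which the abstract does not specify in enough detail to reproduce.

```python
import math
import random


def fixed_length_windows(sequence, length, stride):
    """Return every fixed-length subsequence, stepping by `stride`."""
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, stride)]


def toy_features(window):
    """Placeholder, NOT PESS: fraction of four residue types in the window."""
    return [window.count(aa) / len(window) for aa in "AGLK"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, bias, vector):
    """Compute weights @ vector + bias on plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_last_hidden(inputs, params, hidden_size):
    """Run one LSTM layer over the feature vectors; return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in inputs:
        xh = x + h  # concatenate the input with the previous hidden state
        i = [sigmoid(a) for a in affine(params["Wi"], params["bi"], xh)]
        f = [sigmoid(a) for a in affine(params["Wf"], params["bf"], xh)]
        o = [sigmoid(a) for a in affine(params["Wo"], params["bo"], xh)]
        g = [math.tanh(a) for a in affine(params["Wg"], params["bg"], xh)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(input_size, hidden_size, n_folds, seed=0):
    """Random, untrained weights (assumption: a real model would be trained)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in "ifog":
        params["W" + gate] = matrix(hidden_size, input_size + hidden_size)
        params["b" + gate] = [0.0] * hidden_size
    params["Wd"] = matrix(n_folds, hidden_size)  # fully connected output layer
    params["bd"] = [0.0] * n_folds
    return params


def predict_fold_probs(sequence, params, hidden_size, length=5, stride=2):
    """Windows -> per-window features -> LSTM -> dense layer -> class probabilities."""
    windows = fixed_length_windows(sequence, length, stride)
    feats = [toy_features(w) for w in windows]
    h = lstm_last_hidden(feats, params, hidden_size)
    return softmax(affine(params["Wd"], params["bd"], h))


# Example call on an arbitrary peptide string with three hypothetical fold classes.
params = init_params(input_size=4, hidden_size=8, n_folds=3)
probs = predict_fold_probs("MKTAYIAKQRGLLA", params, hidden_size=8)
```

Because the weights are untrained, the resulting probabilities carry no biological meaning; the sketch only shows how a variable-length peptide becomes a sequence of fixed-size inputs that a recurrent layer can consume before a final classification layer.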
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Neural Networks, Computer / Protein Folding / Proteome
Type of study: Prognostic_studies
Language: En
Journal: J Mol Biol
Year: 2022
Document type: Article
Country of publication: Netherlands