Data Size and Quality Matter: Generating Physically-Realistic Distance Maps of Protein Tertiary Structures.
Alam, Fardina Fathmiul; Shehu, Amarda.
Affiliation
  • Alam FF; Department of Computer Science, George Mason University, Fairfax, VA 22030, USA.
  • Shehu A; Department of Computer Science, George Mason University, Fairfax, VA 22030, USA.
Biomolecules; 12(7), 2022 Jun 29.
Article in English | MEDLINE | ID: mdl-35883464
With the debut of AlphaFold2, we can now obtain a highly accurate view of a reasonable equilibrium tertiary structure of a protein molecule. Yet a single-structure view is insufficient and does not account for the high structural plasticity of protein molecules. Obtaining a multi-structure view of a protein molecule remains an outstanding challenge in computational structural biology. In tandem with methods formulated under the umbrella of stochastic optimization, methods based on deep learning are now advancing rapidly. In recent work, we advanced the capability of these models to learn from experimentally available tertiary structures of protein molecules of varying lengths. In this work, we elucidate the important role that the composition of the training dataset plays in the neural network's ability to learn key local and distal patterns in tertiary structures. To make such patterns visible to the network, we utilize a contact map-based representation of protein tertiary structure. We show how data size, quality, and composition relate to the ability of latent variable models to learn key patterns of tertiary structure. In addition, we present a disentangled latent variable model that improves upon the state-of-the-art variational autoencoder-based model in key, physically realistic structural patterns. We believe this work opens up further avenues of research on deep learning-based models for computing multi-structure views of protein molecules.
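The map-based representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes one representative atom per residue (for example, the C-alpha atom) supplied as an N x 3 coordinate array, and it uses an 8 Å contact cutoff, a common convention that the abstract does not specify. The function names `distance_map` and `contact_map` and the toy coordinates are hypothetical.

```python
import numpy as np


def distance_map(coords):
    """Return the N x N matrix of pairwise Euclidean distances.

    coords: array-like of shape (N, 3), one 3D point per residue
    (assumed here to be C-alpha positions, in angstroms).
    """
    coords = np.asarray(coords, dtype=float)
    # Broadcasting gives an (N, N, 3) array of coordinate differences.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def contact_map(coords, threshold=8.0):
    """Return a boolean N x N matrix: True where two residues lie within
    `threshold` angstroms (the 8.0 default is an assumed convention)."""
    return distance_map(coords) <= threshold


# Toy example (made-up coordinates): four residues on a straight line,
# spaced 3.8 angstroms apart.
toy_ca = np.array([
    [0.0, 0.0, 0.0],
    [3.8, 0.0, 0.0],
    [7.6, 0.0, 0.0],
    [11.4, 0.0, 0.0],
])
dmap = distance_map(toy_ca)
cmap = contact_map(toy_ca)
```

A distance map is symmetric with a zero diagonal, and its size grows with the square of the chain length, which is one reason proteins of varying lengths need care when such maps are fed to a neural network.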

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Proteins / Computational Biology Study type: Prognostic_studies Language: En Journal: Biomolecules Publication year: 2022 Document type: Article