Search | Secretaria de Estado da Saúde

Interpreting forces as deep learning gradients improves quality of predicted protein structures.

King, Jonathan Edward; Koes, David Ryan.

Biophys J ; 123(17): 2730-2739, 2024 Sep 03.

Article in English | MEDLINE | ID: mdl-38104241

ABSTRACT

Protein structure predictions from deep learning models like AlphaFold2, despite their remarkable accuracy, are likely insufficient for direct use in downstream tasks like molecular docking. The functionality of such models could be improved with a combination of increased accuracy and physical intuition. We propose a new method to train deep learning protein structure prediction models using molecular dynamics force fields to work toward these goals. Our custom PyTorch loss function, OpenMM-Loss, represents the potential energy of a predicted structure. OpenMM-Loss can be applied to any all-atom representation of a protein structure capable of mapping into our software package, SidechainNet. We demonstrate our method's efficacy by finetuning OpenFold. We show that subsequently predicted protein structures, both before and after a relaxation procedure, exhibit comparable accuracy while displaying lower potential energy and improved structural quality as assessed by MolProbity metrics.
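The idea in the title, treating forces as loss gradients, can be illustrated with a minimal sketch. It is not the authors' OpenMM-Loss and uses neither OpenMM nor PyTorch: a toy harmonic-bond potential stands in for the force field, and the names `energy_and_forces`, `loss_and_gradient`, `descend`, `K`, and `R0` are hypothetical. The one fact it relies on is that the force on an atom is the negative gradient of the potential energy with respect to that atom's coordinates, so an engine that reports forces already supplies the gradient of an energy-valued loss.

```python
import math

K = 100.0  # hypothetical spring constant for the toy potential
R0 = 1.5   # hypothetical equilibrium bond length


def energy_and_forces(coords):
    """Toy stand-in for a force-field engine (assumption, not OpenMM).

    Treats consecutive atoms as harmonically bonded and returns the
    potential energy plus the force on every atom.
    """
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in coords]
    for i in range(len(coords) - 1):
        delta = [b - a for a, b in zip(coords[i], coords[i + 1])]
        r = math.sqrt(sum(d * d for d in delta))
        energy += 0.5 * K * (r - R0) ** 2
        scale = K * (r - R0) / r  # dE/dr divided by r
        for axis in range(3):
            # A stretched bond pulls atom i toward atom i + 1 and vice versa.
            forces[i][axis] += scale * delta[axis]
            forces[i + 1][axis] -= scale * delta[axis]
    return energy, forces


def loss_and_gradient(coords):
    """Use the energy as the loss; its gradient is minus the forces."""
    energy, forces = energy_and_forces(coords)
    gradient = [[-f for f in atom] for atom in forces]
    return energy, gradient


def descend(coords, learning_rate=0.001, steps=50):
    """Plain gradient descent on the coordinates using the force-derived gradient."""
    for _ in range(steps):
        _, gradient = loss_and_gradient(coords)
        coords = [
            [x - learning_rate * g for x, g in zip(atom, atom_grad)]
            for atom, atom_grad in zip(coords, gradient)
        ]
    return coords


if __name__ == "__main__":
    start = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [4.0, 0.0, 0.5]]
    before, _ = loss_and_gradient(start)
    after, _ = loss_and_gradient(descend(start))
    print("energy before:", before, "energy after:", after)
```

In a deep learning setting the same gradient would be passed back through the model that produced the coordinates rather than applied to the coordinates directly; how the authors wire this into PyTorch is not described in the abstract.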

Subjects

Deep Learning; Protein Conformation; Proteins; Proteins/chemistry; Molecular Dynamics Simulation

SidechainNet: An all-atom protein structure dataset for machine learning.

King, Jonathan Edward; Koes, David Ryan.

Proteins ; 89(11): 1489-1496, 2021 Nov.

Article in English | MEDLINE | ID: mdl-34213059

ABSTRACT

Despite recent advancements in deep learning methods for protein structure prediction and representation, little focus has been directed at the simultaneous inclusion and prediction of protein backbone and sidechain structure information. We present SidechainNet, a new dataset that directly extends the ProteinNet dataset. SidechainNet includes angle and atomic coordinate information capable of describing all heavy atoms of each protein structure and can be extended by users to include new protein structures as they are released. In this article, we provide background information on the availability of protein structure data and the significance of ProteinNet. Thereafter, we argue for the potentially beneficial inclusion of sidechain information through SidechainNet, describe the process by which we organize SidechainNet, and provide a software package (https://github.com/jonathanking/sidechainnet) for data manipulation and training with machine learning models.

Subjects

Amino Acids/chemistry; Machine Learning; Proteins/chemistry; Software; Amino Acid Sequence; Datasets as Topic; Neural Networks, Computer; Protein Conformation
