Results 1 - 20 of 93
1.
Nat Commun ; 15(1): 4536, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38806453

ABSTRACT

Protein-ligand docking is an established tool in drug discovery and development to narrow down potential therapeutics for experimental testing. However, a high-quality protein structure is required and often the protein is treated as fully or partially rigid. Here we develop an AI system that can predict the fully flexible all-atom structure of protein-ligand complexes directly from sequence information. We find that classical docking methods are still superior, but depend upon having crystal structures of the target protein. In addition to predicting flexible all-atom structures, predicted confidence metrics (plDDT) can be used to select accurate predictions as well as to distinguish between strong and weak binders. The advances presented here suggest that the goal of AI-based drug discovery is one step closer, but there is still a way to go to grasp the complexity of protein-ligand interactions fully. Umol is available at: https://github.com/patrickbryant1/Umol .


Subject(s)
Molecular Docking Simulation , Proteins , Ligands , Proteins/chemistry , Proteins/metabolism , Protein Binding , Drug Discovery/methods , Protein Conformation , Software , Binding Sites
2.
Sci Rep ; 13(1): 18928, 2023 11 02.
Article in English | MEDLINE | ID: mdl-37919373

ABSTRACT

Protein palmitoylation, a cellular process occurring at the membrane-cytosol interface, is orchestrated by members of the DHHC enzyme family and plays a pivotal role in regulating various cellular functions. The M2 protein of the influenza virus, which is acylated at a membrane-near amphiphilic helix, serves as a model for studying the intricate signals governing acylation and its interaction with the cognate enzyme, DHHC20. We investigate it here using both experimental and computational assays. We report that altering the biophysical properties of the amphiphilic helix, particularly by shortening or disrupting it, results in a substantial reduction in M2 palmitoylation but does not entirely abolish the process. Intriguingly, DHHC20 exhibits an augmented affinity for some M2 mutants compared to the wildtype M2. Molecular dynamics simulations unveil interactions between amino acids of the helix and the catalytically significant DHHC and TTXE motifs of DHHC20. Our findings suggest that the binding of M2 to DHHC20, while not highly specific, is mediated by requisite contacts, possibly instigating the transfer of fatty acids. A comprehensive understanding of protein palmitoylation mechanisms is imperative for the development of DHHC-specific inhibitors, holding promise for the treatment of diverse human diseases.


Subject(s)
Influenza A virus , Orthomyxoviridae , Humans , Influenza A virus/physiology , Protein Domains , Fatty Acids/metabolism , Acylation
3.
Nat Commun ; 14(1): 5739, 2023 09 15.
Article in English | MEDLINE | ID: mdl-37714883

ABSTRACT

A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.


Subject(s)
Machine Learning , Physics , Thermodynamics , Ataxia Telangiectasia Mutated Proteins , Molecular Dynamics Simulation
4.
J Chem Theory Comput ; 19(18): 6151-6159, 2023 Sep 26.
Article in English | MEDLINE | ID: mdl-37688551

ABSTRACT

Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at an atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields, and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can directly be used to simulate CG molecular dynamics. While having a vastly simplified training setup compared to previous work, we demonstrate that our approach leads to improved performance across several protein simulations for systems up to 56 amino acids, reproducing the CG equilibrium distribution and preserving the dynamics of all-atom simulations such as protein folding events.
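
The relationship between a score function and a force field mentioned in this abstract can be illustrated with a toy example. For a Boltzmann density p(x) ∝ exp(-U(x)/kT), the score grad log p(x) equals -grad U(x)/kT, so multiplying the score by kT gives a force. The sketch below is only an illustration under stated assumptions: the analytical score of a 1D Gaussian stands in for a trained diffusion model, the integrator is a plain overdamped Langevin (Euler-Maruyama) scheme, and all function names are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def gaussian_score(x, mean=0.0, var=1.0):
    """Score (gradient of the log-density) of a 1D Gaussian.
    Assumption: this analytical score is a stand-in for a learned score model."""
    return -(x - mean) / var

def force_from_score(score, kT=1.0):
    """For p(x) ~ exp(-U(x)/kT): grad log p = -grad U / kT, hence force = kT * score."""
    return kT * score

def langevin(score_fn, x0, n_steps, dt=1e-2, kT=1.0, friction=1.0, seed=0):
    """Overdamped Langevin dynamics driven by the score-derived force."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    traj = np.empty((n_steps,) + x.shape)
    noise_scale = np.sqrt(2.0 * kT * dt / friction)
    for t in range(n_steps):
        force = force_from_score(score_fn(x), kT)
        x = x + (dt / friction) * force + noise_scale * rng.standard_normal(x.shape)
        traj[t] = x
    return traj

# 500 independent walkers; discard the first 500 steps as equilibration.
traj = langevin(gaussian_score, x0=np.zeros(500), n_steps=2000)
sample_var = traj[500:].var()
print(f"sampled variance: {sample_var:.3f} (target density has variance 1)")
```

If the score is accurate, the sampled variance should be close to the variance of the target density, which is the sense in which the simulation reproduces the equilibrium distribution.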

5.
J Phys Chem B ; 127(31): 6920-6927, 2023 08 10.
Article in English | MEDLINE | ID: mdl-37499123

ABSTRACT

Coarse-grained models allow computational investigation of biomolecular processes occurring on long time and length scales, intractable with atomistic simulation. Traditionally, many coarse-grained models rely mostly on pairwise interaction potentials. However, the decimation of degrees of freedom should, in principle, lead to a complex many-body effective interaction potential. In this work, we use experimental data on mutant stability to parametrize coarse-grained models for two proteins with and without many-body terms. We demonstrate that many-body terms are necessary to reproduce quantitatively the effects of point mutations on protein stability, particularly to implicitly take into account the effect of the solvent.


Subject(s)
Proteins , Computer Simulation , Solvents
6.
Brief Bioinform ; 24(4), 2023 07 20.
Article in English | MEDLINE | ID: mdl-37418278

ABSTRACT

Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as representative conformational ensembles. Recent advances in computational methods have led to an increase in the number of available structural datasets spanning conformational landscapes. However, extracting representative conformational ensembles from such datasets is not an easy task and many methods have been developed to tackle it. Our new approach, EnGens (short for ensemble generation), collects these methods into a unified framework for generating and analyzing representative protein conformational ensembles. In this work, we: (1) provide an overview of existing methods and tools for representative protein structural ensemble generation and analysis; (2) unify existing approaches in an open-source Python package, and a portable Docker image, providing interactive visualizations within a Jupyter Notebook pipeline; (3) test our pipeline on a few canonical examples from the literature. Representative ensembles produced by EnGens can be used for many downstream tasks such as protein-ligand ensemble docking, Markov state modeling of protein dynamics and analysis of the effect of single-point mutations.


Subject(s)
Molecular Dynamics Simulation , Proteins , Protein Conformation , Proteins/chemistry
7.
bioRxiv ; 2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37163076

ABSTRACT

Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as representative conformational ensembles. Recent advances in computational methods have led to an increase in the number of available structural datasets spanning conformational landscapes. However, extracting representative conformational ensembles from such datasets is not an easy task and many methods have been developed to tackle it. Our new approach, EnGens (short for ensemble generation), collects these methods into a unified framework for generating and analyzing protein conformational ensembles. In this work, we: (1) provide an overview of existing methods and tools for protein structural ensemble generation and analysis; (2) unify existing approaches in an open-source Python package, and a portable Docker image, providing interactive visualizations within a Jupyter Notebook pipeline; (3) test our pipeline on a few canonical examples found in the literature. Representative ensembles produced by EnGens can be used for many downstream tasks such as protein-ligand ensemble docking, Markov state modeling of protein dynamics and analysis of the effect of single-point mutations.

8.
J Phys Chem Lett ; 14(17): 3970-3979, 2023 May 04.
Article in English | MEDLINE | ID: mdl-37079800

ABSTRACT

Machine-learned coarse-grained (CG) models have the potential for simulating large molecular complexes beyond what is possible with atomistic molecular dynamics. However, training accurate CG models remains a challenge. A widely used methodology for learning bottom-up CG force fields maps forces from all-atom molecular dynamics to the CG representation and matches them with a CG force field on average. We show that there is flexibility in how to map all-atom forces to the CG representation and that the most commonly used mapping methods are statistically inefficient and potentially even incorrect in the presence of constraints in the all-atom simulation. We define an optimization statement for force mappings and demonstrate that substantially improved CG force fields can be learned from the same simulation data when using optimized force maps. The method is demonstrated on the miniproteins chignolin and tryptophan cage and published as open-source code.
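
The flexibility in force mapping described in this abstract can be made concrete with a small sketch. It assumes a linear coordinate map M (beads x atoms) and a linear force map B of the same shape, and it uses the consistency condition B Mᵀ = I as one commonly stated requirement; that condition, the toy random forces, and every name below are illustrative assumptions, not the authors' published code. Two different force maps pass the check for the same coordinate map, yet yield different force-matching targets.

```python
import numpy as np

def map_forces(force_map, atom_forces):
    """Apply a linear force map B (n_beads x n_atoms) to forces (n_frames x n_atoms x 3)."""
    return np.einsum("ba,fad->fbd", force_map, atom_forces)

def force_matching_loss(cg_forces_pred, cg_forces_mapped):
    """Mean squared deviation between CG model forces and mapped all-atom forces."""
    return float(np.mean((cg_forces_pred - cg_forces_mapped) ** 2))

def is_consistent(force_map, coord_map):
    """Assumed consistency check between force map and coordinate map: B @ M.T == I."""
    return np.allclose(force_map @ coord_map.T, np.eye(coord_map.shape[0]))

rng = np.random.default_rng(0)
n_frames, n_atoms = 100, 4
# Toy data: random numbers with no physical meaning.
atom_forces = rng.standard_normal((n_frames, n_atoms, 3))

# Coordinate map: two beads, placed on atoms 0 and 2.
coord_map = np.array([[1.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0]])
# Candidate 1: keep only the forces acting on the retained atoms.
slice_map = coord_map.copy()
# Candidate 2: also add the force of the neighboring dropped atom to each bead.
aggregate_map = np.array([[1.0, 1.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0, 1.0]])

zero_model = np.zeros((n_frames, 2, 3))  # placeholder for a CG force-field prediction
for name, B in (("slice", slice_map), ("aggregate", aggregate_map)):
    target = map_forces(B, atom_forces)
    print(name, is_consistent(B, coord_map), force_matching_loss(zero_model, target))
```

Because the training targets depend on the chosen map, the choice affects the noise in the loss, which is the motivation the abstract gives for optimizing the force map.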

9.
Curr Opin Struct Biol ; 79: 102533, 2023 04.
Article in English | MEDLINE | ID: mdl-36731338

ABSTRACT

The successful recent application of machine learning methods to scientific problems includes the learning of flexible and accurate atomic-level force-fields for materials and biomolecules from quantum chemical data. In parallel, the machine learning of force-fields at coarser resolutions is rapidly gaining relevance as an efficient way to represent the higher-body interactions needed in coarse-grained force-fields to compensate for the omitted degrees of freedom. Coarse-grained models are important for the study of systems at time and length scales exceeding those of atomistic simulations. However, the development of transferable coarse-grained models via machine learning still presents significant challenges. Here, we discuss recent developments in this field and current efforts to address the remaining challenges.


Subject(s)
Machine Learning , Thermodynamics
10.
ACS Cent Sci ; 9(2): 186-196, 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36844497

ABSTRACT

The aim of molecular coarse-graining approaches is to recover relevant physical properties of the molecular system via a lower-resolution model that can be more efficiently simulated. Ideally, the lower resolution still accounts for the degrees of freedom necessary to recover the correct physical behavior. The selection of these degrees of freedom has often relied on the scientist's chemical and physical intuition. In this article, we make the argument that, in soft matter contexts, desirable coarse-grained models accurately reproduce the long-time dynamics of a system by correctly capturing the rare-event transitions. We propose a bottom-up coarse-graining scheme that correctly preserves the relevant slow degrees of freedom, and we test this idea for three systems of increasing complexity. We show that, in contrast to this method, existing coarse-graining schemes, such as those from information theory or structure-based approaches, are not able to recapitulate the slow time scales of the system.

11.
J Chem Theory Comput ; 19(3): 942-952, 2023 Feb 14.
Article in English | MEDLINE | ID: mdl-36668906

ABSTRACT

Coarse-grained (CG) molecular simulations have become a standard tool to study molecular processes on time and length scales inaccessible to all-atom simulations. Parametrizing CG force fields to match all-atom simulations has mainly relied on force-matching or relative entropy minimization, which require many samples from costly simulations with all-atom or CG resolutions, respectively. Here we present flow-matching, a new training method for CG force fields that combines the advantages of both methods by leveraging normalizing flows, a generative deep learning method. Flow-matching first trains a normalizing flow to represent the CG probability density, which is equivalent to minimizing the relative entropy without requiring iterative CG simulations. Subsequently, the flow generates samples and forces according to the learned distribution in order to train the desired CG free energy model via force-matching. Even without requiring forces from the all-atom simulations, flow-matching outperforms classical force-matching by an order of magnitude in terms of data efficiency and produces CG models that can capture the folding and unfolding transitions of small proteins.

12.
J Phys Chem Lett ; 13(50): 11643-11649, 2022 Dec 22.
Article in English | MEDLINE | ID: mdl-36484770

ABSTRACT

We combine replica exchange (parallel tempering) with normalizing flows, a class of deep generative models. These two sampling strategies complement each other, resulting in an efficient method for sampling molecular systems characterized by rare events, which we call learned replica exchange (LREX). In LREX, a normalizing flow is trained to map the configurations of the fastest-mixing replica into configurations belonging to the target distribution, allowing direct exchanges between the two without the need to simulate intermediate replicas. This can significantly reduce the computational cost compared to standard replica exchange. The proposed method also offers several advantages with respect to Boltzmann generators that directly use normalizing flows to sample the target distribution. We apply LREX to some prototypical molecular dynamics systems, highlighting the improvements over previous methods.
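
For context, the exchange step of standard replica exchange, which LREX builds on, can be sketched as follows. This is the textbook Metropolis swap criterion between two replicas at inverse temperatures beta_i and beta_j, not the learned-flow exchange proposed in the paper; the function names are hypothetical.

```python
import math
import random

def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging the configurations of two
    replicas: min(1, exp[(beta_i - beta_j) * (E_i - E_j)])."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_acceptance(beta_i, beta_j, energy_i, energy_j)

rng = random.Random(0)
# Cold replica (beta = 1.0) currently holds the higher-energy configuration:
p_favorable = swap_acceptance(1.0, 0.5, energy_i=2.0, energy_j=1.0)
# Cold replica holds the lower-energy configuration:
p_unfavorable = swap_acceptance(1.0, 0.5, energy_i=1.0, energy_j=2.0)
print(p_favorable, p_unfavorable, attempt_swap(1.0, 0.5, 1.0, 2.0, rng))
```

The acceptance probability decays as the energy distributions of the two replicas overlap less, which is why standard replica exchange needs intermediate temperatures and why mapping one replica directly onto the target distribution can save cost.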


Subject(s)
Molecular Dynamics Simulation
13.
Nat Commun ; 13(1): 7101, 2022 11 19.
Article in English | MEDLINE | ID: mdl-36402768

ABSTRACT

The increasing interest in modeling the dynamics of ever larger proteins has revealed a fundamental problem with models that describe the molecular system as being in a global configuration state. This notion limits our ability to gather sufficient statistics of state probabilities or state-to-state transitions because for large molecular systems the number of metastable states grows exponentially with size. In this manuscript, we approach this challenge by introducing a method that combines our recent progress on independent Markov decomposition (IMD) with VAMPnets, a deep learning approach to Markov modeling. We establish a training objective that quantifies how well a given decomposition of the molecular system into independent subdomains with Markovian dynamics approximates the overall dynamics. By constructing an end-to-end learning framework, the decomposition into such subdomains and their individual Markov state models are simultaneously learned, providing a data-efficient and easily interpretable summary of the complex system dynamics. While learning the dynamical coupling between Markovian subdomains is still an open issue, the present results are a significant step towards learning Ising models of large molecular complexes from simulation data.
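
The idea of decomposing a system into independent Markovian subdomains can be illustrated with a toy calculation. If two subsystems evolve as independent Markov chains, the transition matrix of the joint system is the Kronecker product of the subsystem matrices, and the number of global states is the product of the subsystem state counts. The sketch below uses made-up 2-state transition matrices and is only an illustration of that property, not the VAMPnet-based method of the paper.

```python
import numpy as np

def stationary_distribution(T):
    """Stationary distribution of a row-stochastic matrix: the left eigenvector
    with eigenvalue 1, normalized to sum to 1."""
    eigvals, eigvecs = np.linalg.eig(T.T)
    v = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    return v / v.sum()

# Hypothetical transition matrices for two independent subdomains.
T_a = np.array([[0.9, 0.1],
                [0.2, 0.8]])
T_b = np.array([[0.7, 0.3],
                [0.4, 0.6]])

# Joint dynamics of the two independent chains.
T_joint = np.kron(T_a, T_b)

pi_a = stationary_distribution(T_a)
pi_b = stationary_distribution(T_b)
pi_joint = stationary_distribution(T_joint)

print("global states:", T_joint.shape[0])  # 2 * 2 = 4
print("joint stationary distribution:", pi_joint)
print("product of subsystem distributions:", np.kron(pi_a, pi_b))
```

With N independent two-state subdomains the global model would have 2^N states, while the decomposed description only needs N small matrices, which is the data-efficiency argument made in the abstract.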


Subject(s)
Algorithms , Deep Learning , Markov Chains , Macromolecular Substances , Computer Simulation
14.
J Chem Phys ; 157(18): 181102, 2022 Nov 14.
Article in English | MEDLINE | ID: mdl-36379765

ABSTRACT

The vibrational spectra of condensed and gas-phase systems are influenced by the quantum-mechanical behavior of light nuclei. Full-dimensional simulations of approximate quantum dynamics are possible thanks to the imaginary time path-integral (PI) formulation of quantum statistical mechanics, albeit at a high computational cost which increases sharply with decreasing temperature. By leveraging advances in machine-learned coarse-graining, we develop a PI method with the reduced computational cost of a classical simulation. We also propose a simple temperature elevation scheme to significantly attenuate the artifacts of standard PI approaches as well as eliminate the unfavorable temperature scaling of the computational cost. We illustrate the approach by calculating vibrational spectra using standard models of water molecules and bulk water, demonstrating significant computational savings and dramatically improved accuracy compared to more expensive reference approaches. Our simple, efficient, and accurate method has prospects for routine calculations of vibrational spectra for a wide range of molecular systems, with an explicit treatment of the quantum nature of nuclei.

15.
Nat Chem ; 13(11): 1032-1034, 2021 11.
Article in English | MEDLINE | ID: mdl-34707232
16.
J Chem Phys ; 155(8): 084101, 2021 Aug 28.
Article in English | MEDLINE | ID: mdl-34470360

ABSTRACT

Accurate modeling of the solvent environment for biological molecules is crucial for computational biology and drug design. A popular approach to achieve long simulation time scales for large system sizes is to incorporate the effect of the solvent in a mean-field fashion with implicit solvent models. However, a challenge with existing implicit solvent models is that they often lack accuracy or certain physical properties compared to explicit solvent models as the many-body effects of the neglected solvent molecules are difficult to model as a mean field. Here, we leverage machine learning (ML) and multi-scale coarse graining (CG) in order to learn implicit solvent models that can approximate the energetic and thermodynamic properties of a given explicit solvent model with arbitrary accuracy, given enough training data. Following the previous ML-CG models CGnet and CGSchnet, we introduce ISSNet, a graph neural network, to model the implicit solvent potential of mean force. ISSNet can learn from explicit solvent simulation data and be readily applied to molecular dynamics simulations. We compare the solute conformational distributions under different solvation treatments for two peptide systems. The results indicate that ISSNet models can outperform widely used generalized Born and surface area models in reproducing the thermodynamics of small protein systems with respect to explicit solvent. The success of this novel method demonstrates the potential benefit of applying machine learning methods in accurate modeling of solvent effects for in silico research and biomedical applications.

17.
Chem Rev ; 121(16): 9719-9721, 2021 08 25.
Article in English | MEDLINE | ID: mdl-34428897
18.
Chem Rev ; 121(16): 9722-9758, 2021 08 25.
Article in English | MEDLINE | ID: mdl-33945269

ABSTRACT

Unsupervised learning is becoming an essential tool to analyze the increasingly large amounts of data produced by atomistic and molecular simulations, in material science, solid state physics, biophysics, and biochemistry. In this Review, we provide a comprehensive overview of the methods of unsupervised learning that have been most commonly used to investigate simulation data and indicate likely directions for further developments in the field. In particular, we discuss feature representation of molecular systems and present state-of-the-art algorithms for dimensionality reduction, density estimation, and clustering, as well as kinetic models. We divide our discussion into self-contained sections, each discussing a specific method. In each section, we briefly touch upon the mathematical and algorithmic foundations of the method, highlight its strengths and limitations, and describe the specific ways in which it has been used, or can be used, to analyze molecular simulation data.

19.
J Chem Phys ; 154(16): 160401, 2021 Apr 28.
Article in English | MEDLINE | ID: mdl-33940847

ABSTRACT

Over recent years, the use of statistical learning techniques applied to chemical problems has gained substantial momentum. This is particularly apparent in the realm of physical chemistry, where the balance between empiricism and physics-based theory has traditionally been rather in favor of the latter. In this guest Editorial for the special topic issue on "Machine Learning Meets Chemical Physics," a brief rationale is provided, followed by an overview of the topics covered. We conclude by making some general remarks.

20.
J Chem Phys ; 154(16): 164113, 2021 Apr 28.
Article in English | MEDLINE | ID: mdl-33940848

ABSTRACT

The use of coarse-grained (CG) models is a popular approach to study complex biomolecular systems. By reducing the number of degrees of freedom, a CG model can explore long time- and length-scales inaccessible to computational models at higher resolution. If a CG model is designed by formally integrating out some of the system's degrees of freedom, one expects multi-body interactions to emerge in the effective CG model's energy function. In practice, it has been shown that the inclusion of multi-body terms indeed improves the accuracy of a CG model. However, no general approach has been proposed to systematically construct a CG effective energy that includes arbitrary orders of multi-body terms. In this work, we propose a neural network based approach to address this point and construct a CG model as a multi-body expansion. By applying this approach to a small protein, we evaluate the relative importance of the different multi-body terms in the definition of an accurate model. We observe a slow convergence in the multi-body expansion, where up to five-body interactions are needed to reproduce the free energy of an atomistic model.


Subject(s)
Oligopeptides/chemistry , Molecular Dynamics Simulation , Neural Networks, Computer , Thermodynamics