ABSTRACT
The ongoing revolution of the natural sciences through machine learning and artificial intelligence has sparked significant interest in the material science community in recent years. The intrinsically high dimensionality of the space of realizable materials makes traditional approaches ineffective for large-scale explorations. Modern data science and machine learning tools developed for increasingly complicated problems are an attractive alternative. An imminent climate catastrophe calls for a clean energy transformation that overhauls current technologies within the few years available for action. Tackling this crisis requires the development of new materials at an unprecedented pace and scale. For example, organic photovoltaics have the potential to replace existing silicon-based materials to a large extent and open up new fields of application. In recent years, organic light-emitting diodes have emerged as state-of-the-art technology for digital screens and portable devices and are enabling new applications with flexible displays. Reticular frameworks allow the atom-precise synthesis of nanomaterials and promise to revolutionize the field through the potential to realize multifunctional nanoparticles with applications from gas storage, gas separation, and electrochemical energy storage to nanomedicine. In the past decade, significant advances in all these fields have been facilitated by the comprehensive application of simulation and machine learning for property prediction, property optimization, and chemical space exploration, enabled by considerable advances in computing power and algorithmic efficiency.
In this Account, we review the most recent contributions of our group to this thriving field of machine learning for material science. We start with a summary of the most important material classes our group has been involved in, focusing on small molecules as organic electronic materials and crystalline materials. Specifically, we highlight the data-driven approaches we employed to speed up discovery and derive material design strategies. Subsequently, our focus lies on the data-driven methodologies our group has developed and employed, elaborating on high-throughput virtual screening, inverse molecular design, Bayesian optimization, and supervised learning. We discuss the general ideas, their working principles, and their use cases with examples of successful implementations in data-driven material discovery and design efforts. Furthermore, we elaborate on potential pitfalls and remaining challenges of these methods. Finally, we provide a brief outlook for the field, as we foresee increasing adoption and implementation of large-scale data-driven approaches in material discovery and design campaigns.
ABSTRACT
The natural antivitamin 2'-methoxy-thiamine (MTh) is implicated in the suppression of microbial growth. However, its mode of action and enzyme-selective inhibition mechanism have remained elusive. Intriguingly, MTh inhibits some thiamine diphosphate (ThDP) enzymes, while being coenzymatically active in others. Here we report the strong inhibition of Escherichia coli transketolase activity by MTh and unravel its mode of action and the structural basis thereof. The unique 2'-methoxy group of MTh diphosphate (MThDP) clashes with a canonical glutamate required for cofactor activation in ThDP-dependent enzymes. This glutamate is forced into a stable, anticatalytic low-barrier hydrogen bond with a neighboring glutamate, disrupting cofactor activation. Molecular dynamics simulations of transketolases and other ThDP enzymes identify active-site flexibility and the topology of the cofactor-binding locale as key determinants for enzyme-selective inhibition. Human enzymes either retain enzymatic activity with MThDP or preferentially bind authentic ThDP over MThDP, while core bacterial metabolic enzymes are inhibited, demonstrating therapeutic potential.
Subjects
Anti-Bacterial Agents/metabolism, Enzyme Inhibitors/metabolism, Thiamine/metabolism, Transketolase/antagonists & inhibitors, Amino Acid Sequence, Anti-Bacterial Agents/pharmacology, Catalytic Domain, Coenzymes/metabolism, Drug Design, Enzyme Inhibitors/pharmacology, Escherichia coli/enzymology, Glutamic Acid/metabolism, Humans, Hydrogen Bonding, Kinetics, Molecular Dynamics Simulation, Molecular Structure, Protein Binding, Structure-Activity Relationship, Substrate Specificity, Thiamine/pharmacology, Thiamine Pyrophosphate/metabolism, Transketolase/genetics
ABSTRACT
High-throughput virtual screening is an indispensable technique utilized in the discovery of small molecules. In cases where the library of molecules is exceedingly large, the cost of an exhaustive virtual screen may be prohibitive. Model-guided optimization has been employed to lower these costs through dramatic increases in sample efficiency compared to random selection. However, these techniques introduce new costs to the workflow through the surrogate model training and inference steps. In this study, we propose an extension to the framework of model-guided optimization that mitigates inference costs using a technique we refer to as design space pruning (DSP), which irreversibly removes poor-performing candidates from consideration. We study the application of DSP to a variety of optimization tasks and observe significant reductions in overhead costs while exhibiting similar performance to the baseline optimization. DSP represents an attractive extension of model-guided optimization that can limit overhead costs in optimization settings where these costs are non-negligible relative to objective costs, such as docking.
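The pruning idea can be illustrated with a minimal sketch. This is not the authors' implementation: the objective, the nearest-neighbour surrogate, the greedy acquisition rule, and the `keep_fraction` parameter are all hypothetical stand-ins. The sketch only shows how permanently dropping poorly ranked candidates shrinks the number of surrogate inference calls in later rounds.

```python
import numpy as np

def objective(x):
    """Hypothetical expensive score (a stand-in for docking); higher is better."""
    return -np.sum((x - 0.7) ** 2, axis=1)

def surrogate_predict(x_train, y_train, x_query, k=5):
    """Toy surrogate: mean score of the k nearest already-evaluated candidates."""
    d = np.linalg.norm(x_query[:, None, :] - x_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def optimize_with_pruning(library, n_rounds=5, batch=20, keep_fraction=0.5, seed=0):
    """Pool-based optimization; returns (best score found, surrogate inference calls)."""
    rng = np.random.default_rng(seed)
    pool = np.arange(len(library))            # indices still under consideration
    picked = rng.choice(pool, size=batch, replace=False)
    pool = np.setdiff1d(pool, picked)
    x_seen, y_seen = library[picked], objective(library[picked])
    inference_calls = 0
    for _ in range(n_rounds):
        preds = surrogate_predict(x_seen, y_seen, library[pool])
        inference_calls += len(pool)
        order = np.argsort(preds)[::-1]       # best predicted first
        chosen = pool[order[:batch]]          # greedy acquisition
        x_seen = np.vstack([x_seen, library[chosen]])
        y_seen = np.concatenate([y_seen, objective(library[chosen])])
        # Pruning step: keep only the top-ranked fraction of the unevaluated pool.
        rest = order[batch:]
        n_keep = max(batch, int(keep_fraction * len(rest)))
        pool = pool[rest[:n_keep]]
    return y_seen.max(), inference_calls

if __name__ == "__main__":
    library = np.random.default_rng(1).random((1000, 3))
    print(optimize_with_pruning(library, keep_fraction=0.5))  # with pruning
    print(optimize_with_pruning(library, keep_fraction=1.0))  # baseline, no pruning
```

Setting `keep_fraction=1.0` recovers the unpruned loop, which makes the overhead of the two variants directly comparable on the same library.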
Subjects
High-Throughput Screening Assays, Workflow
ABSTRACT
In molecular discovery and drug design, structure-property relationships and activity landscapes are often qualitatively or quantitatively analyzed to guide the navigation of chemical space. The roughness (or smoothness) of these molecular property landscapes is one of their most studied geometric attributes, as it can characterize the presence of activity cliffs, with rougher landscapes generally expected to pose tougher optimization challenges. Here, we introduce a general, quantitative measure for describing the roughness of molecular property landscapes. The proposed roughness index (ROGI) is loosely inspired by the concept of fractal dimension and strongly correlates with the out-of-sample error achieved by machine learning models on numerous regression tasks.
Subjects
Drug Design, Machine Learning
ABSTRACT
Approaches for computing small molecule binding free energies based on molecular simulations are now regularly being employed by academic and industry practitioners to study receptor-ligand systems and prioritize the synthesis of small molecules for ligand design. Given the variety of methods and implementations available, it is natural to ask how the convergence rates and final predictions of these methods compare. In this study, we describe the concept and results for the SAMPL6 SAMPLing challenge, the first challenge from the SAMPL series focusing on the assessment of convergence properties and reproducibility of binding free energy methodologies. We provided parameter files, partial charges, and multiple initial geometries for two octa-acid (OA) and one cucurbit[8]uril (CB8) host-guest systems. Participants submitted binding free energy predictions as a function of the number of force and energy evaluations for seven different alchemical and physical-pathway (i.e., potential of mean force and weighted ensemble of trajectories) methodologies implemented with the GROMACS, AMBER, NAMD, or OpenMM simulation engines. To rank the methods, we developed an efficiency statistic based on bias and variance of the free energy estimates. For the two small OA binders, the free energy estimates computed with alchemical and potential of mean force approaches show relatively similar variance and bias as a function of the number of energy/force evaluations, with the attach-pull-release (APR), GROMACS expanded ensemble, and NAMD double decoupling submissions obtaining the greatest efficiency. The differences between the methods increase when analyzing the CB8-quinine system, where both the guest size and correlation times for system dynamics are greater. For this system, nonequilibrium switching (GROMACS/NS-DS/SB) obtained the overall highest efficiency. 
Surprisingly, the results suggest that specifying force field parameters and partial charges is insufficient to generally ensure reproducibility, and we observe differences between seemingly converged predictions ranging approximately from 0.3 to 1.0 kcal/mol, even with almost identical simulation parameters and system setup (e.g., Lennard-Jones cutoff, ionic composition). Further work will be required to completely identify the exact source of these discrepancies. Among the conclusions emerging from the data, we found that Hamiltonian replica exchange, while displaying very small variance, can be affected by a slowly decaying bias that depends on the initial population of the replicas; that bidirectional estimators are significantly more efficient than unidirectional estimators for nonequilibrium free energy calculations for the systems considered; and that the Berendsen barostat introduces non-negligible artifacts in expanded ensemble simulations.
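The abstract does not define the efficiency statistic itself, so the following is only a sketch of its ingredients: the bias, standard deviation, and root-mean-square error of replicate free energy estimates at each computational-cost checkpoint, measured against a reference value. The function name, array layout, and numbers are illustrative assumptions, not challenge data.

```python
import numpy as np

def bias_std_rmse(estimates, reference):
    """Per-checkpoint bias, sample standard deviation, and RMSE.

    estimates: array of shape (n_replicates, n_checkpoints), in kcal/mol,
               one column per number-of-energy-evaluations checkpoint.
    reference: the (assumed known) converged free energy, in kcal/mol.
    """
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean(axis=0) - reference
    std = estimates.std(axis=0, ddof=1)
    rmse = np.sqrt(((estimates - reference) ** 2).mean(axis=0))
    return bias, std, rmse

if __name__ == "__main__":
    # Made-up numbers: three replicates, two checkpoints, reference of -6.0 kcal/mol.
    estimates = [[-4.0, -6.0],
                 [-6.0, -6.0],
                 [-5.0, -6.0]]
    bias, std, rmse = bias_std_rmse(estimates, reference=-6.0)
    print(bias, std, rmse)
```

Tracking bias and spread separately matters here because a method can look converged (small spread across replicates) while still carrying a slowly decaying bias.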
Subjects
Macrocyclic Compounds/chemistry, Proteins/chemistry, Solvents/chemistry, Thermodynamics, Bridged-Ring Compounds/chemistry, Entropy, Imidazoles/chemistry, Ligands, Physical Phenomena, Protein Binding, Quantum Theory
ABSTRACT
Binding selectivity is a requirement for the development of a safe drug, and it is a critical property for chemical probes used in preclinical target validation. Engineering selectivity adds considerable complexity to the rational design of new drugs, as it involves the optimization of multiple binding affinities. Computationally, the prediction of binding selectivity is a challenge, and generally applicable methodologies are still not available to the computational and medicinal chemistry communities. Absolute binding free energy calculations based on alchemical pathways provide a rigorous framework for affinity predictions and could thus offer a general approach to the problem. We evaluated the performance of free energy calculations based on molecular dynamics for the prediction of selectivity by estimating the affinity profile of three bromodomain inhibitors across multiple bromodomain families, and by comparing the results to isothermal titration calorimetry data. Two case studies were considered. In the first one, the affinities of two similar ligands for seven bromodomains were calculated and returned excellent agreement with experiment (mean unsigned error of 0.81 kcal/mol and Pearson correlation of 0.75). In this test case, we also show how the preferred binding orientation of a ligand for different proteins can be estimated via free energy calculations. In the second case, the affinities of a broad-spectrum inhibitor for 22 bromodomains were calculated and returned a more modest accuracy (mean unsigned error of 1.76 kcal/mol and Pearson correlation of 0.48); however, the reparametrization of a sulfonamide moiety improved the agreement with experiment.
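For readers unfamiliar with the two agreement metrics quoted above, a small sketch of how a mean unsigned error and a Pearson correlation are computed from paired calculated and experimental affinities. The helper names and the four data points are invented for illustration and are not values from the study.

```python
import numpy as np

def mean_unsigned_error(calc, expt):
    """Average absolute deviation between calculated and experimental values."""
    calc, expt = np.asarray(calc, dtype=float), np.asarray(expt, dtype=float)
    return np.mean(np.abs(calc - expt))

def pearson_r(calc, expt):
    """Pearson correlation coefficient between the two series."""
    return np.corrcoef(calc, expt)[0, 1]

if __name__ == "__main__":
    # Hypothetical binding free energies in kcal/mol.
    calc = [-8.0, -9.5, -7.0, -10.0]
    expt = [-8.5, -9.0, -7.5, -9.0]
    print(mean_unsigned_error(calc, expt))
    print(pearson_r(calc, expt))
```

The two metrics answer different questions: the mean unsigned error measures absolute agreement, while the correlation measures how well the ranking trend is reproduced, which is why the abstract reports both.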
ABSTRACT
Binding free energy calculations that make use of alchemical pathways are becoming increasingly feasible thanks to advances in hardware and algorithms. Although relative binding free energy (RBFE) calculations are starting to find widespread use, absolute binding free energy (ABFE) calculations are still being explored mainly in academic settings due to the high computational requirements and still uncertain predictive value. However, in some drug design scenarios, RBFE calculations are not applicable and ABFE calculations could provide an alternative. Computationally cheaper end-point calculations in implicit solvent, such as molecular mechanics Poisson-Boltzmann surface area (MMPBSA) calculations, could also be used if one is primarily interested in a relative ranking of affinities. Here, we compare MMPBSA calculations to previously performed absolute alchemical free energy calculations in their ability to correlate with experimental binding free energies for three sets of bromodomain-inhibitor pairs. Different MMPBSA approaches have been considered, including a standard single-trajectory protocol, a protocol that includes a binding entropy estimate, and protocols that take into account the ligand hydration shell. Despite the improvements observed with the latter two MMPBSA approaches, ABFE calculations were found to be overall superior in obtaining correlation with experimental affinities for the test cases considered. A difference in weighted average Pearson and Spearman correlations of 0.25 and 0.31, respectively, was observed when using a standard single-trajectory MMPBSA setup (Pearson = 0.64 and Spearman = 0.66 for ABFE; Pearson = 0.39 and Spearman = 0.35 for MMPBSA).
The best performing MMPBSA protocols returned weighted average Pearson and Spearman correlations that were about 0.1 inferior to ABFE calculations: Pearson = 0.55 and Spearman = 0.56 when including an entropy estimate, and Pearson = 0.53 and Spearman = 0.55 when including explicit water molecules. Overall, the study suggests that ABFE calculations are indeed the more accurate approach, yet there is also value in MMPBSA calculations considering the lower compute requirements, and if agreement to experimental affinities in absolute terms is not of interest. Moreover, for the specific protein-ligand systems considered in this study, we find that including an explicit ligand hydration shell or a binding entropy estimate in the MMPBSA calculations resulted in significant performance improvements at a negligible computational cost.
Subjects
Entropy, Molecular Dynamics Simulation, Protein Databases, Protein Domains
ABSTRACT
The understanding of binding interactions between any protein and a small molecule plays a key role in the rationalization of affinity and selectivity and is essential for an efficient structure-based drug discovery (SBDD) process. Clearly, to begin SBDD, a structure is needed, and although there has been fantastic progress in solving G-protein-coupled receptor (GPCR) crystal structures, the process remains quite slow and is not currently feasible for every GPCR or GPCR-ligand complex. This situation significantly limits the ability of X-ray crystallography to impact the drug discovery process for GPCR targets in 'real-time' and hence there is still a need for other practical and cost-efficient alternatives. We present here an approach that integrates our previously described hierarchical GPCR modelling protocol (HGMP) and the fragment molecular orbital (FMO) quantum mechanics (QM) method to explore the interactions and selectivity of the human orexin-2 receptor (OX2R) and its recently discovered nonpeptidic agonists. HGMP generates a 3D model of GPCR structures and their complexes with small molecules by applying a set of computational methods. FMO allows ab initio approaches to be applied to systems that conventional QM methods would find challenging. The key advantage of FMO is that it can reveal information on the individual contribution and chemical nature of each residue and water molecule to the ligand binding that normally would be difficult to detect without QM. We illustrate how the combination of both techniques provides a practical and efficient approach that can be used to analyse the existing structure-function relationships (SAR) and to drive forward SBDD in a real-world example for which there is no crystal structure of the complex available.
Subjects
Orexins/metabolism, G-Protein-Coupled Receptors/metabolism, X-Ray Crystallography, Humans, Molecular Models, Protein Conformation, G-Protein-Coupled Receptors/agonists, G-Protein-Coupled Receptors/chemistry
ABSTRACT
Our interpretation of ligand-protein interactions is often informed by high-resolution structures, which represent the cornerstone of structure-based drug design. However, visual inspection and molecular mechanics approaches cannot explain the full complexity of molecular interactions. Quantum Mechanics approaches are often too computationally expensive, but one method, Fragment Molecular Orbital (FMO), offers an excellent compromise and has the potential to reveal key interactions that would otherwise be hard to detect. To illustrate this, we have applied the FMO method to 18 Class A GPCR-ligand crystal structures, representing different branches of the GPCR genome. Our work reveals key interactions that are often omitted from structure-based descriptions, including hydrophobic interactions, nonclassical hydrogen bonds, and the involvement of backbone atoms. This approach provides a more comprehensive picture of receptor-ligand interactions than is currently used and should prove useful for evaluation of the chemical nature of ligand binding and to support structure-based drug design.
Subjects
Molecular Models, G-Protein-Coupled Receptors/chemistry, G-Protein-Coupled Receptors/metabolism, Animals, Humans, Hydrogen Bonding, Ligands, Pharmaceutical Preparations/chemistry, Pharmaceutical Preparations/metabolism, Protein Binding, Protein Conformation, Rats
ABSTRACT
Most of the previous content of this book has focused on obtaining the structures of membrane proteins. In this chapter we explore how those structures can be further used in two key ways. The first is their use in structure-based drug design (SBDD) and the second is how they can be used to extend our understanding of their functional activity via the use of molecular dynamics. Both aspects now rely heavily on computation. This area is vast and, alas, too large to consider in depth in a single book chapter. Thus, where appropriate, we have referred the reader to recent reviews for deeper assessment of the field. We discuss progress via the use of examples from two main drug target areas: G-protein coupled receptors (GPCRs) and ion channels. We end with a discussion of some of the main challenges in the area.
Subjects
Drug Discovery/methods, Membrane Proteins/chemistry, Drug Design, Forecasting, Histamine H3 Antagonists/chemistry, Histamine H3 Antagonists/pharmacology, Humans, Kinetics, Molecular Models, Molecular Docking Simulation, Molecular Dynamics Simulation, Molecular Targeted Therapy, Obesity/drug therapy, Orexin Receptors/drug effects, Protein Binding, Protein Conformation, G-Protein-Coupled Receptors/antagonists & inhibitors, G-Protein-Coupled Receptors/chemistry, G-Protein-Coupled Receptors/drug effects, Histamine Receptors, Histamine H4 Receptors, Somatostatin Receptors/antagonists & inhibitors, Serotonin 5-HT2 Receptor Agonists/chemistry, Serotonin 5-HT2 Receptor Agonists/pharmacology, Structure-Activity Relationship, Water
ABSTRACT
Long-acting injectables are considered one of the most promising therapeutic strategies for the treatment of chronic diseases as they can afford improved therapeutic efficacy, safety, and patient compliance. The use of polymer materials in such a drug formulation strategy can offer unparalleled diversity owing to the ability to synthesize materials with a wide range of properties. However, the interplay between multiple parameters, including the physicochemical properties of the drug and polymer, makes it very difficult to intuitively predict the performance of these systems. This necessitates the development and characterization of a wide array of formulation candidates through extensive and time-consuming in vitro experimentation. Machine learning is enabling leap-step advances in a number of fields including drug discovery and materials science. The current study takes a critical step towards data-driven drug formulation development with an emphasis on long-acting injectables. Here we show that machine learning algorithms can be used to predict experimental drug release from these advanced drug delivery systems. We also demonstrate that these trained models can be used to guide the design of new long-acting injectables. The implementation of the described data-driven approach has the potential to reduce the time and cost associated with drug formulation development.
Subjects
Drug Delivery Systems, Polymers, Humans, Injections, Drug Liberation, Machine Learning
ABSTRACT
Computer-aided molecular design benefits from the integration of two complementary approaches: machine learning and first-principles simulation. Mohr et al. (B. Mohr, K. Shmilovich, I. S. Kleinwächter, D. Schneider, A. L. Ferguson and T. Bereau, Chem. Sci., 2022, 13, 4498-4511, https://pubs.rsc.org/en/content/articlelanding/2022/sc/d2sc00116k) demonstrated the discovery of a cardiolipin-selective molecule via the combination of coarse-grained molecular dynamics, alchemical free energy calculations, Bayesian optimization and interpretable regression to reveal design principles.
ABSTRACT
Synthetic polymers are versatile and widely used materials. Similar to small organic molecules, a large chemical space of such materials is hypothetically accessible. Computational property prediction and virtual screening can accelerate polymer design by prioritizing candidates expected to have favorable properties. However, in contrast to organic molecules, polymers are often not well-defined single structures but an ensemble of similar molecules, which poses unique challenges to traditional chemical representations and machine learning approaches. Here, we introduce a graph representation of molecular ensembles and an associated graph neural network architecture that is tailored to polymer property prediction. We demonstrate that this approach captures critical features of polymeric materials, like chain architecture, monomer stoichiometry, and degree of polymerization, and achieves superior accuracy to off-the-shelf cheminformatics methodologies. While doing so, we built a dataset of simulated electron affinity and ionization potential values for >40k polymers with varying monomer composition, stoichiometry, and chain architecture, which may be used in the development of other tailored machine learning approaches. The dataset and machine learning models presented in this work pave the path toward new classes of algorithms for polymer informatics and, more broadly, introduce a framework for the modeling of molecular ensembles.
ABSTRACT
An oracle that correctly predicts the outcome of every particle physics experiment, the products of every possible chemical reaction or the function of every protein would revolutionize science and technology. However, scientists would not be entirely satisfied because they would want to comprehend how the oracle made these predictions. This is scientific understanding, one of the main aims of science. With the increase in the available computational power and advances in artificial intelligence, a natural question arises: how can advanced computational systems, and specifically artificial intelligence, contribute to new scientific understanding or gain it autonomously? Trying to answer this question, we adopted a definition of 'scientific understanding' from the philosophy of science that enabled us to overview the scattered literature on the topic and, combined with dozens of anecdotes from scientists, map out three dimensions of computer-assisted scientific understanding. For each dimension, we review the existing state of the art and discuss future developments. We hope that this Perspective will inspire and focus research directions in this multidisciplinary emerging field.
ABSTRACT
Machine learning (ML) has enabled ground-breaking advances in the healthcare and pharmaceutical sectors, from improvements in cancer diagnosis to the identification of novel drugs and drug targets as well as protein structure prediction. Drug formulation is an essential stage in the discovery and development of new medicines. Through the design of drug formulations, pharmaceutical scientists can engineer important properties of new medicines, such as improved bioavailability and targeted delivery. The traditional approach to drug formulation development relies on iterative trial-and-error, requiring a large number of resource-intensive and time-consuming in vitro and in vivo experiments. This review introduces the basic concepts of ML-directed workflows and discusses how these tools can be used to aid in the development of various types of drug formulations. ML-directed drug formulation development offers unparalleled opportunities to fast-track development efforts, uncover new materials and innovative formulations, and generate new knowledge in drug formulation science. The review also highlights the latest artificial intelligence (AI) technologies, such as generative models, Bayesian deep learning, reinforcement learning, and self-driving laboratories, which have been gaining momentum in drug discovery and chemistry and have potential in drug formulation development.
Subjects
Drug Compounding/methods, Machine Learning, Animals, Drug Delivery Systems, Drug Development/methods, Humans
ABSTRACT
Numerous challenges in science and engineering can be framed as optimization tasks, including the maximization of reaction yields, the optimization of molecular and materials properties, and the fine-tuning of automated hardware protocols. Design of experiment and optimization algorithms are often adopted to solve these tasks efficiently. Increasingly, these experiment planning strategies are coupled with automated hardware to enable autonomous experimental platforms. The vast majority of the strategies used, however, do not consider robustness against the variability of experiment and process conditions. In fact, it is generally assumed that these parameters are exact and reproducible. Yet some experiments may have considerable noise associated with some of their conditions, and process parameters optimized under precise control may be applied in the future under variable operating conditions. In either scenario, the optimal solutions found might not be robust against input variability, affecting the reproducibility of results and returning suboptimal performance in practice. Here, we introduce Golem, an algorithm that is agnostic to the choice of experiment planning strategy and that enables robust experiment and process optimization. Golem identifies optimal solutions that are robust to input uncertainty, thus ensuring the reproducible performance of optimized experimental protocols and processes. It can be used to analyze the robustness of past experiments, or to guide experiment planning algorithms toward robust solutions on the fly. We assess the performance and domain of applicability of Golem through extensive benchmark studies and demonstrate its practical relevance by optimizing an analytical chemistry protocol under the presence of significant noise in its experimental conditions.
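The notion of an optimum that is robust to input variability can be made concrete with a small sketch. This is not Golem's algorithm, which the abstract does not describe. It is a plain Monte Carlo estimate of the expected objective under Gaussian input noise, applied to a hypothetical one-dimensional objective with a tall narrow peak and a lower broad one. The objective, noise level, and grid are all assumptions.

```python
import numpy as np

def objective(x):
    """Hypothetical objective to maximize: narrow peak at 0.2, broad peak at 0.7."""
    narrow = 1.0 * np.exp(-((x - 0.2) / 0.02) ** 2)
    broad = 0.8 * np.exp(-((x - 0.7) / 0.2) ** 2)
    return narrow + broad

def robust_objective(x, input_std, n_samples=2000, seed=0):
    """Monte Carlo estimate of E[f(x + noise)], noise ~ Normal(0, input_std)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, input_std, size=n_samples)
    # The same noise draws are reused for every x (common random numbers).
    return objective(x[:, None] + noise[None, :]).mean(axis=1)

grid = np.linspace(0.0, 1.0, 201)
nominal_best = grid[np.argmax(objective(grid))]
robust_best = grid[np.argmax(robust_objective(grid, input_std=0.1))]

if __name__ == "__main__":
    print("nominal optimum:", nominal_best)
    print("robust optimum: ", robust_best)
```

In this toy case the nominal optimum sits on the narrow peak, while the noise-averaged objective favours the broad peak, which is the kind of trade-off the abstract describes between peak performance under exact conditions and reproducible performance under variable ones.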
ABSTRACT
The accurate calculation of the binding free energy for arbitrary ligand-protein pairs is a considerable challenge in computer-aided drug discovery. Recently, it has been demonstrated that current state-of-the-art molecular dynamics (MD) based methods are capable of making highly accurate predictions. Conventional MD-based approaches rely on the first principles of statistical mechanics and assume equilibrium sampling of the phase space. In the current work we demonstrate that accurate absolute binding free energies (ABFE) can also be obtained via theoretically rigorous non-equilibrium approaches. Our investigation of ligands binding to bromodomains and T4 lysozyme reveals that both equilibrium and non-equilibrium approaches converge to the same results. The non-equilibrium approach achieves the same level of accuracy and convergence as an equilibrium free energy perturbation (FEP) method enhanced by Hamiltonian replica exchange. We also compare uni- and bi-directional non-equilibrium approaches and demonstrate that considering the work distributions from both forward and reverse directions provides substantial accuracy gains. In summary, non-equilibrium ABFE calculations are shown to yield reliable and well-converged estimates of protein-ligand binding affinity.
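The contrast between uni- and bi-directional estimation can be sketched on synthetic data. The abstract does not name the estimators used, so the sketch below assumes two standard ones: the Jarzynski exponential average for forward work only, and the Bennett acceptance ratio (BAR) for combined forward and reverse work. Work values are in units of kT and are drawn from Gaussians constructed to satisfy the Crooks relation. Function names and parameters are illustrative.

```python
import numpy as np

def jarzynski(work):
    """Unidirectional estimate dG = -ln <exp(-W)>, with W in units of kT."""
    work = np.asarray(work, dtype=float)
    return -(np.logaddexp.reduce(-work) - np.log(len(work)))

def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    return np.exp(-np.logaddexp(0.0, x))

def bar(work_forward, work_reverse, n_iter=200):
    """Bidirectional BAR estimate, solved by bisection on the implicit equation."""
    wf = np.asarray(work_forward, dtype=float)
    wr = np.asarray(work_reverse, dtype=float)
    m = np.log(len(wf) / len(wr))

    def g(dg):
        # Monotonically increasing in dg; its root is the BAR estimate.
        return fermi(m + wf - dg).sum() - fermi(-m + wr + dg).sum()

    lo = min(wf.min(), (-wr).min()) - 10.0
    hi = max(wf.max(), (-wr).max()) + 10.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic, Crooks-consistent Gaussian work distributions (an assumption).
dg_true, sigma, n = 3.0, 1.5, 5000
rng = np.random.default_rng(0)
w_forward = rng.normal(dg_true + sigma**2 / 2, sigma, size=n)
w_reverse = rng.normal(-dg_true + sigma**2 / 2, sigma, size=n)

if __name__ == "__main__":
    print("Jarzynski (forward only):", jarzynski(w_forward))
    print("BAR (both directions):   ", bar(w_forward, w_reverse))
```

The exponential average is dominated by rare low-work trajectories, so its error grows quickly as the work distribution broadens, whereas BAR uses the overlap between the forward and reverse distributions.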
ABSTRACT
Introduction: Computational modeling has rapidly advanced over the last decades. Recently, machine learning has emerged as a powerful and cost-effective strategy to learn from existing datasets and perform predictions on unseen molecules. Accordingly, the explosive rise of data-driven techniques raises an important question: What confidence can be assigned to molecular property predictions and what techniques can be used?
Areas covered: The authors discuss popular strategies for predicting molecular properties, their corresponding uncertainty sources and methods to quantify uncertainty. First, the authors' considerations for assessing confidence begin with dataset bias and size, data-driven property prediction and feature design. Next, the authors discuss property simulation via computations of binding affinity in detail. Lastly, they investigate how these uncertainties propagate to generative models, as they are usually coupled with property predictors.
Expert opinion: Computational techniques are paramount to reduce the prohibitive cost of brute-force experimentation during exploration. The authors believe that assessing uncertainty in property prediction models is essential whenever closed-loop drug design campaigns relying on high-throughput virtual screening are deployed. Accordingly, considering sources of uncertainty leads to better-informed validations, more reliable predictions and more realistic expectations of the entire workflow. Overall, this increases confidence in the predictions and, ultimately, accelerates drug design.
Subjects
Drug Design, Machine Learning, Computer Simulation, Humans, Uncertainty
ABSTRACT
G-protein coupled receptors (GPCRs) are the largest superfamily of membrane proteins, regulating almost every aspect of cellular activity and serving as key targets for drug discovery. We have identified an accurate and reliable computational method to characterize the strength and chemical nature of the interhelical interactions between the residues of transmembrane (TM) domains during different receptor activation states, something that cannot be characterized solely by visual inspection of structural information. Using the fragment molecular orbital (FMO) quantum mechanics method to analyze 35 crystal structures representing different branches of the class A GPCR family, we have identified 69 topologically equivalent TM residues that form a consensus network of 51 inter-TM interactions, providing novel results that are consistent with and help to rationalize experimental data. This discovery establishes a comprehensive picture of how defined molecular forces govern specific interhelical interactions which, in turn, support the structural stability, ligand binding, and activation of GPCRs.
Subjects
G-Protein-Coupled Receptors/chemistry, Ligands, Protein Binding, Protein Conformation, Quantum Theory
ABSTRACT
Molecular dynamics based free energy calculations allow for a robust and accurate evaluation of free energy changes upon amino acid mutation in proteins. In this chapter we cover the basic theoretical concepts important for the use of calculations utilizing the non-equilibrium alchemical switching methodology. We further provide a detailed step-by-step protocol for estimating the effect of a single amino acid mutation on protein thermostability. In addition, the potential caveats and solutions to some frequently encountered issues concerning the non-equilibrium alchemical free energy calculations are discussed. The protocol comprises details for the hybrid structure/topology generation required for alchemical transitions, equilibrium simulation setup, and description of the fast non-equilibrium switching. Subsequently, the analysis of the obtained results is described. The steps in the protocol are complemented with an illustrative practical application: a destabilizing mutation in the Trp cage mini protein. The concepts that are described are generally applicable. The shown example makes use of the pmx software package for the free energy calculations using Gromacs as a molecular dynamics engine. Finally, we discuss how the current protocol can readily be adapted to carry out charge-changing or multiple mutations at once, as well as large-scale mutational scans.