1.
bioRxiv ; 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38586024

ABSTRACT

The engineering of novel protein-ligand binding interactions, particularly for complex drug-like molecules, is an unsolved problem which could enable many practical applications of protein biosensors. In this work, we analyzed two engineered biosensors, derived from the plant hormone sensor PYR1, to recognize either the agrochemical mandipropamid or the synthetic cannabinoid WIN55,212-2. Using a combination of quantitative deep mutational scanning experiments and molecular dynamics simulations, we demonstrated that mutations at common positions can promote protein-ligand shape complementarity and revealed prominent differences in the electrostatic networks needed to complement diverse ligands. MD simulations indicate that both PYR1 protein-ligand complexes bind a single conformer of their target ligand that is close to the lowest free energy conformer. Computational design using a fixed conformer and rigid body orientation led to new WIN55,212-2 sensors with nanomolar limits of detection. This work reveals mechanisms by which the versatile PYR1 biosensor scaffold can bind diverse ligands. This work also provides computational methods to sample realistic ligand conformers and rigid body alignments that simplify the computational design of biosensors for novel ligands of interest.

2.
Biophys J ; 123(6): 703-717, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38356260

ABSTRACT

Liquid-liquid phase separation (LLPS) is thought to be a main driving force in the formation of membraneless organelles. Examples of such organelles include the centrosome, central spindle, and stress granules. Recently, it has been shown that coiled-coil (CC) proteins, such as the centrosomal proteins pericentrin, spd-5, and centrosomin, might be capable of LLPS. CC domains have physical features that could make them the drivers of LLPS, but it is unknown if they play a direct role in the process. We developed a coarse-grained simulation framework for investigating the LLPS propensity of CC proteins, in which interactions that support LLPS arise solely from CC domains. We show, using this framework, that the physical features of CC domains are sufficient to drive LLPS of proteins. The framework is specifically designed to investigate how the number of CC domains, as well as the multimerization state of CC domains, can affect LLPS. We show that small model proteins with as few as two CC domains can phase separate. Increasing the number of CC domains up to four per protein can somewhat increase LLPS propensity. We demonstrate that trimer-forming and tetramer-forming CC domains have a dramatically higher LLPS propensity than dimer-forming coils, which shows that multimerization state has a greater effect on LLPS than the number of CC domains per protein. These data support the hypothesis of CC domains as drivers of protein LLPS, and have implications in future studies to identify the LLPS-driving regions of centrosomal and central spindle proteins.


Subjects
Intrinsically Disordered Proteins, Intrinsically Disordered Proteins/metabolism, Phase Separation, Protein Domains, Organelles/metabolism
3.
J Chem Inf Model ; 64(4): 1290-1305, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38303159

ABSTRACT

Polymer and chemically modified biopolymer systems present unique challenges to traditional molecular simulation preparation workflows. First, typical polymer and biomolecular input formats, such as Protein Data Bank (PDB) files, lack adequate chemical information needed for the parameterization of new chemistries. Second, polymers are typically too large for accurate partial charge generation methods. In this work, we employ direct chemical perception through the Open Force Field toolkit to create a flexible polymer simulation workflow for organic polymers, encompassing everything from biopolymers to soft materials. We propose and test a new input specification for monomer information that can, along with a 3D conformational geometry, parametrize and simulate most soft-material systems within the same workflow used for smaller ligands. The monomer format encompasses a subset of the SMIRKS substructure query language to uniquely identify chemical information and repeating charges in underspecified systems through matching atomic connectivity. This workflow is combined with several different approaches for automatic partial-charge generation for larger systems. As an initial proof of concept, a variety of diverse polymeric systems were parametrized with the Open Force Field toolkit, including functionalized proteins, DNA, homopolymers, cross-linked systems, and sugars. Additionally, shape properties and radial distribution functions were computed from molecular dynamics simulations of poly(ethylene glycol), polyacrylamide, and poly(N-isopropylacrylamide) homopolymers in aqueous solution and compared to previous simulation results in order to demonstrate a start-to-finish workflow for simulation and property prediction. We expect that these tools will greatly expedite the day-to-day computational research of soft-matter simulations and create a robust atomic-scale polymer specification in conjunction with existing polymer structural notations.


Subjects
Molecular Dynamics Simulation, Polymers, Polymers/chemistry, Biopolymers, Proteins/chemistry, Molecular Conformation
4.
Nat Commun ; 14(1): 7973, 2023 Dec 02.
Article in English | MEDLINE | ID: mdl-38042897

ABSTRACT

Membraneless liquid compartments based on phase-separating biopolymers have been observed in diverse cell types and attributed to weak multivalent interactions predominantly based on intrinsically disordered domains. The design of liquid-liquid phase separated (LLPS) condensates based on de novo designed tunable modules that interact in a well-understood, controllable manner could improve our understanding of this phenomenon and enable the introduction of new features. Here we report the construction of CC-LLPS in mammalian cells, based on designed coiled-coil (CC) dimer-forming modules, where the stability of CC pairs, their number, linkers, and sequential arrangement govern the transition between diffuse, liquid and immobile condensates and are corroborated by coarse-grained molecular simulations. Through modular design, we achieve multiple coexisting condensates, chemical regulation of LLPS, condensate fusion, formation from either one or two polypeptide components or LLPS regulation by a third polypeptide chain. These findings provide further insights into the principles underlying LLPS formation and a design platform for controlling biological processes.


Subjects
Intrinsically Disordered Proteins, Peptides, Animals, Intrinsically Disordered Proteins/metabolism, Mammals/metabolism
5.
J Phys Chem B ; 127(39): 8305-8316, 2023 Oct 05.
Article in English | MEDLINE | ID: mdl-37729547

ABSTRACT

Protein tyrosine phosphatases (PTPs) are emerging drug targets for many diseases, including cancer, autoimmunity, and neurological disorders. A high degree of structural similarity between their catalytic domains, however, has hindered the development of selective pharmacological agents. Our previous research uncovered two unfunctionalized terpenoid inhibitors that selectively inhibit PTP1B over T-cell PTP (TCPTP), two PTPs with high sequence conservation. Here, we use molecular modeling, with supporting experimental validation, to study the molecular basis of this unusual selectivity. Molecular dynamics (MD) simulations suggest that PTP1B and TCPTP share a hydrogen-bond network that connects the active site to a distal allosteric pocket; this network stabilizes the closed conformation of the catalytically essential WPD loop, which it links to the L-11 loop and neighboring α3 and α7 helices on the other side of the catalytic domain. Terpenoid binding to either of two proximal C-terminal sites (an α site and a β site) can disrupt the allosteric network; however, binding to the α site forms a stable complex only in PTP1B. In TCPTP, two charged residues disfavor binding at the α site in favor of binding at the β site, which is conserved between the two proteins. Our findings thus indicate that minor amino acid differences at the poorly conserved α site enable selective binding, a property that might be enhanced with chemical elaboration, and illustrate more broadly how minor differences in the conservation of neighboring, yet functionally similar, allosteric sites can affect the selectivity of inhibitory scaffolds (e.g., fragments).


Subjects
Molecular Dynamics Simulation, T-Lymphocytes, T-Lymphocytes/metabolism, Catalytic Domain, Allosteric Site, Protein Secondary Structure, Protein Tyrosine Phosphatases/chemistry, Enzyme Inhibitors/chemistry
6.
bioRxiv ; 2023 Jul 24.
Article in English | MEDLINE | ID: mdl-37398035

ABSTRACT

Liquid-liquid phase separation (LLPS) is thought to be a main driving force in the formation of membraneless organelles. Examples of such organelles include the centrosome, central spindle, and stress granules. Recently, it has been shown that coiled-coil (CC) proteins, such as the centrosomal proteins pericentrin, spd-5, and centrosomin, might be capable of LLPS. CC domains have physical features that could make them the drivers of LLPS, but it is unknown if they play a direct role in the process. We developed a coarse-grained simulation framework for investigating the LLPS propensity of CC proteins, in which interactions that support LLPS arise solely from CC domains. We show, using this framework, that the physical features of CC domains are sufficient to drive LLPS of proteins. The framework is specifically designed to investigate how the number of CC domains, as well as the multimerization state of CC domains, can affect LLPS. We show that small model proteins with as few as two CC domains can phase separate. Increasing the number of CC domains up to four per protein can somewhat increase LLPS propensity. We demonstrate that trimer-forming and tetramer-forming CC domains have a dramatically higher LLPS propensity than dimer-forming coils, which shows that multimerization state has a greater effect on LLPS than the number of CC domains per protein. These data support the hypothesis of CC domains as drivers of protein LLPS, and have implications in future studies to identify the LLPS-driving regions of centrosomal and central spindle proteins.

7.
Protein Sci ; 32(8): e4719, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37402140

ABSTRACT

Neutral mutational drift is an important source of biological diversity that remains underexploited in fundamental studies of protein biophysics. This study uses a synthetic transcriptional circuit to study neutral drift in protein tyrosine phosphatase 1B (PTP1B), a mammalian signaling enzyme for which conformational changes are rate limiting. Kinetic assays of purified mutants indicate that catalytic activity, rather than thermodynamic stability, guides enrichment under neutral drift, where neutral or mildly activating mutations can mitigate the effects of deleterious ones. In general, mutants show a moderate activity-stability tradeoff, an indication that minor improvements in the activity of PTP1B do not require concomitant losses in its stability. Multiplexed sequencing of large mutant pools suggests that substitutions at allosterically influential sites are purged under biological selection, which enriches for mutations located outside of the active site. Findings indicate that the positional dependence of neutral mutations within drifting populations can reveal the presence of allosteric networks and illustrate an approach for using synthetic transcriptional systems to explore these mutations in regulatory enzymes.


Subjects
Mammals, Proteins, Animals, Mutation, Catalytic Domain, Allosteric Site
8.
Digit Discov ; 2(3): 828-847, 2023 Jun 12.
Article in English | MEDLINE | ID: mdl-37312680

ABSTRACT

Accurate representations of van der Waals dispersion-repulsion interactions play an important role in high-quality molecular dynamics simulations. Training the force field parameters used in the Lennard-Jones (LJ) potential typically used to represent these interactions is challenging, generally requiring adjustment based on simulations of macroscopic physical properties. The large computational expense of these simulations, especially when many parameters must be trained simultaneously, limits the size of the training data set and the number of optimization steps that can be taken, often requiring modelers to perform optimizations within a local parameter region. To allow for more global LJ parameter optimization against large training sets, we introduce a multi-fidelity optimization technique which uses Gaussian process surrogate modeling to build inexpensive models of physical properties as a function of LJ parameters. This approach allows for fast evaluation of approximate objective functions, greatly accelerating searches over parameter space and enabling the use of optimization algorithms capable of searching more globally. In this study, we use an iterative framework which performs global optimization with differential evolution at the surrogate level, followed by validation at the simulation level and surrogate refinement. Using this technique on two previously studied training sets, containing up to 195 physical property targets, we refit a subset of the LJ parameters for the OpenFF 1.0.0 (Parsley) force field. We demonstrate that this multi-fidelity technique can find improved parameter sets compared to a purely simulation-based optimization by searching more broadly and escaping local minima. Additionally, this technique often finds significantly different parameter minima that have comparably accurate performance. In most cases, these parameter sets are transferable to other similar molecules in a test set. Our multi-fidelity technique provides a platform for rapid, more global optimization of molecular models against physical properties, as well as a number of opportunities for further refinement of the technique.
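
For readers unfamiliar with the LJ parameters being refit in this abstract, the following is a minimal sketch of the standard 12-6 Lennard-Jones pair potential, written as a function of the well depth `epsilon` and the size parameter `sigma`. It is an illustration only and is not taken from the paper; the function names and the numerical values in the usage note are hypothetical.

```python
import math


def lj_energy(r, epsilon, sigma):
    """Standard 12-6 Lennard-Jones pair energy at separation r.

    epsilon is the well depth and sigma is the separation at which
    the energy crosses zero. Units are whatever the caller uses.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_minimum_distance(sigma):
    """Separation at which the 12-6 potential reaches its minimum."""
    return 2.0 ** (1.0 / 6.0) * sigma
```

With illustrative values such as `epsilon = 0.5` and `sigma = 3.4`, the energy is zero at `r = sigma` and equals `-epsilon` at `r = lj_minimum_distance(sigma)`, which is why these two parameters are the natural targets of a refit.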

9.
bioRxiv ; 2023 Apr 18.
Article in English | MEDLINE | ID: mdl-37131728

ABSTRACT

Protein tyrosine phosphatases (PTPs) are emerging drug targets for many diseases, including type 2 diabetes, obesity, and cancer. However, a high degree of structural similarity between the catalytic domains of these enzymes has made the development of selective pharmacological inhibitors an enormous challenge. Our previous research uncovered two unfunctionalized terpenoid inhibitors that selectively inhibit PTP1B over TCPTP, two PTPs with high sequence conservation. Here, we use molecular modeling with experimental validation to study the molecular basis of this unusual selectivity. Molecular dynamics (MD) simulations indicate that PTP1B and TCPTP contain a conserved hydrogen-bond network that connects the active site to a distal allosteric pocket; this network stabilizes the closed conformation of the catalytically influential WPD loop, which it links to the L-11 loop and the α3 and α7 helices on the C-terminal side of the catalytic domain. Terpenoid binding to either of two proximal allosteric sites (an α site and a β site) can disrupt the allosteric network. Interestingly, binding to the α site forms a stable complex with only PTP1B; in TCPTP, where two charged residues disfavor binding at the α site, the terpenoids bind to the β site, which is conserved between the two proteins. Our findings indicate that minor amino acid differences at the poorly conserved α site enable selective binding, a property that might be enhanced with chemical elaboration, and illustrate, more broadly, how minor differences in the conservation of neighboring, yet functionally similar, allosteric sites can have very different implications for inhibitor selectivity.

10.
J Chem Theory Comput ; 19(11): 3251-3275, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37167319

ABSTRACT

We introduce the Open Force Field (OpenFF) 2.0.0 small molecule force field for drug-like molecules, code-named Sage, which builds upon our previous iteration, Parsley. OpenFF force fields are based on direct chemical perception, which generalizes easily to highly diverse sets of chemistries based on substructure queries. Like the previous OpenFF iterations, the Sage generation of OpenFF force fields was validated in protein-ligand simulations to be compatible with AMBER biopolymer force fields. In this work, we detail the methodology used to develop this force field, as well as the innovations and improvements introduced since the release of Parsley 1.0.0. One particularly significant feature of Sage is a set of improved Lennard-Jones (LJ) parameters retrained against condensed phase mixture data, the first refit of LJ parameters in the OpenFF small molecule force field line. Sage also includes valence parameters refit to a larger database of quantum chemical calculations than previous versions, as well as improvements in how this fitting is performed. Force field benchmarks show improvements in general metrics of performance against quantum chemistry reference data such as root-mean-square deviations (RMSD) of optimized conformer geometries, torsion fingerprint deviations (TFD), and improved relative conformer energetics (ΔΔE). We present a variety of benchmarks for these metrics against our previous force fields and, in some cases, other small molecule force fields. Sage also demonstrates improved performance in estimating physical properties, including comparison against experimental data from various thermodynamic databases for small molecule properties such as ΔHmix, ρ(x), ΔGsolv, and ΔGtrans. Additionally, we benchmarked against protein-ligand binding free energies (ΔGbind), where Sage yields results statistically similar to previous force fields. All the data is made publicly available along with complete details on how to reproduce the training results at https://github.com/openforcefield/openff-sage.


Subjects
Benchmarking, Proteins, Ligands, Proteins/chemistry, Thermodynamics, Entropy
11.
J Chem Theory Comput ; 19(6): 1805-1817, 2023 Mar 28.
Article in English | MEDLINE | ID: mdl-36853624

ABSTRACT

Performing alchemical transformations, in which one molecular system is nonphysically changed to another system, is a popular approach adopted in performing free energy calculations associated with various biophysical processes, such as protein-ligand binding or the transfer of a molecule between environments. While the sampling of alchemical intermediate states in either parallel (e.g., Hamiltonian replica exchange) or serial manner (e.g., expanded ensemble) can bridge the high-probability regions in the configurational space between two end states of interest, alchemical methods can fail in scenarios where the most important slow degrees of freedom in the configurational space are, in large part, orthogonal to the alchemical variable, or if the system gets trapped in a deep basin extending in both the configurational and alchemical space. To alleviate these issues, we propose to use alchemical variables as an additional dimension in metadynamics, making it possible to both sample collective variables and to enhance sampling in free energy calculations simultaneously. In this study, we validate our implementation of "alchemical metadynamics" in PLUMED with test systems and alchemical processes with varying complexities and dimensionalities of collective variable space, including the interconversion between the torsional metastable states of a toy system and the methylation of a nucleoside both in the isolated form and in a duplex. We show that multidimensional alchemical metadynamics can address the challenges mentioned above and further accelerate sampling by introducing configurational collective variables. The method can trivially be combined with other metadynamics-based algorithms implemented in PLUMED. The necessary PLUMED code changes have already been released for general use in PLUMED 2.8.

12.
J Phys Chem B ; 126(48): 10098-10110, 2022 Dec 08.
Article in English | MEDLINE | ID: mdl-36417348

ABSTRACT

Amphiphilic monomers in polar solvents can self-assemble into lyotropic liquid crystal (LLC) bicontinuous cubic structures under the right composition and temperature conditions. After cross-linking, the resulting polymer membranes with three-dimensional (3D) continuous uniform channels are excellent candidates for filtration applications. Designing such membranes with the desired physical and chemical properties requires molecular-level understanding of the structure, which can be obtained through molecular modeling. However, building molecular models of bicontinuous cubic structures is challenging due to their narrow regime of stability and the difficulty of self-assembly of large unit cells in molecular simulations. We developed a protocol for building stable bicontinuous cubic unit cells involving both parameterization and assembly of the components. We validate the theoretical structure against experimental results for one such LLC monomer and provide insight into the structure missing in experimental data, as well as demonstrate the qualitative nature of water and solute transport through these membranes.


Subjects
Liquid Crystals
13.
Article in English | MEDLINE | ID: mdl-36337282

ABSTRACT

Molecular simulations such as molecular dynamics (MD) and Monte Carlo (MC) simulations are powerful tools allowing the prediction of experimental observables in the study of systems such as proteins, membranes, and polymeric materials. The quality of predictions based on molecular simulations depends on the validity of the underlying physical assumptions. physical_validation allows users of molecular simulation programs to perform simple yet powerful tests of physical validity on their systems and setups. It can also be used by molecular simulation package developers to run representative test systems during development, increasing code correctness. The theoretical foundation of the physical validation tests was established by Merz & Shirts (2018), in which the physical_validation package was first mentioned.

14.
J Phys Chem B ; 126(42): 8427-8438, 2022 Oct 27.
Article in English | MEDLINE | ID: mdl-36223525

ABSTRACT

Protein tyrosine phosphatases (PTPs) are promising drug targets for treating a wide range of diseases such as diabetes, cancer, and neurological disorders, but their conserved active sites have complicated the design of selective therapeutics. This study examines the allosteric inhibition of PTP1B by amorphadiene (AD), a terpenoid hydrocarbon that is an unusually selective inhibitor. Molecular dynamics (MD) simulations carried out in this study suggest that AD can stably sample multiple neighboring sites on the allosterically influential C-terminus of the catalytic domain. Binding to these sites requires a disordered α7 helix, which stabilizes the PTP1B-AD complex and may contribute to the selectivity of AD for PTP1B over TCPTP. Intriguingly, the binding mode of AD differs from that of the most well-studied allosteric inhibitor of PTP1B. Indeed, biophysical measurements and MD simulations indicate that the two molecules can bind simultaneously. Upon binding, both inhibitors destabilize the α7 helix by disrupting interactions at the α3-α7 interface and prevent the formation of hydrogen bonds that facilitate closure of the catalytically essential WPD loop. These findings indicate that AD is a promising scaffold for building allosteric inhibitors of PTP1B and illustrate, more broadly, how unfunctionalized terpenoids can engage in specific interactions with protein surfaces.


Subjects
Molecular Dynamics Simulation, Terpenes, Terpenes/pharmacology, Catalytic Domain, Hydrogen Bonding, Enzyme Inhibitors/chemistry
15.
J Chem Theory Comput ; 18(10): 6354-6369, 2022 Oct 11.
Article in English | MEDLINE | ID: mdl-36179376

ABSTRACT

Non-biological foldamers are a promising class of macromolecules that share similarities to classical biopolymers such as proteins and nucleic acids. Currently, designing novel foldamers is a non-trivial process, often involving many iterations of trial synthesis and characterization until folded structures are observed. In this work, we aim to tackle these foldamer design challenges using computational modeling techniques. We developed CG PyRosetta, an extension to the popular protein folding Python package, PyRosetta, which introduces coarse-grained (CG) residues into PyRosetta, enabling the folding of toy CG foldamer models. Although these models are simplified, they can help explore overarching physical hypotheses about how oligomers can form. Through systematic variation of CG parameters in these models, we can investigate various folding hypotheses at the CG scale to inform the design process of new foldamer chemistries. In this study, we demonstrate CG PyRosetta's ability to identify minimum energy structures with a diverse structural search over a range of simple models, as well as two hypothesis-driven parameter scans investigating the effects of side-chain size and internal backbone angle on secondary structures. We are able to identify several types of secondary structures from single- and double-helices to sheet-like and knot-like structures. We show how side-chain size and backbone bond angle both play an important role in the structure and energetics of these toy models. Optimal side-chain sizes promote favorable packing of side chains, while specific backbone bond angles influence the specific helix type found in folded structures.


Subjects
Nucleic Acids, Protein Folding, Molecular Models, Protein Secondary Structure, Proteins/chemistry
16.
J Comput Aided Mol Des ; 36(4): 313-328, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35507105

ABSTRACT

Insulin has been commonly adopted as a peptide drug to treat diabetes as it facilitates the uptake of glucose from the blood. The development of oral insulin has remained elusive for decades owing to its susceptibility to the enzymes in the gastrointestinal tract and poor permeability through the intestinal epithelium upon dimerization. Recent experimental studies have revealed that certain O-linked glycosylation patterns could enhance insulin's proteolytic stability and reduce its dimerization propensity, but understanding such phenomena at the molecular level is still difficult. To address this challenge, we proposed and tested several structural determinants that could potentially influence insulin's proteolytic stability and dimerization propensity. We used these metrics to assess the properties of interest from [Formula: see text] aggregate molecular dynamics of each of 12 targeted insulin glyco-variants from multiple wild-type crystal structures. We found that glycan-involved hydrogen bonds and glycan-dimer occlusion were useful metrics predicting the proteolytic stability and dimerization propensity of insulin, respectively, as was in part the solvent-accessible surface area of proteolytic sites. However, other plausible metrics were not generally predictive. This work helps better explain how O-linked glycosylation influences the proteolytic stability and monomeric propensity of insulin, illuminating a path towards rational molecular design of insulin glycoforms.


Subjects
Insulin, Molecular Dynamics Simulation, Dimerization, Insulin/analogs & derivatives, Insulin/chemistry, Polysaccharides
17.
J Chem Theory Comput ; 18(6): 3566-3576, 2022 Jun 14.
Article in English | MEDLINE | ID: mdl-35507313

ABSTRACT

Developing accurate classical force field representations of molecules is key to realizing the full potential of molecular simulations, both as a powerful route to gaining fundamental insights into a broad spectrum of chemical and biological phenomena and for predicting physicochemical and mechanical properties of substances. The Open Force Field Consortium is an industry-funded open science effort to this end, developing open-source tools for rapidly generating new high-quality small-molecule force fields. An integral aspect of this is the parameterization and assessment of force fields against high-quality, condensed-phase physical property data, curated from open data sources such as the NIST ThermoML Archive, alongside quantum chemical data. The quantity of such experimental data in open data archives alone would require an onerous amount of human and computational resources to both curate and estimate manually, especially when estimations must be obtained for numerous sets of force field parameters. Here, we present an entirely automated, highly scalable framework for evaluating physical properties and their gradients in terms of force field parameters. It is written as a modular and extensible Python framework, which employs an intelligent multiscale estimation approach that allows for the automated estimation of properties from simulation and cached simulation data, and a pluggable API for estimation of new properties. In this study, we demonstrate the utility of the framework by benchmarking the OpenFF 1.0.0 small-molecule force field and GAFF 1.8 and GAFF 2.1 force fields against a test set of binary density and enthalpy of mixing measurements curated using the framework utilities. Further, we demonstrate the framework's utility as part of force field optimization by using it alongside ForceBalance, a framework for systematic force field optimization, to retrain a set of nonbonded van der Waals parameters against a training set of density and enthalpy of vaporization measurements.


Subjects
Thermodynamics, Computer Simulation, Humans
18.
J Chem Theory Comput ; 18(6): 3577-3592, 2022 Jun 14.
Article in English | MEDLINE | ID: mdl-35533269

ABSTRACT

Developing a sufficiently accurate classical force field representation of molecules is key to realizing the full potential of molecular simulations as a route to gaining a fundamental insight into a broad spectrum of chemical and biological phenomena. This is only possible, however, if the many complex interactions between molecules of different species in the system are accurately captured by the model. Historically, the intermolecular van der Waals (vdW) interactions have primarily been trained against densities and enthalpies of vaporization of pure (single-component) systems, with occasional usage of hydration free energies. In this study, we demonstrate how including physical property data of binary mixtures can better inform these parameters, encoding more information about the underlying physics of the system in complex chemical mixtures. To demonstrate this, we retrain a select number of Lennard-Jones parameters describing the vdW interactions of the OpenFF 1.0.0 (Parsley) fixed charge force field against training sets composed of densities and enthalpies of mixing for binary liquid mixtures as well as densities and enthalpies of vaporization of pure liquid systems and assess the performance of each of these combinations. We show that retraining against the mixture data improves the force field's ability to reproduce mixture properties, including solvation free energies, correcting some systematic errors that exist when training vdW interactions against properties of pure systems only.


Subjects
Thermodynamics
19.
J Chem Inf Model ; 62(4): 874-889, 2022 Feb 28.
Article in English | MEDLINE | ID: mdl-35129974

ABSTRACT

A high level of physical detail in a molecular model improves its ability to perform high accuracy simulations but can also significantly affect its complexity and computational cost. In some situations, it is worthwhile to add complexity to a model to capture properties of interest; in others, additional complexity is unnecessary and can make simulations computationally infeasible. In this work, we demonstrate the use of Bayesian inference for molecular model selection, using Monte Carlo sampling techniques accelerated with surrogate modeling to evaluate the Bayes factor evidence for different levels of complexity in the two-centered Lennard-Jones + quadrupole (2CLJQ) fluid model. Examining three nested levels of model complexity, we demonstrate that the use of variable quadrupole and bond length parameters in this model framework is justified only for some chemistries. Through this process, we also get detailed information about the distributions and correlation of parameter values, enabling improved parametrization and parameter analysis. We also show how the choice of parameter priors, which encode previous model knowledge, can have substantial effects on the selection of models, penalizing careless introduction of additional complexity. We detail the computational techniques used in this analysis, providing a roadmap for future applications of molecular model selection via Bayesian inference and surrogate modeling.
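
The role of the Bayes factor and of the parameter prior in this abstract can be illustrated with a toy comparison that has nothing to do with the 2CLJQ model itself. The sketch below compares a fixed-parameter model (mean pinned at zero) with a nested model whose mean has a uniform prior on `[-half_width, half_width]`, for Gaussian data with known unit variance. The paper uses Monte Carlo sampling with surrogate models; here, as a stated simplification, the evidence integral is computed with a plain midpoint rule. All function names and data values are hypothetical.

```python
import math


def likelihood(data, mu, sigma=1.0):
    """Gaussian likelihood of the data for mean mu and known sigma."""
    log_l = 0.0
    for x in data:
        log_l += -0.5 * ((x - mu) / sigma) ** 2
        log_l -= math.log(sigma * math.sqrt(2.0 * math.pi))
    return math.exp(log_l)


def evidence_fixed(data):
    """Evidence of the simpler model, which has no free parameter (mu = 0)."""
    return likelihood(data, 0.0)


def evidence_free(data, half_width, n_grid=2000):
    """Evidence of the model with mu ~ Uniform(-half_width, half_width).

    Midpoint-rule integral of likelihood times prior density.
    """
    step = 2.0 * half_width / n_grid
    total = 0.0
    for i in range(n_grid):
        mu = -half_width + (i + 0.5) * step
        total += likelihood(data, mu) * step
    return total / (2.0 * half_width)


def bayes_factor_fixed_over_free(data, half_width):
    """Values above 1 favor the simpler, fixed-parameter model."""
    return evidence_fixed(data) / evidence_free(data, half_width)
```

For data scattered around zero, the factor favors the simpler model, and it does so more strongly as the prior on the extra parameter is widened, which is the penalty on carelessly added complexity that the abstract describes. For data centered well away from zero, the factor drops far below 1 and the extra parameter is justified.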


Subjects
Bayes Theorem, Computer Simulation, Monte Carlo Method
20.
J Chem Theory Comput ; 17(10): 6262-6280, 2021 Oct 12.
Article in English | MEDLINE | ID: mdl-34551262

ABSTRACT

We present a methodology for defining and optimizing a general force field for classical molecular simulations, and we describe its use to derive the Open Force Field 1.0.0 small-molecule force field, codenamed Parsley. Rather than using traditional atom typing, our approach is built on the SMIRKS-native Open Force Field (SMIRNOFF) parameter assignment formalism, which handles increases in the diversity and specificity of the force field definition without needlessly increasing the complexity of the specification. Parameters are optimized with the ForceBalance tool, based on reference quantum chemical data that include torsion potential energy profiles, optimized gas-phase structures, and vibrational frequencies. These quantum reference data are computed and are maintained with QCArchive, an open-source and freely available distributed computing and database software ecosystem. In this initial application of the method, we present essentially a full optimization of all valence parameters and report tests of the resulting force field against compounds and data types outside the training set. These tests show improvements in optimized geometries and conformational energetics and demonstrate that Parsley's accuracy for liquid properties is similar to that of other general force fields, as is accuracy on binding free energies. We find that this initial Parsley force field affords accuracy similar to that of other general force fields when used to calculate relative binding free energies spanning 199 protein-ligand systems. Additionally, the resulting infrastructure allows us to rapidly optimize an entirely new force field with minimal human intervention.


Subjects
Benchmarking, Petroselinum, Ecosystem, Humans, Ligands, Molecular Conformation