Results 1 - 13 of 13
1.
Sci Data ; 8(1): 55, 2021 Feb 10.
Article in English | MEDLINE | ID: mdl-33568655

ABSTRACT

Advances in computational chemistry create an ongoing need for larger and higher-quality datasets that characterize noncovalent molecular interactions. We present three benchmark collections of quantum mechanical data, covering approximately 3,700 distinct types of interacting molecule pairs. The first collection, which we refer to as DES370K, contains interaction energies for more than 370,000 dimer geometries. These were computed using the coupled-cluster method with single, double, and perturbative triple excitations [CCSD(T)], which is widely regarded as the gold-standard method in electronic structure theory. Our second benchmark collection, a core representative subset of DES370K called DES15K, is intended for more computationally demanding applications of the data. Finally, DES5M, our third collection, comprises interaction energies for nearly 5,000,000 dimer geometries; these were calculated using SNS-MP2, a machine learning approach that provides results with accuracy comparable to that of our coupled-cluster training data. These datasets may prove useful in the development of density functionals, empirically corrected wavefunction-based approaches, semi-empirical methods, force fields, and models trained using machine learning methods.

2.
J Chem Phys ; 147(16): 161725, 2017 Oct 28.
Article in English | MEDLINE | ID: mdl-29096510

ABSTRACT

Noncovalent interactions are of fundamental importance across the disciplines of chemistry, materials science, and biology. Quantum chemical calculations on noncovalently bound complexes, which allow for the quantification of properties such as binding energies and geometries, play an essential role in advancing our understanding of, and building models for, a vast array of complex processes involving molecular association or self-assembly. Because of its relatively modest computational cost, second-order Møller-Plesset perturbation (MP2) theory is one of the most widely used methods in quantum chemistry for studying noncovalent interactions. MP2 is, however, plagued by serious errors due to its incomplete treatment of electron correlation, especially when modeling van der Waals interactions and π-stacked complexes. Here we present spin-network-scaled MP2 (SNS-MP2), a new semi-empirical MP2-based method for dimer interaction-energy calculations. To correct for errors in MP2, SNS-MP2 uses quantum chemical features of the complex under study in conjunction with a neural network to reweight terms appearing in the total MP2 interaction energy. The method has been trained on a new data set consisting of over 200 000 complete basis set (CBS)-extrapolated coupled-cluster interaction energies, which are considered the gold standard for chemical accuracy. SNS-MP2 predicts gold-standard binding energies of unseen test compounds with a mean absolute error of 0.04 kcal mol⁻¹ (root-mean-square error 0.09 kcal mol⁻¹), a 6- to 7-fold improvement over MP2. To the best of our knowledge, its accuracy exceeds that of all extant density functional theory- and wavefunction-based methods of similar computational cost, and is very close to the intrinsic accuracy of our benchmark coupled-cluster methodology itself. Furthermore, SNS-MP2 provides reliable per-conformation confidence intervals on the predicted interaction energies, a feature not available from any alternative method.

3.
PLoS Comput Biol ; 13(7): e1005659, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28746339

ABSTRACT

OpenMM is a molecular dynamics simulation toolkit with a unique focus on extensibility. It allows users to easily add new features, including forces with novel functional forms, new integration algorithms, and new simulation protocols. Those features automatically work on all supported hardware types (including both CPUs and GPUs) and perform well on all of them. In many cases they require minimal coding, just a mathematical description of the desired function. They also require no modification to OpenMM itself and can be distributed independently of OpenMM. This makes it an ideal tool for researchers developing new simulation methods, and also allows those new methods to be immediately available to the larger community.


Subject(s)
Algorithms , Computational Biology/methods , Molecular Dynamics Simulation , Software
4.
J Chem Phys ; 146(4): 044109, 2017 Jan 28.
Article in English | MEDLINE | ID: mdl-28147508

ABSTRACT

Reaction coordinates are widely used throughout chemical physics to model and understand complex chemical transformations. We introduce a definition of the natural reaction coordinate, suitable for condensed phase and biomolecular systems, as a maximally predictive one-dimensional projection. We then show that this criterion is uniquely satisfied by a dominant eigenfunction of an integral operator associated with the ensemble dynamics. We present a new sparse estimator for these eigenfunctions which can search through a large candidate pool of structural order parameters and build simple, interpretable approximations that employ only a small number of these order parameters. Example applications with a small molecule's rotational dynamics and simulations of protein conformational change and folding show that this approach can filter through statistical noise to identify simple reaction coordinates from complex dynamics.


Subject(s)
Biphenyl Compounds/chemistry , Molecular Dynamics Simulation , Protease Inhibitors/chemistry , Small Molecule Libraries/chemistry , Animals , Cattle , Protease Inhibitors/pharmacology , Trypsin/metabolism
5.
Biophys J ; 112(1): 10-15, 2017 Jan 10.
Article in English | MEDLINE | ID: mdl-28076801

ABSTRACT

MSMBuilder is a software package for building statistical models of high-dimensional time-series data. It is designed with a particular focus on the analysis of atomistic simulations of biomolecular dynamics such as protein folding and conformational change. MSMBuilder is named for its ability to construct Markov state models (MSMs), a class of models that has gained favor among computational biophysicists. In addition to both well-established and newer MSM methods, the package includes complementary algorithms for understanding time-series data such as hidden Markov models and time-structure based independent component analysis. MSMBuilder boasts an easy to use command-line interface, as well as clear and consistent abstractions through its Python application programming interface. MSMBuilder was developed with careful consideration for compatibility with the broader machine learning community by following the design of scikit-learn. The package is used primarily by practitioners of molecular dynamics, but is just as applicable to other computational or experimental time-series measurements.


Subject(s)
Models, Statistical , Molecular Dynamics Simulation , Software , CSK Tyrosine-Protein Kinase , Markov Chains , Protein Conformation , src-Family Kinases/chemistry , src-Family Kinases/metabolism
6.
J Chem Phys ; 145(19): 194103, 2016 Nov 21.
Article in English | MEDLINE | ID: mdl-27875868

ABSTRACT

As molecular dynamics simulations access increasingly longer time scales, complementary advances in the analysis of biomolecular time-series data are necessary. Markov state models offer a powerful framework for this analysis by describing a system's states and the transitions between them. A recently established variational theorem for Markov state models now enables modelers to systematically determine the best way to describe a system's dynamics. In the context of the variational theorem, we analyze ultra-long folding simulations for a canonical set of twelve proteins [K. Lindorff-Larsen et al., Science 334, 517 (2011)] by creating and evaluating many types of Markov state models. We present a set of guidelines for constructing Markov state models of protein folding; namely, we recommend the use of cross-validation and a kinetically motivated dimensionality reduction step for improved descriptions of folding dynamics. We also warn that precise kinetics predictions rely on the features chosen to describe the system and pose the description of kinetic uncertainty across ensembles of models as an open issue.
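
As a rough illustration of the model class this abstract discusses (not the authors' pipeline), a Markov state model can be estimated by counting transitions between discrete states at a fixed lag time, row-normalizing the counts, and converting the non-unit eigenvalues to implied timescales via t_i = -τ / ln λ_i. The toy trajectory, the function names, and the lag time below are assumptions made for this sketch.

```python
import numpy as np

def estimate_msm(dtraj, n_states, lag=1):
    """Row-normalized transition matrix from a discrete trajectory.

    This is the simplest (non-reversible) estimator; it assumes every
    state has at least one outgoing transition in the data.
    """
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def implied_timescales(T, lag=1):
    """Implied timescales t_i = -lag / ln(lambda_i) for eigenvalues in (0, 1)."""
    eigvals = np.sort(np.real(np.linalg.eigvals(T)))[::-1]
    slow = eigvals[1:]          # drop the stationary eigenvalue (lambda = 1)
    slow = slow[slow > 0]       # the logarithm is only defined for positive values
    return -lag / np.log(slow)

# Hypothetical two-state trajectory, one frame per time step.
dtraj = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
T = estimate_msm(dtraj, n_states=2, lag=1)
timescales = implied_timescales(T, lag=1)
```

In practice the state definitions, the lag time, and the estimator are all modeling choices, which is why the abstract recommends cross-validation when comparing models.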


Subject(s)
Markov Chains , Molecular Dynamics Simulation , Protein Folding , Algorithms , Kinetics , Protein Domains
7.
J Chem Theory Comput ; 12(2): 638-49, 2016 Feb 09.
Article in English | MEDLINE | ID: mdl-26683346

ABSTRACT

We describe a flexible and broadly applicable energy refinement method, "nebterpolation," for identifying and characterizing the reaction events in a molecular dynamics (MD) simulation. The new method is applicable to ab initio simulations with hundreds of atoms containing complex and multimolecular reaction events. A key aspect of nebterpolation is smoothing of the reactive MD trajectory in internal coordinates to initiate the search for the reaction path on the potential energy surface. We apply nebterpolation to analyze the reaction events in an ab initio nanoreactor simulation that discovers new molecules and mechanisms, including a C-C coupling pathway for glycolaldehyde synthesis. We find that the new method, which incorporates information from the MD trajectory that connects reactants with products, produces a dramatically distinct set of minimum energy paths compared to existing approaches that start from information for the reaction end points alone. The energy refinement method described here represents a key component of an emerging simulation paradigm where molecular dynamics simulations are applied to discover the possible reaction mechanisms.

8.
Biophys J ; 109(8): 1528-32, 2015 Oct 20.
Article in English | MEDLINE | ID: mdl-26488642

ABSTRACT

As molecular dynamics (MD) simulations continue to evolve into powerful computational tools for studying complex biomolecular systems, the necessity of flexible and easy-to-use software tools for the analysis of these simulations is growing. We have developed MDTraj, a modern, lightweight, and fast software package for analyzing MD simulations. MDTraj reads and writes trajectory data in a wide variety of commonly used formats. It provides a large number of trajectory analysis capabilities including minimal root-mean-square-deviation calculations, secondary structure assignment, and the extraction of common order parameters. The package has a strong focus on interoperability with the wider scientific Python ecosystem, bridging the gap between MD data and the rapidly growing collection of industry-standard statistical analysis and visualization tools in Python. MDTraj is a powerful and user-friendly software package that simplifies the analysis of MD data and connects these datasets with the modern interactive data science software ecosystem in Python.
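
To make "minimal root-mean-square deviation" concrete, here is a small NumPy sketch of the quantity: the RMSD between two conformations after removing the optimal translation and rotation, computed with the Kabsch algorithm. This is an illustration only and is not MDTraj's implementation; the function name and the test coordinates are assumptions.

```python
import numpy as np

def minimal_rmsd(P, Q):
    """RMSD between two (n_atoms, 3) arrays after optimal superposition (Kabsch)."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centering both structures on their centroids.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the 3x3 covariance matrix.
    U, _, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against improper rotations
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # rotates P onto Q
    diff = P @ R.T - Q
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))

# Hypothetical 4-atom structure, and a rotated plus translated copy of it.
P = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
Q = P @ Rz.T + np.array([5.0, -2.0, 1.0])
rmsd_same = minimal_rmsd(P, Q)               # rigid-body copy, so close to zero
rmsd_diff = minimal_rmsd(P, Q + np.array([[0.5, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]))
```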


Subject(s)
Molecular Dynamics Simulation , Software , Internet
9.
J Chem Phys ; 143(3): 034109, 2015 Jul 21.
Article in English | MEDLINE | ID: mdl-26203016

ABSTRACT

Continuous-time Markov processes over finite state-spaces are widely used to model dynamical processes in many fields of natural and social science. Here, we introduce a maximum likelihood estimator for constructing such models from data observed at a finite time interval. This estimator is dramatically more efficient than prior approaches, enables the calculation of deterministic confidence intervals in all model parameters, and can easily enforce important physical constraints on the models such as detailed balance. We demonstrate and discuss the advantages of these models over existing discrete-time Markov models for the analysis of molecular dynamics simulations.
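
The link between a continuous-time rate matrix K and what is observed at a finite interval τ is the matrix exponential, T(τ) = exp(Kτ). The sketch below works this out in closed form for a two-state system and checks detailed balance; it illustrates that relationship only and is not the estimator described in the abstract. The rates and the interval are hypothetical.

```python
import math

def two_state_transition_matrix(a, b, tau):
    """T(tau) = exp(K * tau) for the rate matrix K = [[-a, a], [b, -b]].

    a is the rate 0 -> 1, b is the rate 1 -> 0. The stationary distribution
    is pi = (b, a) / (a + b), and T(tau) = Pi + exp(-(a + b) * tau) * (I - Pi),
    where every row of Pi equals pi.
    """
    pi0, pi1 = b / (a + b), a / (a + b)
    decay = math.exp(-(a + b) * tau)
    return [[pi0 + pi1 * decay, pi1 * (1.0 - decay)],
            [pi0 * (1.0 - decay), pi1 + pi0 * decay]]

a, b, tau = 0.3, 0.1, 2.0            # hypothetical rates and observation interval
T = two_state_transition_matrix(a, b, tau)
pi = (b / (a + b), a / (a + b))
# Detailed balance at the level of the observed transition matrix.
flux_01 = pi[0] * T[0][1]
flux_10 = pi[1] * T[1][0]
```

A maximum likelihood estimator of this kind would search over K so that exp(Kτ) best explains the transition counts observed at interval τ.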


Subject(s)
Likelihood Functions , Markov Chains , Molecular Dynamics Simulation , Protein Folding , Time Factors , Uncertainty
10.
J Chem Phys ; 142(12): 124105, 2015 Mar 28.
Article in English | MEDLINE | ID: mdl-25833563

ABSTRACT

Markov state models are a widely used method for approximating the eigenspectrum of the molecular dynamics propagator, yielding insight into the long-timescale statistical kinetics and slow dynamical modes of biomolecular systems. However, the lack of a unified theoretical framework for choosing between alternative models has hampered progress, especially for non-experts applying these methods to novel biological systems. Here, we consider cross-validation with a new objective function for estimators of these slow dynamical modes, a generalized matrix Rayleigh quotient (GMRQ), which measures the ability of a rank-m projection operator to capture the slow subspace of the system. It is shown that a variational theorem bounds the GMRQ from above by the sum of the first m eigenvalues of the system's propagator, but that this bound can be violated when the requisite matrix elements are estimated subject to statistical uncertainty. This overfitting can be detected and avoided through cross-validation. These results make it possible to construct Markov state models for protein dynamics in a way that appropriately captures the tradeoff between systematic and statistical errors.
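
The variational bound can be checked numerically on a toy model. The sketch below assumes the common form of the quotient, Tr[(AᵀCA)(AᵀSA)⁻¹], with C = diag(π)T the time-lagged correlation matrix and S = diag(π) the overlap matrix of a reversible transition matrix T; the count matrix, the names, and the choice m = 2 are assumptions made for illustration. Because exact matrices are used here rather than noisy estimates, the bound holds; the overfitting described in the abstract arises only when C and S are estimated from finite data.

```python
import numpy as np

def gmrq(A, C, S):
    """Generalized matrix Rayleigh quotient Tr[(A^T C A)(A^T S A)^{-1}]."""
    return float(np.trace(A.T @ C @ A @ np.linalg.inv(A.T @ S @ A)))

# Hypothetical symmetric count matrix, which yields a reversible 4-state model.
counts = np.array([[90.0, 10.0, 0.0, 0.0],
                   [10.0, 80.0, 5.0, 0.0],
                   [0.0, 5.0, 70.0, 8.0],
                   [0.0, 0.0, 8.0, 60.0]])
C = counts / counts.sum()        # equals diag(pi) @ T, symmetric by construction
pi = C.sum(axis=1)               # stationary distribution
S = np.diag(pi)
T = C / pi[:, None]              # row-stochastic transition matrix

m = 2
w, V = np.linalg.eig(T)
order = np.argsort(np.real(w))[::-1]
bound = float(np.real(w[order[:m]]).sum())   # sum of the m largest eigenvalues

A_opt = np.real(V[:, order[:m]])             # top-m eigenvectors attain the bound
rng = np.random.default_rng(0)
A_rand = rng.normal(size=(4, m))             # an arbitrary rank-m trial basis

score_opt = gmrq(A_opt, C, S)
score_rand = gmrq(A_rand, C, S)
```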


Subject(s)
Models, Molecular , Alanine/chemistry , Algorithms , Cluster Analysis , Computer Simulation , Kinetics , Markov Chains , Peptides/pharmacology
11.
J Phys Chem B ; 118(24): 6475-81, 2014 Jun 19.
Article in English | MEDLINE | ID: mdl-24738580

ABSTRACT

Markov state models provide a powerful framework for the analysis of biomolecular conformation dynamics in terms of their metastable states and transition rates. These models provide both a quantitative and comprehensible description of the long-time scale dynamics of large molecular dynamics with a Master equation and have been successfully used to study protein folding, protein conformational change, and protein-ligand binding. However, to achieve satisfactory performance, existing methodologies often require expert intervention when defining the model's discrete state space. While standard model selection methodologies focus on the minimization of systematic bias and disregard statistical error, we show that by consideration of the states' conditional distribution over conformations, both sources of error can be balanced evenhandedly. Application of techniques that consider both systematic bias and statistical error on two 100 µs molecular dynamics trajectories of the Fip35 WW domain shows agreement with existing techniques based on self-consistency of the model's relaxation time scales with more suitable results in regimes in which those time scale-based techniques encourage overfitting. By removing the need for expert tuning, these methods should reduce modeling bias and lower the barriers to entry in Markov state model construction.


Subject(s)
Markov Chains , Algorithms , Molecular Dynamics Simulation
12.
J Chem Theory Comput ; 9(7): 2900-6, 2013 Jul 09.
Article in English | MEDLINE | ID: mdl-26583974

ABSTRACT

Statistical modeling of long timescale dynamics with Markov state models (MSMs) has been shown to be an effective strategy for building quantitative and qualitative insight into protein folding processes. Existing methodologies, however, rely on geometric clustering using distance metrics such as root mean square deviation (RMSD), assuming that geometric similarity provides an adequate basis for the kinetic partitioning of phase space. Here, inspired by advances in the machine learning community, we introduce a new approach for learning a distance metric explicitly constructed to model kinetic similarity. This approach enables the construction of models, especially in the regime of high anisotropy in the diffusion constant, with fewer states than was previously possible. Application of this technique to the analysis of two ultralong molecular dynamics simulations of the FiP35 WW domain identifies discrete near-native relaxation dynamics in the millisecond regime that were not resolved in previous analyses.

13.
ChemSusChem ; 4(2): 191-6, 2011 Feb 18.
Article in English | MEDLINE | ID: mdl-21328550

ABSTRACT

The reactivity of reduced pyridinium with CO₂ was investigated as a function of catalyst concentration, temperature, and pressure at platinum electrodes. Concentration experiments show that the catalytic current measured by cyclic voltammetry increases linearly with pyridinium and CO₂ concentrations; this indicates that the rate-determining step is first order in both. The formation of a carbamate intermediate is supported by the data presented. Increased electron density at the pyridyl nitrogen upon reduction, as calculated by DFT, favors a Lewis acid/base interaction between the nitrogen and the CO₂. The rate of the known side reaction, pyridinium coupling to form hydrogen, does not vary over the temperature range investigated and had a rate constant of 2.5 M⁻¹ s⁻¹. CO₂ reduction followed Arrhenius behavior and the activation energy determined by electrochemical simulation was (69±10) kJ mol⁻¹.
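
To give a sense of scale for the reported activation energy, the Arrhenius relation k = A·exp(−Eₐ/RT) implies a rate-constant ratio of exp[−(Eₐ/R)(1/T₂ − 1/T₁)] between two temperatures. The sketch below evaluates that ratio for Eₐ = 69 kJ mol⁻¹. The two temperatures are assumptions chosen for illustration, since the abstract does not state the range studied.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1

def arrhenius_ratio(ea_kj_per_mol, t1_kelvin, t2_kelvin):
    """Ratio k(T2) / k(T1) for a temperature-independent pre-exponential factor."""
    ea = ea_kj_per_mol * 1000.0  # convert kJ mol^-1 to J mol^-1
    return math.exp(-(ea / GAS_CONSTANT) * (1.0 / t2_kelvin - 1.0 / t1_kelvin))

# Hypothetical 10 K temperature increase near room temperature.
ratio = arrhenius_ratio(69.0, 298.15, 308.15)
# The same ratio at the edges of the reported +/- 10 kJ mol^-1 uncertainty.
ratio_low = arrhenius_ratio(59.0, 298.15, 308.15)
ratio_high = arrhenius_ratio(79.0, 298.15, 308.15)
```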


Subject(s)
Carbon Dioxide/chemistry , Pyridines/chemistry , Algorithms , Catalysis , Electrochemistry , Electrodes , Energy Transfer , Hydrogen/chemistry , Hydrogen-Ion Concentration , Kinetics , Nitrogen/chemistry , Oxidation-Reduction , Semiconductors , Temperature