RESUMO
Efficiently identifying the most important communities and key transition nodes in weighted and unweighted networks is a prevalent problem in a wide range of disciplines. Here, we focus on the optimal clustering using variational kinetic parameters, linked to Markov processes defined on the underlying networks, namely, the slowest relaxation time and the Kemeny constant. We derive novel relations in terms of mean first passage times for optimizing clustering via the Kemeny constant and show that the optimal clustering boundaries have equal round-trip times to the clusters they separate. We also propose an efficient method that first projects the network nodes onto a 1D reaction coordinate and subsequently performs a variational boundary search using a parallel tempering algorithm, where the variational kinetic parameters act as an energy function to be extremized. We find that maximization of the Kemeny constant is effective in detecting communities, while the slowest relaxation time allows for detection of transition nodes. We demonstrate the validity of our method on several test systems, including synthetic networks generated from the stochastic block model and real world networks (Santa Fe Institute collaboration network, a network of co-purchased political books, and a street network of multiple cities in Luxembourg). Our approach is compared with existing clustering algorithms based on modularity and the robust Perron cluster analysis, and the identified transition nodes are compared with different notions of node centrality.
RESUMO
Markov processes are widely used models for investigating kinetic networks. Here, we collate and present a variety of results pertaining to kinetic network models in a unified framework. The aim is to lay out explicit links between several important quantities commonly studied in the field, including mean first passage times (MFPTs), correlation functions, and the Kemeny constant. We provide new insights into (i) a simple physical interpretation of the Kemeny constant, (ii) a relationship to infer equilibrium distributions and rate matrices from measurements of MFPTs, and (iii) a protocol to reduce the dimensionality of kinetic networks based on specific requirements that the MFPTs in the coarse-grained system should satisfy. We prove that this protocol coincides with the one proposed by Hummer and Szabo [J. Phys. Chem. B 119, 9029 (2014)], and it leads to a variational principle for the Kemeny constant. Finally, we introduce a modification of this protocol, which preserves the Kemeny constant. Our work underpinning the theoretical aspects of kinetic networks will be useful in applications including milestoning and path sampling algorithms in molecular simulations.
RESUMO
Markov state models (MSMs) provide some of the simplest mathematical and physical descriptions of dynamical and thermodynamical properties of complex systems. However, typically, the large dimensionality of biological systems studied makes them prohibitively expensive to work in fully Markovian regimes. In this case, coarse graining can be introduced to capture the key dynamical processes-slow degrees of the system-and reduce the dimension of the problem. Here, we introduce several possible options for such Markovian coarse graining, including previously commonly used choices: the local equilibrium and the Hummer Szabo approaches. We prove that the coarse grained lower dimensional MSM satisfies a variational principle with respect to its slowest relaxation time scale. This provides an excellent framework for optimal coarse graining, as previously demonstrated. Here, we show that such optimal coarse graining to two or three states has a simple physical interpretation in terms of mean first passage times and fluxes between the coarse grained states. The results are verified numerically using both analytic test potentials and data from explicit solvent molecular dynamics simulations of pentalanine. This approach of optimizing and interpreting clustering protocols has broad applicability and can be used in time series analysis of large data.
RESUMO
Markov state models (MSMs) are more and more widely used in the analysis of molecular simulations to incorporate multiple trajectories together and obtain more accurate time scale information of the slowest processes in the system. Typically, however, multiple lagtimes are used and analyzed as input parameters, yet convergence with respect to the choice of lagtime is not always possible. Here, we present a simple method for calculating the slowest relaxation time (RT) of the system in the limit of very long lagtimes. Our approach relies on the fact that the second eigenvector's autocorrelation function of the propagator will be approximately single exponential at long lagtimes. This allows us to obtain a simple equation for the behavior of the MSM's relaxation time as a function of the lagtime with only two free parameters, one of these being the RT of the system. We demonstrate that the second parameter is a useful indicator of how Markovian a selected variable is for building the MSM. Fitting this function to data gives a limiting value for the optimal variational RT. Testing this on analytic and molecular dynamics data for Ala5 and umbrella sampling-biased ion channel simulations shows that the function accurately describes the behavior of the RT and furthermore that this RT can improve noticeably the value calculated at the longest accessible lagtime. We compare our RT limit to the hidden Markov model (HMM) approach that typically finds RTs of comparable values. However, HMMs cannot be used in conjunction with biased simulation data, requiring more complex algorithms to construct than MSMs, and the derived RTs are not variational, leading to ambiguity in the choice of lagtime at which to build the HMM.
RESUMO
We show how accurate rates of formation and dissociation of peptide dimers can be calculated using direct transition counting (DTC) from replica-exchange molecular dynamics (REMD) simulations. First, continuous trajectories corresponding to system replicas evolving at different temperatures are used to assign conformational states. Second, we analyze the entire REMD data to calculate the corresponding rates at each temperature directly from the number of transition counts. Finally, we compare the kinetics extracted directly, using the DTC method, with indirect estimations based on trajectory likelihood maximization using short-time propagators and on decay rates of state autocorrelation functions. For systems with relatively low-dimensional intrinsic conformational dynamics, the DTC method is simple to implement and leads to accurate temperature-dependent rates. We apply the DTC rate-extraction method to all-atom REMD simulations of dimerization of amyloid-forming NNQQ tetrapetides in explicit water. In an assessment of the REMD sampling efficiency with respect to standard MD, we find a gain of more than a factor of two at the lowest temperature.
Assuntos
Modelos Químicos , Oligopeptídeos/química , Proteínas Amiloidogênicas/química , Cinética , Simulação de Dinâmica Molecular , Multimerização ProteicaRESUMO
Understanding ligand dissociation mechanisms at an atomic resolution is a highly desired but difficult to achieve objective in experiments as well as in computer simulations. Structural details of the dissociation process are in general hard to capture in experiments, while the relevant time scales are far beyond molecular dynamics (MD) simulations even with the most powerful supercomputers. As such, many different specialized enhanced sampling methods have been proposed that make it possible to efficiently calculate the dissociation mechanisms in protein-ligand systems. However, accurate benchmarks against long unbiased MD simulations are either not reported yet or simply not feasible due to the extremely long time scales. In this manuscript, we consider one such recent method, "infrequent metadynamics", and benchmark in detail the thermodynamic and kinetic information obtained from this method against extensive unbiased MD simulations for the dissociation dynamics of two different millimolar fragments from the protein FKBP in explicit water with residence times in the nanoseconds to microseconds regime. We find that the metadynamics approach gives the same binding free energy profile, dissociation pathway, and ligand residence time as the unbiased MD, albeit using only 6-50 times lower computational resources. Furthermore, we demonstrate how the metadynamics approach can self-consistently be used to ascertain whether the reweighted kinetic constants are reliable or not. We thus conclude that the answer to the question posed in the title of this manuscript is, statistically speaking, yes.
Assuntos
Simulação de Dinâmica Molecular , Preparações Farmacêuticas/química , Termodinâmica , Cinética , LigantesRESUMO
We present a simple approach to calculate the kinetic properties of lipid membrane crossing processes from biased molecular dynamics simulations. We demonstrate that by using biased simulations, one can obtain highly accurate kinetic information with significantly reduced computational time with respect to unbiased simulations. We describe how to conveniently calculate the transition rates to enter, cross, and exit the membrane in terms of the mean first passage times. To obtain free energy barriers and relaxation times from biased simulations only, we constructed Markov models using the dynamic histogram analysis method (DHAM). The permeability coefficients that are calculated from the relaxation times are found to correlate highly with experimentally evaluated values. We show that more generally, certain calculated kinetic properties linked to the crossing of the membrane layer (e.g., barrier height and barrier crossing rates) are good indicators of ordering drugs by permeability. Extending the analysis to a 2D Markov model provides a physical description of the membrane crossing mechanism.
Assuntos
Permeabilidade da Membrana Celular/efeitos dos fármacos , Simulação de Dinâmica Molecular , Clorpromazina/química , Clorpromazina/farmacologia , Desipramina/química , Desipramina/farmacologia , Domperidona/química , Domperidona/farmacologia , Cinética , Labetalol/química , Labetalol/farmacologia , Bicamadas Lipídicas/química , Loperamida/química , Loperamida/farmacologia , Estrutura Molecular , Propranolol/química , Propranolol/farmacologia , Termodinâmica , Verapamil/química , Verapamil/farmacologiaRESUMO
We present an algorithm to calculate free energies and rates from molecular simulations on biased potential energy surfaces. As input, it uses the accumulated times spent in each state or bin of a histogram and counts of transitions between them. Optimal unbiased equilibrium free energies for each of the states/bins are then obtained by maximizing the likelihood of a master equation (i.e., first-order kinetic rate model). The resulting free energies also determine the optimal rate coefficients for transitions between the states or bins on the biased potentials. Unbiased rates can be estimated, e.g., by imposing a linear free energy condition in the likelihood maximization. The resulting "dynamic histogram analysis method extended to detailed balance" (DHAMed) builds on the DHAM method. It is also closely related to the transition-based reweighting analysis method (TRAM) and the discrete TRAM (dTRAM). However, in the continuous-time formulation of DHAMed, the detailed balance constraints are more easily accounted for, resulting in compact expressions amenable to efficient numerical treatment. DHAMed produces accurate free energies in cases where the common weighted-histogram analysis method (WHAM) for umbrella sampling fails because of slow dynamics within the windows. Even in the limit of completely uncorrelated data, where WHAM is optimal in the maximum-likelihood sense, DHAMed results are nearly indistinguishable. We illustrate DHAMed with applications to ion channel conduction, RNA duplex formation, α-helix folding, and rate calculations from accelerated molecular dynamics. DHAMed can also be used to construct Markov state models from biased or replica-exchange molecular dynamics simulations. By using binless WHAM formulated as a numerical minimization problem, the bias factors for the individual states can be determined efficiently in a preprocessing step and, if needed, optimized globally afterward.