1.
J Chem Inf Model ; 64(4): 1123-1133, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38335055

ABSTRACT

DNA-encoded library (DEL) has proven to be a powerful tool that utilizes combinatorially constructed small molecules to facilitate highly efficient screening experiments. These selection experiments, involving multiple stages of washing, elution, and identification of potent binders via unique DNA barcodes, often generate complex data. This complexity can potentially mask the underlying signals, necessitating the application of computational tools, such as machine learning, to uncover valuable insights. We introduce a compositional deep probabilistic model of DEL data, DEL-Compose, which decomposes molecular representations into their monosynthon, disynthon, and trisynthon building blocks and capitalizes on the inherent hierarchical structure of these molecules by modeling latent reactions between embedded synthons. Additionally, we investigate methods to improve the observation models for DEL count data, such as integrating covariate factors to more effectively account for data noise. Across two popular public benchmark data sets (CA-IX and HRP), our model demonstrates strong performance compared to count baselines, enriches the correct pharmacophores, and offers valuable insights via its intrinsic interpretable structure, thereby providing a robust tool for the analysis of DEL data.


Subject(s)
DNA; Small Molecule Libraries; Small Molecule Libraries/chemistry; DNA/chemistry; Models, Statistical; Gene Library
2.
J Chem Inf Model ; 63(9): 2719-2727, 2023 May 08.
Article in English | MEDLINE | ID: mdl-37079427

ABSTRACT

DNA-encoded library (DEL) technology has enabled significant advances in hit identification by enabling efficient testing of combinatorially generated molecular libraries. DEL screens measure protein binding affinity through sequencing reads of molecules tagged with unique DNA barcodes that survive a series of selection experiments. Computational models have been deployed to learn the latent binding affinities that are correlated to the sequenced count data; however, this correlation is often obfuscated by various sources of noise introduced in its complicated data-generation process. In order to denoise DEL count data and screen for molecules with good binding affinity, computational models require the correct assumptions in their modeling structure to capture the correct signals underlying the data. Recent advances in DEL models have focused on probabilistic formulations of count data, but existing approaches have thus far been limited to only utilizing 2-D molecule-level representations. We introduce a new paradigm, DEL-Dock, that combines ligand-based descriptors with 3-D spatial information from docked protein-ligand complexes. 3-D spatial information allows our model to learn over the actual binding modality rather than using only structure-based information of the ligand. We show that our model is capable of effectively denoising DEL count data to predict molecule enrichment scores that are better correlated with experimental binding affinity measurements compared to prior works. Moreover, by learning over a collection of docked poses we demonstrate that our model, trained only on DEL data, implicitly learns to perform good docking pose selection without requiring external supervision from expensive-to-source protein crystal structures.


Subject(s)
DNA; Proteins; Molecular Docking Simulation; Ligands; Models, Molecular; Proteins/chemistry; DNA/chemistry; Protein Binding
3.
J Chem Phys ; 149(9): 094106, 2018 Sep 07.
Article in English | MEDLINE | ID: mdl-30195289

ABSTRACT

Selection of appropriate collective variables (CVs) for enhancing sampling of molecular simulations remains an unsolved problem in computational modeling. Picking initial CVs is particularly challenging in higher dimensions. Which atomic coordinates or transforms thereof from a list of thousands should one pick for enhanced sampling runs? How does a modeler even begin to pick starting coordinates for investigation? This remains true even in the case of simple two-state systems and only increases in difficulty for multi-state systems. In this work, we solve the "initial" CV problem using a data-driven approach inspired by the field of supervised machine learning (SML). In particular, we show how the decision functions in SML algorithms can be used as initial CVs (SMLcv) for accelerated sampling. Using solvated alanine dipeptide and Chignolin mini-protein as our test cases, we illustrate how the distance to the support vector machines' decision hyperplane, the output probability estimates from logistic regression, the outputs from shallow or deep neural network classifiers, and other classifiers may be used to reversibly sample slow structural transitions. We discuss the utility of other SML algorithms that might be useful for identifying CVs for accelerating molecular simulations.
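
As a rough illustration of the idea in this abstract, the sketch below evaluates two classifier outputs as candidate collective variables: the signed distance to a linear decision hyperplane and a logistic-regression probability. The weights `w`, the bias `b`, and the feature values are hypothetical placeholders, not values from the paper; in practice they would come from a classifier fitted on labeled frames of the two states.

```python
import math

def signed_distance(x, w, b):
    """Signed distance from feature vector x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def logistic_probability(x, w, b):
    """Logistic-regression probability of the second state for feature vector x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted parameters over two features (assumed, for illustration only).
w = [3.0, 4.0]
b = -5.0

frame_a = [0.0, 0.0]  # hypothetical frame on the first-state side
frame_b = [3.0, 4.0]  # hypothetical frame on the second-state side

cv_a = signed_distance(frame_a, w, b)
cv_b = signed_distance(frame_b, w, b)
p_boundary = logistic_probability([1.0, 0.5], w, b)  # point lying on the hyperplane
```

Both quantities are smooth functions of the input features, which is what would make them usable as biasing coordinates; the choice of features and classifier is left open here.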

4.
Nat Chem ; 10(9): 903-909, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29988151

ABSTRACT

Kinases are ubiquitous enzymes involved in the regulation of critical cellular pathways. However, in silico modelling of the conformational ensembles of these enzymes is difficult due to inherent limitations and the cost of computational approaches. Recent algorithmic advances combined with homology modelling and parallel simulations have enabled researchers to address this computational sampling bottleneck. Here, we present the results of molecular dynamics studies for seven Src family kinase (SFK) members: Fyn, Lyn, Lck, Hck, Fgr, Yes and Blk. We present a sequence invariant extension to Markov state models, which allows us to quantitatively compare the structural ensembles of the seven kinases. Our findings indicate that in the absence of their regulatory partners, SFK members have similar in silico dynamics with active state populations ranging from 4 to 40% and activation timescales in the hundreds of microseconds. Furthermore, we observe several potentially druggable intermediate states, including a pocket next to the adenosine triphosphate binding site that could potentially be targeted via a small-molecule inhibitor.


Subject(s)
Models, Biological; src-Family Kinases/metabolism; Adenosine Triphosphate/chemistry; Adenosine Triphosphate/metabolism; Amino Acid Motifs; Binding Sites; Kinetics; Markov Chains; Molecular Dynamics Simulation; Protein Kinase Inhibitors/chemistry; Protein Kinase Inhibitors/metabolism; Protein Structure, Tertiary; Small Molecule Libraries/chemistry; Small Molecule Libraries/metabolism; src-Family Kinases/antagonists & inhibitors
5.
Phys Rev E ; 97(6-1): 062412, 2018 Jun.
Article in English | MEDLINE | ID: mdl-30011547

ABSTRACT

Often the analysis of time-dependent chemical and biophysical systems produces high-dimensional time-series data for which it can be difficult to interpret which individual features are most salient. While recent work from our group and others has demonstrated the utility of time-lagged covariate models to study such systems, linearity assumptions can limit the compression of inherently nonlinear dynamics into just a few characteristic components. Recent work in the field of deep learning has led to the development of the variational autoencoder (VAE), which is able to compress complex datasets into simpler manifolds. We present the use of a time-lagged VAE, or variational dynamics encoder (VDE), to reduce complex, nonlinear processes to a single embedding with high fidelity to the underlying dynamics. We demonstrate how the VDE is able to capture nontrivial dynamics in a variety of examples, including Brownian dynamics and atomistic protein folding. Additionally, we demonstrate a method for analyzing the VDE model, inspired by saliency mapping, to determine what features are selected by the VDE model to describe dynamics. The VDE presents an important step in applying techniques from deep learning to more accurately model and interpret complex biophysics.

6.
J Chem Theory Comput ; 14(4): 1887-1894, 2018 Apr 10.
Article in English | MEDLINE | ID: mdl-29529369

ABSTRACT

Variational autoencoder frameworks have demonstrated success in reducing complex nonlinear dynamics in molecular simulation to a single nonlinear embedding. In this work, we illustrate how this nonlinear latent embedding can be used as a collective variable for enhanced sampling and present a simple modification that allows us to rapidly perform sampling in multiple related systems. We first demonstrate our method is able to describe the effects of force field changes in capped alanine dipeptide after learning about a model using AMBER99. We further provide a simple extension to variational dynamics encoders that allows the model to be trained in a more efficient manner on larger systems by encoding the outputs of a linear transformation using time-structure based independent component analysis (tICA). Using this technique, we show how such a model trained for one protein, the WW domain, can efficiently be transferred to perform enhanced sampling on a related mutant protein, the GTT mutation. This method shows promise for its ability to rapidly sample related systems using a single transferable collective variable, enabling us to probe the effects of variation in increasingly large systems of biophysical interest.


Subject(s)
Molecular Dynamics Simulation; Proteins/chemistry; Alanine/chemistry; Dipeptides/chemistry
7.
J Chem Theory Comput ; 14(2): 1071-1082, 2018 Feb 13.
Article in English | MEDLINE | ID: mdl-29253336

ABSTRACT

Markov state models (MSMs) are a powerful framework for the analysis of molecular dynamics data sets, such as protein folding simulations, because of their straightforward construction and statistical rigor. The coarse-graining of MSMs into an interpretable number of macrostates is a crucial step for connecting theoretical results with experimental observables. Here we present the minimum variance clustering approach (MVCA) for the coarse-graining of MSMs into macrostate models. The method utilizes agglomerative clustering with Ward's minimum variance objective function, and the similarity of the microstate dynamics is determined using the Jensen-Shannon divergence between the corresponding rows in the MSM transition probability matrix. We first show that MVCA produces intuitive results for a simple tripeptide system and is robust toward long-duration statistical artifacts. MVCA is then applied to two protein folding simulations of the same protein in different force fields to demonstrate that a different number of macrostates is appropriate for each model, revealing a misfolded state present in only one of the simulations. Finally, we show that the same method can be used to analyze a data set containing many MSMs from simulations in different force fields by aggregating them into groups and quantifying their dynamical similarity in the context of force field parameter choices. The minimum variance clustering approach with the Jensen-Shannon divergence provides a powerful tool to group dynamics by similarity, both among model states and among dynamical models themselves.


Subject(s)
Markov Chains; Molecular Dynamics Simulation; Proteins/chemistry; Algorithms; Protein Folding
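
The dissimilarity measure named in this abstract, the Jensen-Shannon divergence between rows of an MSM transition probability matrix, can be sketched in a few lines. The 3-state matrix `T` below is a made-up example, and the helper names are illustrative; the subsequent agglomerative step with Ward's objective is not shown and would normally be delegated to a clustering library.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; zero-probability terms of p are skipped."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint distribution."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def pairwise_js(T):
    """Matrix of JS divergences between every pair of rows of transition matrix T."""
    n = len(T)
    return [[js_divergence(T[i], T[j]) for j in range(n)] for i in range(n)]

# Hypothetical row-stochastic transition matrix: states 0 and 1 behave alike, state 2 does not.
T = [
    [0.90, 0.10, 0.00],
    [0.88, 0.12, 0.00],
    [0.00, 0.05, 0.95],
]
D = pairwise_js(T)
```

In this toy case `D[0][1]` is much smaller than `D[0][2]`, so a variance-minimizing merge would be expected to group states 0 and 1 first. The divergence is symmetric and bounded by ln 2, which makes it convenient as a pairwise dissimilarity.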
8.
J Phys Chem B ; 122(21): 5291-5299, 2018 May 31.
Article in English | MEDLINE | ID: mdl-28938073

ABSTRACT

We recently showed that the time-structure-based independent component analysis method from Markov state model literature provided a set of variationally optimal slow collective variables for metadynamics (tICA-metadynamics). In this paper, we extend the methodology toward efficient sampling of related mutants by borrowing ideas from transfer learning methods in machine learning. Our method explicitly assumes that a similar set of slow modes and metastable states is found in both the wild type (baseline) and its mutants. Under this assumption, we describe a few simple techniques using sequence mapping for transferring the slow modes and structural information contained in the wild type simulation to a mutant model for performing enhanced sampling. The resulting simulations can then be reweighted onto the full-phase space using the multistate Bennett acceptance ratio, allowing for thermodynamic comparison against the wild type. We first benchmark our methodology by recapturing alanine dipeptide dynamics across a range of different atomistic force fields, including the polarizable Amoeba force field, after learning a set of slow modes using Amber ff99sb-ILDN. We next extend the method by including structural data from the wild type simulation and apply the technique to recapturing the effects of the GTT mutation on the FIP35 WW domain.

9.
Sci Rep ; 7(1): 15604, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-29142210

ABSTRACT

Bruton tyrosine kinase (BTK) is a key enzyme in B-cell development whose improper regulation causes severe immunodeficiency diseases. Design of selective BTK therapeutics would benefit from improved in silico structural modeling of the kinase's solution ensemble. However, this remains challenging due to the immense computational cost of sampling events on biological timescales. In this work, we combine multi-millisecond molecular dynamics (MD) simulations with Markov state models (MSMs) to report on the thermodynamics, kinetics, and accessible states of BTK's kinase domain. Our conformational landscape links the active state to several inactive states, connected via a structurally diverse intermediate. Our calculations predict a kinome-wide conformational plasticity, and indicate the presence of several new potentially druggable BTK states. We further find that the population of these states and the kinetics of their inter-conversion are modulated by protonation of an aspartate residue, establishing the power of MD and MSMs in predicting effects of chemical perturbations.


Subject(s)
Agammaglobulinaemia Tyrosine Kinase/chemistry; B-Lymphocytes/enzymology; Molecular Dynamics Simulation; Protein Conformation; Agammaglobulinaemia Tyrosine Kinase/antagonists & inhibitors; B-Lymphocytes/chemistry; Computer Simulation; Humans; Kinetics; Markov Chains; Thermodynamics
10.
Biophys J ; 112(1): 10-15, 2017 Jan 10.
Article in English | MEDLINE | ID: mdl-28076801

ABSTRACT

MSMBuilder is a software package for building statistical models of high-dimensional time-series data. It is designed with a particular focus on the analysis of atomistic simulations of biomolecular dynamics such as protein folding and conformational change. MSMBuilder is named for its ability to construct Markov state models (MSMs), a class of models that has gained favor among computational biophysicists. In addition to both well-established and newer MSM methods, the package includes complementary algorithms for understanding time-series data such as hidden Markov models and time-structure based independent component analysis. MSMBuilder boasts an easy-to-use command-line interface, as well as clear and consistent abstractions through its Python application programming interface. MSMBuilder was developed with careful consideration for compatibility with the broader machine learning community by following the design of scikit-learn. The package is used primarily by practitioners of molecular dynamics, but is just as applicable to other computational or experimental time-series measurements.


Subject(s)
Models, Statistical; Molecular Dynamics Simulation; Software; CSK Tyrosine-Protein Kinase; Markov Chains; Protein Conformation; src-Family Kinases/chemistry; src-Family Kinases/metabolism
11.
J Chem Phys ; 145(19): 194103, 2016 Nov 21.
Article in English | MEDLINE | ID: mdl-27875868

ABSTRACT

As molecular dynamics simulations access increasingly longer time scales, complementary advances in the analysis of biomolecular time-series data are necessary. Markov state models offer a powerful framework for this analysis by describing a system's states and the transitions between them. A recently established variational theorem for Markov state models now enables modelers to systematically determine the best way to describe a system's dynamics. In the context of the variational theorem, we analyze ultra-long folding simulations for a canonical set of twelve proteins [K. Lindorff-Larsen et al., Science 334, 517 (2011)] by creating and evaluating many types of Markov state models. We present a set of guidelines for constructing Markov state models of protein folding; namely, we recommend the use of cross-validation and a kinetically motivated dimensionality reduction step for improved descriptions of folding dynamics. We also warn that precise kinetics predictions rely on the features chosen to describe the system and pose the description of kinetic uncertainty across ensembles of models as an open issue.


Subject(s)
Markov Chains; Molecular Dynamics Simulation; Protein Folding; Algorithms; Kinetics; Protein Domains
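
For readers unfamiliar with how a Markov state model describes "states and the transitions between them", the following minimal sketch counts transitions in a discretized trajectory at a chosen lag time and row-normalizes the counts. The trajectory, the state count, and the function names are invented for illustration; this is the simplest non-reversible estimator and does not reproduce the cross-validation or dimensionality-reduction steps the abstract recommends.

```python
def count_transitions(traj, n_states, lag=1):
    """Count observed i -> j transitions between frames separated by `lag` steps (lag >= 1)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(traj[:-lag], traj[lag:]):
        counts[i][j] += 1
    return counts

def row_normalize(counts):
    """Turn a count matrix into transition probabilities; unvisited rows stay all zero."""
    T = []
    for row in counts:
        total = sum(row)
        T.append([c / total if total else 0.0 for c in row])
    return T

# Hypothetical state-assignment trajectory over two states.
traj = [0, 0, 1, 1, 0, 0, 0, 1, 1, 1]
counts = count_transitions(traj, n_states=2, lag=1)
T = row_normalize(counts)
```

For this toy trajectory the counts are `[[3, 2], [1, 3]]`, giving transition probabilities `[[0.6, 0.4], [0.25, 0.75]]`. Real analyses typically add reversibility constraints and validate the lag time, which this sketch leaves out.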
12.
J Chem Theory Comput ; 10(12): 5217-5223, 2014 Dec 09.
Article in English | MEDLINE | ID: mdl-25516725

ABSTRACT

Given the large number of crystal structures and NMR ensembles that have been solved to date, classical molecular dynamics (MD) simulations have become powerful tools in the atomistic study of the kinetics and thermodynamics of biomolecular systems on ever increasing time scales. By virtue of the high-dimensional conformational state space that is explored, the interpretation of large-scale simulations faces difficulties not unlike those in the big data community. We address this challenge by introducing a method called clustering based feature selection (CB-FS) that employs a posterior analysis approach. It combines supervised machine learning (SML) and feature selection with Markov state models to automatically identify the relevant degrees of freedom that separate conformational states. We highlight the utility of the method in the evaluation of large-scale simulations and show that it can be used for the rapid and automated identification of relevant order parameters involved in the functional transitions of two exemplary cell-signaling proteins central to human disease states.

13.
J Phys Chem B ; 117(20): 6217-26, 2013 May 23.
Article in English | MEDLINE | ID: mdl-23570540

ABSTRACT

The influence of electrostatic interactions on the free energy of proton-coupled electron transfer in biomimetic oxomanganese complexes inspired by the oxygen-evolving complex (OEC) of photosystem II (PSII) is investigated. The reported study introduces an enhanced multiconformer continuum electrostatics (MCCE) model, parametrized at the density functional theory (DFT) level with a classical valence model for the oxomanganese core. The calculated pKa's and oxidation midpoint potentials (E(m)'s) match experimental values for eight complexes, indicating that purely electrostatic contributions account for most of the observed couplings between deprotonation and oxidation state transitions. We focus on pKa's of terminal water ligands in [Mn(II/III)(H2O)6](2+/3+) (1), [Mn(III)(P)(H2O)2](3-) (2, P = 5,10,15,20-tetrakis(2,6-dichloro-3-sulfonatophenyl)porphyrinato), [Mn2(IV,IV)(µ-O)2(terpy)2(H2O)2](4+) (3, terpy = 2,2':6',2″-terpyridine), and [Mn3(IV,IV,IV)(µ-O)4(phen)4(H2O)2](4+) (4, phen = 1,10-phenanthroline) and the pKa's of µ-oxo bridges and Mn E(m)'s in [Mn2(µ-O)2(bpy)4] (5, bpy = 2,2'-bipyridyl), [Mn2(µ-O)2(salpn)2] (6, salpn = N,N'-bis(salicylidene)-1,3-propanediamine), [Mn2(µ-O)2(3,5-di(Cl)-salpn)2] (7), and [Mn2(µ-O)2(3,5-di(NO2)-salpn)2] (8). The analysis of complexes 6-8 highlights the strong coupling between electron and proton transfers, with any Mn oxidation lowering the pKa of an oxo bridge by 10.5 ± 0.9 pH units. The model also accounts for changes in the E(m)'s by ligand substituents, such as found in complexes 6-8, due to the electron withdrawing Cl (7) and NO2 (8). The reported study provides the foundation for analysis of electrostatic effects in other oxomanganese complexes and metalloenzymes, where proton-coupled electron transfer plays a fundamental role in redox-leveling mechanisms.


Subject(s)
Biomimetic Materials/chemistry; Manganese/chemistry; Oxygen/chemistry; Oxygen/metabolism; Photosystem II Protein Complex/metabolism; Protons; Static Electricity; Electron Transport; Ligands; Models, Molecular; Molecular Conformation; Organometallic Compounds/chemistry; Oxidation-Reduction; Quantum Theory; Solvents/chemistry; Thermodynamics; Water/chemistry
14.
Proc Natl Acad Sci U S A ; 109(22): E1428-36, 2012 May 29.
Article in English | MEDLINE | ID: mdl-22586084

ABSTRACT

Protein allosteric pathways are investigated in the imidazole glycerol phosphate synthase heterodimer in an effort to elucidate how the effector (PRFAR, N'-[(5'-phosphoribulosyl)formimino]-5-aminoimidazole-4-carboxamide ribonucleotide) activates glutaminase catalysis at a distance of 25 Å from the glutamine-binding site. We apply solution NMR techniques and community analysis of dynamical networks, based on mutual information of correlated protein motions in the active and inactive enzymes. We find evidence that the allosteric pathways in the PRFAR bound enzyme involve conserved residues that correlate motion of the PRFAR binding loop to motion at the protein-protein interface, and ultimately at the glutaminase active site. The imidazole glycerol phosphate synthase bienzyme is an important branch point for the histidine and nucleotide biosynthetic pathways and represents a potential therapeutic target against microbes. The proposed allosteric mechanism and the underlying allosteric pathways provide fundamental insights for the design of new allosteric drugs and/or alternative herbicides.


Subject(s)
Allosteric Regulation; Aminohydrolases/chemistry; Bacterial Proteins/chemistry; Thermotoga maritima/enzymology; Algorithms; Allosteric Site; Aminohydrolases/metabolism; Bacterial Proteins/metabolism; Binding Sites; Biocatalysis; Crystallography, X-Ray; Imidazoles/chemistry; Imidazoles/metabolism; Kinetics; Models, Molecular; Molecular Dynamics Simulation; Protein Binding; Protein Conformation; Protein Multimerization; Protein Structure, Tertiary; Protein Subunits/chemistry; Protein Subunits/metabolism; Ribonucleotides/chemistry; Ribonucleotides/metabolism; Signal Transduction; Thermotoga maritima/metabolism