ABSTRACT
PlayMolecule Viewer is a web-based data visualization toolkit designed to streamline the exploration of data resulting from structural bioinformatics or computer-aided drug design efforts. By harnessing state-of-the-art web technologies such as WebAssembly, PlayMolecule Viewer integrates powerful Python libraries directly within the browser environment, which enhances its capabilities to manage multiple types of molecular data. With its intuitive interface, it allows users to easily upload, visualize, select, and manipulate molecular structures and associated data. The toolkit supports a wide range of common structural file formats and offers a variety of molecular representations to cater to different visualization needs. PlayMolecule Viewer is freely accessible at open.playmolecule.org, ensuring accessibility and availability to the scientific community and beyond.
Subject(s)
Computational Biology; Software; Molecular Structure
ABSTRACT
This letter reports results on improving protein-ligand binding affinity predictions from molecular dynamics simulations by using machine learning potentials within a hybrid neural network potential and molecular mechanics (NNP/MM) methodology. We compute relative binding free energies with the Alchemical Transfer Method, validate its performance against established benchmarks, and find significant improvements over conventional MM force fields such as GAFF2.
Subject(s)
Molecular Dynamics Simulation; Proteins; Ligands; Thermodynamics; Proteins/chemistry; Protein Binding; Neural Networks, Computer
ABSTRACT
In recent years, reinforcement learning (RL) has emerged as a valuable tool in drug design, offering the potential to propose and optimize molecules with desired properties. However, striking a balance between capabilities, flexibility, reliability, and efficiency remains challenging due to the complexity of advanced RL algorithms and the significant reliance on specialized code. In this work, we introduce ACEGEN, a comprehensive and streamlined toolkit tailored for generative drug design, built using TorchRL, a modern RL library that offers thoroughly tested reusable components. We validate ACEGEN by benchmarking against other published generative modeling algorithms and show comparable or improved performance. We also show examples of ACEGEN applied in multiple drug discovery case studies. ACEGEN is accessible at https://github.com/acellera/acegen-open and available for use under the MIT license.
Subject(s)
Drug Discovery; Drug Discovery/methods; Machine Learning; Algorithms; Software; Drug Design
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
G-protein-coupled receptors (GPCRs) are involved in numerous physiological processes and are the most frequent targets of approved drugs. The explosion in the number of new three-dimensional (3D) molecular structures of GPCRs (3D-GPCRome) over the last decade has greatly advanced the mechanistic understanding and drug design opportunities for this protein family. Molecular dynamics (MD) simulations have become a widely established technique for exploring the conformational landscape of proteins at an atomic level. However, the analysis and visualization of MD simulations require efficient storage resources and specialized software. Here we present GPCRmd (http://gpcrmd.org/), an online platform that incorporates web-based visualization capabilities as well as a comprehensive and user-friendly analysis toolbox that allows scientists from different disciplines to visualize, analyze and share GPCR MD data. GPCRmd originates from a community-driven effort to create an open, interactive and standardized database of GPCR MD simulations.
Subject(s)
Molecular Dynamics Simulation; Receptors, G-Protein-Coupled/chemistry; Software; Metabolome; Models, Molecular; Protein Conformation
ABSTRACT
Predictive modeling of toxicity is a crucial step in the drug discovery pipeline. It can help filter out molecules with a high probability of failing in the early stages of de novo drug design. Thus, several machine learning (ML) models have been developed to predict the toxicity of molecules by combining classical ML techniques or deep neural networks with well-known molecular representations such as fingerprints or 2D graphs. However, the more natural, accurate representation of molecules is expected to be defined in physical 3D space, as in ab initio methods. Recent studies successfully used equivariant graph neural networks (EGNNs) for representation learning based on 3D structures to predict quantum-mechanical properties of molecules. Inspired by this, we investigated the performance of EGNNs to construct reliable ML models for toxicity prediction. We used the equivariant transformer (ET) model in TorchMD-NET for this. Eleven toxicity data sets taken from MoleculeNet, TDCommons, and ToxBenchmark have been considered to evaluate the capability of ET for toxicity prediction. Our results show that ET adequately learns 3D representations of molecules that can successfully correlate with toxicity activity, achieving good accuracies on most data sets comparable to state-of-the-art models. We also test a physicochemical property, namely, the total energy of a molecule, to inform the toxicity prediction with a physical prior. However, our work suggests that these two properties cannot be related. We also provide an attention weight analysis to help understand the toxicity prediction in 3D space and thus increase the explainability of the ML model. In summary, our findings offer promising insights into the use of 3D geometry information via EGNNs and provide a straightforward way to integrate molecular conformers into ML-based pipelines for predicting and investigating toxicity in physical space.
We expect that in the future, especially for larger, more diverse data sets, EGNNs will be an essential tool in this domain.
ABSTRACT
The accurate prediction of protein-ligand binding affinities is crucial for drug discovery. Alchemical free energy calculations have become a popular tool for this purpose. However, the accuracy and reliability of these methods can vary depending on the methodology. In this study, we evaluate the performance of a relative binding free energy protocol based on the alchemical transfer method (ATM), a novel approach based on a coordinate transformation that swaps the positions of two ligands. The results show that ATM matches the performance of more complex free energy perturbation (FEP) methods in terms of Pearson correlation but with marginally higher mean absolute errors. This study shows that the ATM method is competitive compared to more traditional methods in speed and accuracy and offers the advantage of being applicable with any potential energy function.
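The two metrics named in this abstract, Pearson correlation and mean absolute error, can be illustrated with a minimal sketch. The numbers below are hypothetical relative binding free energies in kcal/mol, made up for illustration only; they are not taken from the study, and the function names are assumptions of this note.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def mean_absolute_error(x, y):
    """Average absolute difference between paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# Hypothetical relative binding free energies (kcal/mol), illustration only.
experimental = [-1.2, 0.4, 0.9, -0.3, 1.5]
predicted = [-0.8, 0.1, 1.4, -0.9, 1.1]

print(f"Pearson r: {pearson_r(experimental, predicted):.2f}")
print(f"MAE (kcal/mol): {mean_absolute_error(experimental, predicted):.2f}")
```

A method can rank ligands well (high Pearson correlation) while still carrying a systematic offset in absolute values (higher mean absolute error), which is why the two metrics are reported separately.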
Subject(s)
Molecular Dynamics Simulation; Thermodynamics; Reproducibility of Results; Entropy; Protein Binding; Ligands
ABSTRACT
Machine learning potentials have emerged as a means to enhance the accuracy of biomolecular simulations. However, their application is constrained by the significant computational cost arising from the vast number of parameters compared with traditional molecular mechanics. To tackle this issue, we introduce an optimized implementation of the hybrid method (NNP/MM), which combines a neural network potential (NNP) and molecular mechanics (MM). This approach models a portion of the system, such as a small molecule, using NNP while employing MM for the remaining system to boost efficiency. By conducting molecular dynamics (MD) simulations on various protein-ligand complexes and metadynamics (MTD) simulations on a ligand, we showcase the capabilities of our implementation of NNP/MM. It has enabled us to increase the simulation speed by ~5 times and achieve a combined sampling of 1 µs for each complex, marking the longest simulations ever reported for this class of simulations.
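The partitioning described in this abstract is commonly written as a sum of three energy terms. The form below is a sketch under the assumption of a mechanical-embedding scheme, in which the coupling between the two regions is evaluated with MM non-bonded terms; the symbols are introduced here for illustration and do not appear in the abstract.

```latex
V(\mathbf{r}) = V_{\mathrm{NNP}}(\mathbf{r}_{\mathrm{NNP}})
              + V_{\mathrm{MM}}(\mathbf{r}_{\mathrm{MM}})
              + V_{\mathrm{NNP\text{-}MM}}(\mathbf{r})
```

Here $\mathbf{r}_{\mathrm{NNP}}$ denotes the coordinates of the region treated with the neural network potential (for example, the ligand), $\mathbf{r}_{\mathrm{MM}}$ the coordinates of the remaining system, and $V_{\mathrm{NNP\text{-}MM}}$ the interaction between the two regions.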
Subject(s)
Molecular Dynamics Simulation; Neural Networks, Computer; Ligands; Machine Learning
ABSTRACT
Deep learning has been successfully applied to structure-based protein-ligand affinity prediction, yet the black box nature of these models raises some questions. In a previous study, we presented KDEEP, a convolutional neural network that predicted the binding affinity of a given protein-ligand complex while reaching state-of-the-art performance. However, it was unclear what this model was learning. In this work, we present a new application to visualize the contribution of each input atom to the prediction made by the convolutional neural network, aiding in the interpretability of such predictions. The results suggest that KDEEP is able to learn meaningful chemistry signals from the data, but it has also exposed the inaccuracies of the current model, serving as a guideline for further optimization of our prediction tools.
Subject(s)
Neural Networks, Computer; Proteins; Ligands; Proteins/chemistry
ABSTRACT
SUMMARY: Virtual screening pipelines are among the most widely used tools in structure-based drug discovery, since they can reduce both the time and cost associated with experimental assays. Recent advances in deep learning methodologies have shown that these outperform classical scoring functions at discriminating binder protein-ligand complexes. Here, we present BindScope, a web application for large-scale active-inactive classification of compounds based on deep convolutional neural networks. Performance is on par with current state-of-the-art pipelines. Users can screen on the order of hundreds of compounds at once and interactively visualize the results. AVAILABILITY AND IMPLEMENTATION: BindScope is available as part of the PlayMolecule.org web application suite. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Drug Discovery; Internet; Deep Learning; Drug Discovery/methods; Ligands; Neural Networks, Computer
ABSTRACT
Motivation: Structure-based drug discovery methods exploit protein structural information to design small molecules binding to given protein pockets. This work proposes a purely data driven, structure-based approach for imaging ligands as spatial fields in target protein pockets. We use an end-to-end deep learning framework trained on experimental protein-ligand complexes with the intention of mimicking a chemist's intuition at manually placing atoms when designing a new compound. We show that these models can generate spatial images of ligand chemical properties like occupancy, aromaticity and donor-acceptor matching the protein pocket. Results: The predicted fields considerably overlap with those of unseen ligands bound to the target pocket. Maximization of the overlap between the predicted fields and a given ligand on the Astex diverse set recovers the original ligand crystal poses in 70 out of 85 cases within a threshold of 2 Å RMSD. We expect that these models can be used for guiding structure-based drug discovery approaches. Availability and implementation: LigVoxel is available as part of the PlayMolecule.org molecular web application suite. Supplementary information: Supplementary data are available at Bioinformatics online.
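The pose-recovery criterion quoted in this abstract (RMSD within 2 Å of the crystal pose) can be sketched as follows. This is a minimal illustration, assuming matched atom ordering and no superposition or symmetry correction; the coordinates and function names are hypothetical and are not part of LigVoxel.

```python
import math

def rmsd(pose_a, pose_b):
    """In-place RMSD (same units as the input) between two poses.

    Assumes both poses list the same atoms in the same order and
    performs no alignment or symmetry correction.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))

def success_rate(rmsd_values, threshold=2.0):
    """Fraction of poses whose RMSD is within the threshold (in Å)."""
    return sum(1 for value in rmsd_values if value <= threshold) / len(rmsd_values)

# Hypothetical three-atom ligand: the predicted pose is shifted by 1 Å along z.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted = [(x, y, z + 1.0) for x, y, z in crystal]

print(f"RMSD: {rmsd(crystal, predicted):.2f} Å")
print(f"Success rate: {success_rate([0.5, 1.9, 2.0, 3.2]):.2f}")
```

By this criterion, the reported 70 out of 85 recovered poses corresponds to a success rate of roughly 82%.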
Subject(s)
Drug Discovery; Neural Networks, Computer; Proteins/chemistry; Software; Binding Sites; Computational Biology; Ligands; Protein Binding; Protein Conformation
ABSTRACT
The extreme dynamic behavior of intrinsically disordered proteins hinders the development of drug-like compounds capable of modulating them. There are several examples of small molecules that specifically interact with disordered peptides. However, their mechanisms of action are still not well understood. Here, we use extensive molecular dynamics simulations combined with adaptive sampling algorithms to perform free ligand binding studies in the context of intrinsically disordered proteins. We tested this approach in the system composed of the D2 sub-domain of the disordered protein p27 and the small molecule SJ403. The results show several protein-ligand bound states characterized by the establishment of a loosely oriented interaction mediated by a limited number of contacts between the ligand and critical residues of p27. Finally, protein conformations in the bound state are likely to be explored by the isolated protein too, therefore supporting a model where the addition of the small molecule restricts the available conformational space.
Subject(s)
Intrinsically Disordered Proteins; Ligands; Molecular Dynamics Simulation; Peptides; Protein Conformation
ABSTRACT
Cryptic pockets are protein cavities that remain hidden in resolved apo structures and generally require the presence of a co-crystallized ligand to become visible. Finding new cryptic pockets is crucial for structure-based drug discovery to identify new ways of modulating protein activity and thus expand the druggable space. We present here a new method and associated web application leveraging mixed-solvent molecular dynamics (MD) simulations using benzene as a hydrophobic probe to detect cryptic pockets. Our all-atom MD-based workflow was systematically tested on 18 different systems and 5 additional kinases and represents the largest validation study of this kind. CrypticScout identifies benzene probe binding hotspots on a protein surface by mapping probe occupancy, residence time, and the benzene occupancy reweighted by the residence time. The method is presented to the scientific community in a web application available via www.playmolecule.org using a distributed computing infrastructure to perform the simulations.
Subject(s)
Molecular Dynamics Simulation; Solvents; Binding Sites; Hydrophobic and Hydrophilic Interactions; Ligands
ABSTRACT
SkeleDock is a scaffold docking algorithm which uses the structure of a protein-ligand complex as a template to model the binding mode of a chemically similar system. This algorithm was evaluated in the D3R Grand Challenge 4 pose prediction challenge, where it achieved competitive performance. Furthermore, we show that if crystallized fragments of the target ligand are available then SkeleDock can outperform rDock docking software at predicting the binding mode. This Application Note also addresses the capacity of this algorithm to model macrocycles and deal with scaffold hopping. SkeleDock can be accessed at https://playmolecule.org/SkeleDock/.
Subject(s)
Drug Design; Binding Sites; Crystallography, X-Ray; Databases, Protein; Ligands; Molecular Docking Simulation; Protein Binding; Protein Conformation; Thermodynamics
ABSTRACT
Coarse graining enables the investigation of molecular dynamics for larger systems and at longer timescales than is possible at an atomic resolution. However, a coarse graining model must be formulated such that the conclusions we draw from it are consistent with the conclusions we would draw from a model at a finer level of detail. It has been proved that a force matching scheme defines a thermodynamically consistent coarse-grained model for an atomistic system in the variational limit. Wang et al. [ACS Cent. Sci. 5, 755 (2019)] demonstrated that the existence of such a variational limit enables the use of a supervised machine learning framework to generate a coarse-grained force field, which can then be used for simulation in the coarse-grained space. Their framework, however, requires the manual input of molecular features to machine learn the force field. In the present contribution, we build upon the advance of Wang et al. and introduce a hybrid architecture for the machine learning of coarse-grained force fields that learn their own features via a subnetwork that leverages continuous filter convolutions on a graph neural network architecture. We demonstrate that this framework succeeds at reproducing the thermodynamics for small biomolecular systems. Since the learned molecular representations are inherently transferable, the architecture presented here sets the stage for the development of machine-learned, coarse-grained force fields that are transferable across molecular systems.
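For reference, the force matching objective mentioned in this abstract is commonly written as a mean squared residual between the mapped atomistic forces and the forces of the coarse-grained model. The notation below is an assumption of this note, loosely following the cited work of Wang et al., and none of the symbols are defined in the abstract itself.

```latex
L(\theta) = \frac{1}{3nM} \sum_{i=1}^{M}
  \left\| \xi_F\big(\mathbf{f}(\mathbf{r}_i)\big)
        + \nabla U\big(\xi(\mathbf{r}_i); \theta\big) \right\|^{2}
```

Here $\mathbf{r}_i$ are $M$ sampled atomistic configurations with forces $\mathbf{f}(\mathbf{r}_i)$, $\xi$ maps atomistic coordinates to $n$ coarse-grained beads, $\xi_F$ is the corresponding force mapping, and $U(\cdot;\theta)$ is the learned coarse-grained potential. Because the coarse-grained force is $-\nabla U$, the residual appears as a sum inside the norm.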
ABSTRACT
Chemical space is impractically large, and conventional structure-based virtual screening techniques cannot be used to simply search through the entire space to discover effective bioactive molecules. To address this shortcoming, we propose a generative adversarial network to generate, rather than search, diverse three-dimensional ligand shapes complementary to the pocket. Furthermore, we show that the generated molecule shapes can be decoded using a shape-captioning network into a sequence of SMILES, directly enabling structure-based de novo drug design. We evaluate the quality of the method by both structure- (docking) and ligand-based [quantitative structure-activity relationship (QSAR)] virtual screening methods. For both evaluation approaches, we observed enrichment compared to random sampling from the initial chemical space of ZINC drug-like compounds.
Subject(s)
Drug Design; Drug Discovery; Models, Chemical; Neural Networks, Computer; Proteins/chemistry; Small Molecule Libraries/chemistry; Humans; Ligands; Molecular Conformation; Proteins/metabolism; Quantitative Structure-Activity Relationship; Small Molecule Libraries/metabolism
ABSTRACT
In this work, we propose a machine learning approach to generate novel molecules starting from a seed compound, its three-dimensional (3D) shape, and its pharmacophoric features. The pipeline draws inspiration from generative models used in image analysis and represents a first example of the de novo design of lead-like molecules guided by shape-based features. A variational autoencoder is used to perturb the 3D representation of a compound, followed by a system of convolutional and recurrent neural networks that generate a sequence of SMILES tokens. The generative design of novel scaffolds and functional groups can cover unexplored regions of chemical space that still possess lead-like properties.
Subject(s)
Machine Learning; Pharmaceutical Preparations/chemistry; Drug Design; Hydrogen Bonding; Hydrophobic and Hydrophilic Interactions; Models, Molecular; Molecular Conformation; Molecular Structure; Quantitative Structure-Activity Relationship
ABSTRACT
Fast and accurate molecular force field (FF) parameterization is still an unsolved problem. Accurate FFs are not generally available for all molecules, such as novel druglike molecules. While methods based on quantum mechanics (QM) exist to parameterize them with better accuracy, they are computationally expensive and slow, which limits applicability to a small number of molecules. Here, we present an automated FF parameterization method which can utilize either density functional theory (DFT) calculations or approximate QM energies produced by different neural network potentials (NNPs) to obtain improved parameters for molecules. We demonstrate that for the case of the torchani-ANI-1x NNP, we can parameterize small molecules in a fraction of the time compared with an equivalent parameterization using DFT QM calculations while producing more accurate parameters than FF (GAFF2). We expect our method to be of critical importance in computational structure-based drug discovery (SBDD). The current version is available at PlayMolecule (www.playmolecule.org) and implemented in HTMD, allowing molecules to be parameterized with different QM and NNP options.
Subject(s)
Density Functional Theory; Neural Networks, Computer; Models, Molecular; Molecular Conformation
ABSTRACT
Drug discovery suffers from high attrition because compounds initially deemed as promising can later show ineffectiveness or toxicity resulting from a poor understanding of their activity profile. In this work, we describe a deep self-normalizing neural network model for the prediction of molecular pathway association and evaluate its performance, showing an AUC ranging from 0.69 to 0.91 on a set of compounds extracted from ChEMBL and from 0.81 to 0.83 on an external data set provided by Novartis. We finally discuss the applicability of the proposed model in the domain of lead discovery. A usable application is available via PlayMolecule.org.
Subject(s)
Neural Networks, Computer; Drug Discovery/methods
ABSTRACT
The widely expressed G-protein coupled receptors (GPCRs) are versatile signal transducer proteins that are attractive drug targets but structurally challenging to study. GPCRs undergo a number of conformational rearrangements when transitioning from the inactive to the active state but have so far been believed to adopt a fairly conserved inactive conformation. Using 19F-NMR spectroscopy and advanced molecular dynamics simulations we describe a novel inactive state of the adenosine 2A receptor which is stabilised by the aminotriazole antagonist Cmpd-1. We demonstrate that the ligand stabilises a unique conformation of helix V and present data on the putative binding mode of the compound involving contacts to the transmembrane bundle as well as the extracellular loop 2.