Search | VHL Regional Portal

Accelerating drug target inhibitor discovery with a deep generative foundation model.

Chenthamarakshan, Vijil; Hoffman, Samuel C; Owen, C David; Lukacik, Petra; Strain-Damerell, Claire; Fearon, Daren; Malla, Tika R; Tumber, Anthony; Schofield, Christopher J; Duyvesteyn, Helen M E; Dejnirattisai, Wanwisa; Carrique, Loic; Walter, Thomas S; Screaton, Gavin R; Matviiuk, Tetiana; Mojsilovic, Aleksandra; Crain, Jason; Walsh, Martin A; Stuart, David I; Das, Payel.

Sci Adv ; 9(25): eadg7865, 2023 06 23.

Article in English | MEDLINE | ID: mdl-37343087

ABSTRACT

Inhibitor discovery for emerging drug-target proteins is challenging, especially when target structure or active molecules are unknown. Here, we experimentally validate the broad utility of a deep generative framework trained at scale on protein sequences, small molecules, and their mutual interactions, without bias toward any specific target. We performed protein sequence-conditioned sampling on the generative foundation model to design small-molecule inhibitors for two dissimilar targets: the spike protein receptor-binding domain (RBD) and the main protease from SARS-CoV-2. Despite using only the target sequence information during model inference, micromolar-level inhibition was observed in vitro for two candidates out of four synthesized for each target. The most potent spike RBD inhibitor exhibited activity against several variants in live virus neutralization assays. These results establish that a single, broadly deployable generative foundation model for accelerated inhibitor discovery is effective and efficient, even in the absence of target structure or binder information.

Subject(s)

Antiviral Antibodies , COVID-19 , Humans , Antiviral Antibodies/chemistry , SARS-CoV-2/metabolism , Protein Binding , Amino Acid Sequence

An end-to-end deep learning framework for translating mass spectra to de-novo molecules.

Litsa, Eleni E; Chenthamarakshan, Vijil; Das, Payel; Kavraki, Lydia E.

Commun Chem ; 6(1): 132, 2023 Jun 23.

Article in English | MEDLINE | ID: mdl-37353554

ABSTRACT

Elucidating the structure of a chemical compound is a fundamental task in chemistry with applications in multiple domains including drug discovery, precision medicine, and biomarker discovery. The common practice for elucidating the structure of a compound is to obtain a mass spectrum and subsequently retrieve its structure from spectral databases. However, these methods fail for novel molecules that are not present in the reference database. We propose Spec2Mol, a deep learning architecture for molecular structure recommendation given mass spectra alone. Spec2Mol is inspired by the Speech2Text deep learning architectures for translating audio signals into text. Our approach is based on an encoder-decoder architecture. The encoder learns the spectra embeddings, while the decoder, pre-trained on a massive dataset of chemical structures for translating between different molecular representations, reconstructs SMILES sequences of the recommended chemical structures. We have evaluated Spec2Mol by assessing the molecular similarity between the recommended structures and the original structure. Our analysis showed that Spec2Mol is able to identify the presence of key molecular substructures from a compound's mass spectrum and performs on par with existing fragmentation tree methods, particularly when the test structure is neither available during training nor present in the reference database.
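
The Spec2Mol abstract above evaluates recommendations by molecular similarity to the original structure but does not name the metric here. As a minimal sketch, assuming a Tanimoto coefficient over fingerprint on-bits (a common choice, not confirmed by the abstract), ranking candidate structures against a reference could look like the following. The bit sets and the helper names `tanimoto` and `best_match` are hypothetical and purely illustrative.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def best_match(reference: set, candidates: list) -> tuple:
    """Return (index, similarity) of the candidate most similar to the reference."""
    scores = [tanimoto(reference, c) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical on-bit sets standing in for real molecular fingerprints.
reference = {1, 4, 7, 9}
candidates = [{1, 4, 8}, {1, 4, 7, 9, 12}, {2, 3}]

index, score = best_match(reference, candidates)
print(index, round(score, 3))
```

In practice the bit sets would come from a cheminformatics toolkit applied to the decoded SMILES strings; that step is omitted so the sketch stays dependency-free.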

A Small Step Toward Generalizability: Training a Machine Learning Scoring Function for Structure-Based Virtual Screening.

Scantlebury, Jack; Vost, Lucy; Carbery, Anna; Hadfield, Thomas E; Turnbull, Oliver M; Brown, Nathan; Chenthamarakshan, Vijil; Das, Payel; Grosjean, Harold; von Delft, Frank; Deane, Charlotte M.

J Chem Inf Model ; 63(10): 2960-2974, 2023 05 22.

Article in English | MEDLINE | ID: mdl-37166179

ABSTRACT

Over the past few years, many machine learning-based scoring functions for predicting the binding of small molecules to proteins have been developed. Their objective is to approximate the distribution that takes two molecules as input and outputs the energy of their interaction. Only a scoring function that accounts for the interatomic interactions involved in binding can accurately predict binding affinity on unseen molecules. However, many scoring functions make predictions based on data set biases rather than an understanding of the physics of binding. These scoring functions perform well when tested on targets similar to those in the training set but fail to generalize to dissimilar targets. To test what a machine learning-based scoring function has learned, input attribution, a technique for learning which features are important to a model when making a prediction on a particular data point, can be applied. If a model successfully learns something beyond data set biases, attribution should give insight into the important binding interactions that are taking place. We built a machine learning-based scoring function that aimed to avoid the influence of bias via thorough train and test data set filtering and show that it achieves performance comparable to other leading methods on the Comparative Assessment of Scoring Functions, 2016 (CASF-2016) benchmark. We then use the CASF-2016 test set to perform attribution and find that the bonds identified as important by our scoring function, PointVS, unlike those extracted from other scoring functions, have a high correlation with those found by a distance-based interaction profiler. We then show that attribution can be used to extract important binding pharmacophores from a given protein target when supplied with a number of bound structures. We use this information to perform fragment elaboration and see improvements in docking scores compared to using structural information from a traditional, data-based approach. This not only provides definitive proof that the scoring function has learned to identify some important binding interactions but also constitutes the first deep learning-based method for extracting structural information from a target for molecule design.

Subject(s)

Machine Learning , Proteins , Protein Binding , Ligands , Proteins/chemistry , Protein Databases , Molecular Docking Simulation

Accurate clinical toxicity prediction using multi-task deep neural nets and contrastive molecular explanations.

Sharma, Bhanushee; Chenthamarakshan, Vijil; Dhurandhar, Amit; Pereira, Shiranee; Hendler, James A; Dordick, Jonathan S; Das, Payel.

Sci Rep ; 13(1): 4908, 2023 03 25.

Article in English | MEDLINE | ID: mdl-36966203

ABSTRACT

Explainable machine learning for molecular toxicity prediction is a promising approach for efficient drug development and chemical safety. A predictive ML model of toxicity can reduce experimental cost and time while mitigating ethical concerns by significantly reducing animal and clinical testing. Herein, we use a deep learning framework for simultaneously modeling in vitro, in vivo, and clinical toxicity data. Two different molecular input representations are used: Morgan fingerprints and pre-trained SMILES embeddings. A multi-task deep learning model accurately predicts toxicity for all endpoints, including clinical, as indicated by the area under the receiver operating characteristic curve and balanced accuracy. In particular, pre-trained molecular SMILES embeddings as input to the multi-task model improved clinical toxicity predictions compared to existing models in the MoleculeNet benchmark. Additionally, our multi-task approach is comprehensive in the sense that it is comparable to state-of-the-art approaches for specific endpoints in in vitro, in vivo, and clinical platforms. Through both the multi-task model and transfer learning, we were able to indicate that clinical toxicity predictions have minimal need of in vivo data. To provide confidence and explain the model's predictions, we adapt a post-hoc contrastive explanation method that returns pertinent positive and negative features, which correspond well to known mutagenic and reactive toxicophores, such as unsubstituted bonded heteroatoms, aromatic amines, and Michael acceptors. Furthermore, toxicophore recovery by pertinent feature analysis captures more of the in vitro (53%) and in vivo (56%) endpoints than of the clinical (8%) endpoints, and indeed uncovers a preference in known toxicophore data towards in vitro and in vivo experimental data. To our knowledge, this is the first contrastive explanation, using both present and absent substructures, for predictions of clinical and in vivo molecular toxicity.

Subject(s)

Amines , Chemical Safety , Animals , Benchmarking , Drug Development , Knowledge
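
The toxicity abstract above reports performance as the area under the receiver operating characteristic curve and as balanced accuracy. As a minimal, dependency-free sketch of those two metrics (not the authors' evaluation code), the following computes both for one binary endpoint. The labels, scores, and the 0.5 decision threshold are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks a random
    negative, counting ties as one half (pairwise definition)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels, scores, threshold=0.5):
    """Mean of sensitivity and specificity at a fixed (assumed) threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    return 0.5 * (tp / n_pos + tn / n_neg)


# Hypothetical toxic (1) / non-toxic (0) labels and model scores.
labels = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]

auc = roc_auc(labels, scores)
bal_acc = balanced_accuracy(labels, scores)
print(round(auc, 3), round(bal_acc, 3))
```

Balanced accuracy is often preferred over plain accuracy when toxic compounds are rare, because it weights the two classes equally regardless of their frequency.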

Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design.

Cao, Yue; Das, Payel; Chenthamarakshan, Vijil; Chen, Pin-Yu; Melnyk, Igor; Shen, Yang.

Proc Mach Learn Res ; 139: 1261-1271, 2021 Jul.

Article in English | MEDLINE | ID: mdl-34423306

ABSTRACT

Designing novel protein sequences for a desired 3D topological fold is a fundamental yet nontrivial task in protein engineering. Challenges exist due to the complex sequence-fold relationship, as well as the difficulty of capturing the diversity of the sequences (and therefore of structures and functions) within a fold. To overcome these challenges, we propose Fold2Seq, a novel transformer-based generative framework for designing protein sequences conditioned on a specific target fold. To model the complex sequence-structure relationship, Fold2Seq jointly learns a sequence embedding using a transformer and a fold embedding from the density of secondary structural elements in 3D voxels. On test sets with single, high-resolution, and complete structure inputs for individual folds, our experiments demonstrate improved or comparable performance of Fold2Seq in terms of speed, coverage, and reliability for sequence design, when compared to existing state-of-the-art methods that include data-driven deep generative models and physics-based RosettaDesign. The unique advantages of fold-based Fold2Seq, in comparison to a structure-based deep model and RosettaDesign, become more evident on three additional real-world challenges originating from low-quality, incomplete, or ambiguous input structures. Source code and data are available at https://github.com/IBM/fold2seq.
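
The Fold2Seq abstract above describes a fold representation built from the density of secondary structural elements in 3D voxels. A rough sketch of that idea, assuming plain per-voxel residue counts, a three-letter label set, and a fixed cubic box (none of which is specified in the abstract, and which may differ from the released code), is shown below. The function name `voxelize` and all parameters are hypothetical.

```python
from collections import defaultdict

SSE_TYPES = ("H", "E", "C")  # helix, strand, coil: an assumed label set


def voxelize(residues, box=40.0, bins=4):
    """Count residues of each secondary-structure type per voxel.

    residues: iterable of (x, y, z, sse_label) with coordinates already
    shifted into [0, box). Out-of-range coordinates are clamped to the
    nearest edge voxel.
    """
    size = box / bins
    grid = defaultdict(lambda: [0] * len(SSE_TYPES))
    for x, y, z, sse in residues:
        idx = tuple(max(0, min(int(c // size), bins - 1)) for c in (x, y, z))
        grid[idx][SSE_TYPES.index(sse)] += 1
    return dict(grid)


# Hypothetical residue positions (in angstroms) with secondary-structure labels.
residues = [
    (1.0, 2.0, 3.0, "H"),
    (5.0, 5.0, 5.0, "H"),
    (12.0, 3.0, 3.0, "E"),
    (39.9, 39.9, 39.9, "C"),
]

grid = voxelize(residues)
for voxel, counts in sorted(grid.items()):
    print(voxel, dict(zip(SSE_TYPES, counts)))
```

Because the grid records only where each type of element sits, not the exact backbone geometry, a representation of this kind can tolerate incomplete or low-resolution input, which is consistent with the robustness the abstract reports.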

Author Correction: Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations.

Das, Payel; Sercu, Tom; Wadhawan, Kahini; Padhi, Inkit; Gehrmann, Sebastian; Cipcigan, Flaviu; Chenthamarakshan, Vijil; Strobelt, Hendrik; Dos Santos, Cicero; Chen, Pin-Yu; Yang, Yi Yan; Tan, Jeremy P K; Hedrick, James; Crain, Jason; Mojsilovic, Aleksandra.

Nat Biomed Eng ; 5(8): 942, 2021 Aug.

Article in English | MEDLINE | ID: mdl-34183803

Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations.

Nat Biomed Eng ; 5(6): 613-623, 2021 06.

Article in English | MEDLINE | ID: mdl-33707779

ABSTRACT

The de novo design of antimicrobial therapeutics involves the exploration of a vast chemical repertoire to find compounds with broad-spectrum potency and low toxicity. Here, we report an efficient computational method for the generation of antimicrobials with desired attributes. The method leverages guidance from classifiers trained on an informative latent space of molecules modelled using a deep generative autoencoder, and screens the generated molecules using deep-learning classifiers as well as physicochemical features derived from high-throughput molecular dynamics simulations. Within 48 days, we identified, synthesized and experimentally tested 20 candidate antimicrobial peptides, of which two displayed high potency against diverse Gram-positive and Gram-negative pathogens (including multidrug-resistant Klebsiella pneumoniae) and a low propensity to induce drug resistance in Escherichia coli. Both peptides have low toxicity, as validated in vitro and in mice. We also show using live-cell confocal imaging that the bactericidal mode of action of the peptides involves the formation of membrane pores. The combination of deep learning and molecular dynamics may accelerate the discovery of potent and selective broad-spectrum antimicrobials.

Subject(s)

Anti-Bacterial Agents/pharmacology , Antimicrobial Cationic Peptides/pharmacology , Deep Learning , Drug Design , Drug Discovery/methods , Bacterial Drug Resistance/drug effects , Acinetobacter baumannii/drug effects , Acinetobacter baumannii/growth & development , Acinetobacter baumannii/ultrastructure , Amino Acid Sequence , Animals , Anti-Bacterial Agents/chemical synthesis , Antimicrobial Cationic Peptides/chemical synthesis , Escherichia coli/drug effects , Escherichia coli/growth & development , Escherichia coli/ultrastructure , Female , Klebsiella Infections/drug therapy , Klebsiella pneumoniae/drug effects , Klebsiella pneumoniae/growth & development , Klebsiella pneumoniae/ultrastructure , Mice , Inbred BALB C Mice , Microbial Sensitivity Tests , Molecular Dynamics Simulation , Pseudomonas aeruginosa/drug effects , Pseudomonas aeruginosa/growth & development , Pseudomonas aeruginosa/ultrastructure , Staphylococcus aureus/drug effects , Staphylococcus aureus/growth & development , Staphylococcus aureus/ultrastructure , Structure-Activity Relationship
