1.
ACS Omega ; 8(25): 23148-23167, 2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37396211

ABSTRACT

Molecular generative artificial intelligence is drawing significant attention in the drug design community, with several experimentally validated proofs of concept already published. Nevertheless, generative models are known for sometimes generating unrealistic, unstable, unsynthesizable, or uninteresting structures. This calls for methods to constrain those algorithms to generate structures in drug-like portions of the chemical space. While the concept of applicability domains for predictive models is well studied, its counterpart for generative models is not yet well-defined. In this work, we empirically examine various possibilities and propose applicability domains suited for generative models. Using both public and internal data sets, we use generative methods to generate novel structures that are predicted to be active by a corresponding quantitative structure-activity relationship (QSAR) model while constraining the generative model to stay within a given applicability domain. Our work looks at several applicability domain definitions, combining various criteria, such as structural similarity to the training set, similarity of physicochemical properties, unwanted substructures, and quantitative estimate of drug-likeness. We assess the structures generated from both qualitative and quantitative points of view and find that the applicability domain definitions have a strong influence on the drug-likeness of generated molecules. An extensive analysis of our results allows us to identify applicability domain definitions that are best suited for generating drug-like molecules with generative models. We anticipate that this work will help foster the adoption of generative models in an industrial context.
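
The criteria listed in this abstract (structural similarity to the training set, physicochemical property ranges, unwanted substructures, and a drug-likeness estimate) could be combined into a single membership test. The following is only a minimal sketch under stated assumptions, not the authors' definition: fingerprints are represented as plain sets of bit indices, the QED value and the substructure-alert flag are assumed to be computed elsewhere, and the class name, field names, and thresholds are all hypothetical.

```python
from dataclasses import dataclass


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class ApplicabilityDomain:
    """Hypothetical applicability domain combining four criteria."""
    train_fps: list               # fingerprints of the training molecules
    prop_ranges: dict             # property name -> (min, max) in the training set
    min_similarity: float = 0.4   # assumed threshold, not taken from the paper
    min_qed: float = 0.5          # assumed threshold, not taken from the paper

    def contains(self, fp, props, qed, has_alert) -> bool:
        # Criterion 1: no unwanted substructure (flag computed elsewhere).
        if has_alert:
            return False
        # Criterion 2: drug-likeness estimate above the chosen cutoff.
        if qed < self.min_qed:
            return False
        # Criterion 3: every property lies inside the training-set range.
        for name, (low, high) in self.prop_ranges.items():
            if not low <= props[name] <= high:
                return False
        # Criterion 4: close enough to the nearest training molecule.
        nearest = max((tanimoto(fp, t) for t in self.train_fps), default=0.0)
        return nearest >= self.min_similarity
```

In a generation loop, a molecule that fails `contains` would be rejected or penalized in the reward. Which criteria to combine, and with what cutoffs, is the question the paper studies empirically.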

2.
J Am Chem Soc ; 144(32): 14722-14730, 2022 Aug 17.
Article in English | MEDLINE | ID: mdl-35939717

ABSTRACT

Synthetic yield prediction using machine learning is intensively studied. Previous work has focused on two categories of data sets: high-throughput experimentation data, as an ideal case study, and data sets extracted from proprietary databases, which are known to have a strong reporting bias toward high yields. However, predicting yields using published reaction data remains elusive. To fill the gap, we built a data set on nickel-catalyzed cross-couplings extracted from organic reaction publications, including scope and optimization information. We demonstrate the importance of including optimization data as a source of failed experiments and emphasize how publication constraints shape the exploration of the chemical space by the synthetic community. While machine learning models still fail to perform out-of-sample predictions, this work shows that adding chemical knowledge enables fair predictions in a low-data regime. Ultimately, we hope that this unique public database will foster further improvements of machine learning methods for reaction yield prediction in a more realistic context.


Subject(s)
Machine Learning , Nickel , Catalysis
3.
J Cheminform ; 14(1): 20, 2022 Apr 01.
Article in English | MEDLINE | ID: mdl-35365218

ABSTRACT

Despite growing interest and success in automated in-silico molecular design, questions remain regarding the ability of goal-directed generation algorithms to perform unbiased exploration of novel chemical spaces. A specific phenomenon has recently been highlighted: goal-directed generation guided with machine learning models produces molecules with high scores according to the optimization model, but low scores according to control models, even when trained on the same data distribution and the same target. In this work, we show that this worrisome behavior is actually due to issues with the predictive models and not the goal-directed generation algorithms. We show that with appropriate predictive models, this issue can be resolved, and molecules generated have high scores according to both the optimization and the control models.
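
The phenomenon described in this abstract can be quantified by scoring the same generated molecules with both models and comparing the averages. The snippet below is a minimal sketch, assuming each model is a callable that maps a molecule to a score. The function name `score_gap` and its signature are hypothetical and are not taken from the paper.

```python
from statistics import mean


def score_gap(molecules, optimization_model, control_model):
    """Return (mean optimization score, mean control score, difference).

    A large positive difference means the generator found molecules that
    please the model it was optimized against but not an independently
    trained control model.
    """
    opt_mean = mean(optimization_model(m) for m in molecules)
    ctl_mean = mean(control_model(m) for m in molecules)
    return opt_mean, ctl_mean, opt_mean - ctl_mean
```

A gap close to zero, with both means high, corresponds to the resolved case reported in the abstract.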

5.
J Chem Inf Model ; 60(12): 5637-5646, 2020 Dec 28.
Article in English | MEDLINE | ID: mdl-33301333

ABSTRACT

One of the major applications of generative models for drug discovery targets the lead-optimization phase. During the optimization of a lead series, it is common to have scaffold constraints imposed on the structure of the molecules designed. Without enforcing such constraints, the probability of generating molecules with the required scaffold is extremely low and hinders the practicality of generative models for de novo drug design. To tackle this issue, we introduce a new algorithm, named SAMOA (Scaffold Constrained Molecular Generation), to perform scaffold-constrained in silico molecular design. We build on the well-known SMILES-based Recurrent Neural Network (RNN) generative model, with a modified sampling procedure to achieve scaffold-constrained generation. We directly benefit from the associated reinforcement learning methods, which allow the design of molecules optimized for different properties while exploring only the relevant chemical space. We showcase the method's ability to perform scaffold-constrained generation on various tasks: designing novel molecules around scaffolds extracted from SureChEMBL chemical series, generating novel active molecules on the Dopamine Receptor D2 (DRD2) target, and finally, designing predicted actives on the MMP-12 series, an industrial lead-optimization project.


Subject(s)
Drug Design , Neural Networks (Computer) , Algorithms , Drug Discovery , Probability
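
The "modified sampling procedure" mentioned in this abstract can be illustrated with a toy example. This is a sketch of the general idea only, not the SAMOA implementation: scaffold tokens are copied as-is, and the model samples only at positions marked as open. The `*` marker, the `STOP` token, the `toy_model` stand-in for a trained RNN, and the `max_len` cap are all assumptions made for illustration, and the toy does not check that the resulting string is a chemically valid SMILES.

```python
import random

STOP = "<end>"  # hypothetical token meaning "this decoration is finished"


def toy_model(prefix, rng):
    """Stand-in for a trained RNN: returns the next token.

    A real model would condition on the prefix; this toy ignores it.
    """
    return rng.choice(["C", "N", "O", STOP])


def sample_with_scaffold(scaffold, model, rng, max_len=5):
    """Generate a string that keeps every fixed scaffold token.

    `scaffold` is a list of tokens; "*" marks an open position where the
    model may add up to `max_len` sampled tokens.
    """
    out = []
    for token in scaffold:
        if token != "*":
            out.append(token)  # fixed scaffold token: forced, never sampled
            continue
        for _ in range(max_len):  # open position: let the model decorate
            nxt = model(out, rng)
            if nxt == STOP:
                break
            out.append(nxt)
    return "".join(out)


# Example: decorate a benzene ring at one open position.
print(sample_with_scaffold(list("c1ccccc1*"), toy_model, random.Random(0)))
```

Because the fixed tokens are forced rather than sampled, every output contains the scaffold by construction, which is what makes this kind of sampling practical compared with filtering unconstrained samples.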