Results 1 - 18 of 18
1.
Nature ; 555(7698): 604-610, 2018 03 28.
Article in English | MEDLINE | ID: mdl-29595767

ABSTRACT

To plan the syntheses of small organic molecules, chemists use retrosynthesis, a problem-solving technique in which target molecules are recursively transformed into increasingly simpler precursors. Computer-aided retrosynthesis would be a valuable tool but at present it is slow and provides results of unsatisfactory quality. Here we use Monte Carlo tree search and symbolic artificial intelligence (AI) to discover retrosynthetic routes. We combined Monte Carlo tree search with an expansion policy network that guides the search, and a filter network to pre-select the most promising retrosynthetic steps. These deep neural networks were trained on essentially all reactions ever published in organic chemistry. Our system solves for almost twice as many molecules, thirty times faster than the traditional computer-aided search method, which is based on extracted rules and hand-designed heuristics. In a double-blind AB test, chemists on average considered our computer-generated routes to be equivalent to reported literature routes.


Subject(s)
Artificial Intelligence; Chemistry Techniques, Synthetic/methods; Neural Networks, Computer; Chemistry, Organic/methods; Monte Carlo Method
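
The abstract of entry 1 describes a tree search guided by a policy network. As a rough illustration of how a learned prior can steer child selection in Monte Carlo tree search, the sketch below uses a PUCT-style score (mean value plus a prior-weighted exploration bonus). This scoring rule, the `Node` class, the constant `c`, and the reaction names are illustrative assumptions, not the formula or code used in the paper.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Search-tree node; `prior` stands in for a policy-network probability (assumed)."""
    prior: float
    visits: int = 0
    value_sum: float = 0.0
    children: Dict[str, "Node"] = field(default_factory=dict)

    def mean_value(self) -> float:
        # Average reward seen below this node; 0.0 before the first visit.
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c: float = 1.0) -> float:
    # Exploitation term plus an exploration bonus that shrinks as the child is visited.
    exploration = c * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.mean_value() + exploration


def select_child(parent: Node, c: float = 1.0) -> str:
    # Return the name of the child with the highest PUCT-style score.
    return max(parent.children, key=lambda name: puct_score(parent, parent.children[name], c))


def backup(path: List[Node], reward: float) -> None:
    # Propagate a rollout reward to every node on the visited path.
    for node in path:
        node.visits += 1
        node.value_sum += reward


# Hypothetical retrosynthetic steps with made-up priors.
root = Node(prior=1.0, visits=4)
root.children = {"amide_coupling": Node(prior=0.7), "suzuki_coupling": Node(prior=0.3)}
first_choice = select_child(root)
```

With these made-up numbers the higher-prior step is tried first; if it repeatedly returns zero reward, its exploration bonus decays and the search moves to the alternative.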
2.
J Chem Inf Model ; 63(15): 4497-4504, 2023 08 14.
Article in English | MEDLINE | ID: mdl-37487018

ABSTRACT

Machine-learning and deep-learning models have been extensively used in cheminformatics to predict molecular properties, to reduce the need for direct measurements, and to accelerate compound prioritization. However, different setups and frameworks and the large number of molecular representations make it difficult to properly evaluate, reproduce, and compare them. Here we present a new PREdictive modeling FramEwoRk for molecular discovery (PREFER), written in Python (version 3.7.7) and based on AutoSklearn (version 0.14.7), that allows comparison between different molecular representations and common machine-learning models. We provide an overview of the design of our framework and show exemplary use cases and results of several representation-model combinations on diverse data sets, both public and in-house. Finally, we discuss the use of PREFER on small data sets. The code of the framework is freely available on GitHub.


Subject(s)
Cheminformatics; Machine Learning
3.
J Chem Inf Model ; 62(10): 2293-2300, 2022 05 23.
Article in English | MEDLINE | ID: mdl-35452226

ABSTRACT

De novo molecule design algorithms often result in chemically unfeasible or synthetically inaccessible molecules. A natural idea to mitigate this problem is to bias these algorithms toward more easily synthesizable molecules using a proxy score for synthetic accessibility. However, using currently available proxies can still result in highly unrealistic compounds. Here, we propose a novel approach, RetroGNN, to estimate synthesizability. First, we search for routes using synthesis planning software for a large number of random molecules. This information is then used to train a graph neural network to predict the outcome of the synthesis planner given the target molecule, in which the regression task can be used as a synthesizability scorer. We highlight how RetroGNN can be used in generative molecule-discovery pipelines together with other scoring functions. We evaluate our approach on several QSAR-based molecule design benchmarks, for which we find synthesizable molecules with state-of-the-art scores. Compared to the virtual screening of 5 million existing molecules from the ZINC database, using RetroGNNScore with a simple fragment-based de novo design algorithm finds molecules predicted to be more likely to possess the desired activity exponentially faster, while maintaining good druglike properties and being easier to synthesize. Importantly, our deep neural network can successfully filter out hard-to-synthesize molecules while achieving a 10^5 times speedup over using retrosynthesis planning software.


Subject(s)
Drug Design; Software; Algorithms; Neural Networks, Computer
4.
J Chem Inf Model ; 62(9): 2111-2120, 2022 05 09.
Article in English | MEDLINE | ID: mdl-35034452

ABSTRACT

Finding synthesis routes for molecules of interest is essential in the discovery of new drugs and materials. To find such routes, computer-assisted synthesis planning (CASP) methods are employed, which rely on a single-step model of chemical reactivity. In this study, we introduce a template-based single-step retrosynthesis model based on Modern Hopfield Networks, which learn an encoding of both molecules and reaction templates in order to predict the relevance of templates for a given molecule. The template representation allows generalization across different reactions and significantly improves the performance of template relevance prediction, especially for templates with few or zero training examples. With inference speed up to orders of magnitude faster than baseline methods, we improve or match the state-of-the-art performance for top-k exact match accuracy for k ≥ 3 in the retrosynthesis benchmark USPTO-50k. Code to reproduce the results is available at github.com/ml-jku/mhn-react.
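
The abstract of entry 4 describes scoring reaction templates against an encoded molecule. A minimal sketch of that idea, assuming the association step reduces to a softmax over scaled dot products between a molecule vector and stored template vectors (the retrieval rule commonly associated with modern Hopfield networks). The function names, the tiny two-dimensional vectors, and the `beta` values are hypothetical; the actual model, with its learned encoders, is in the repository the abstract cites.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float], beta: float = 1.0) -> List[float]:
    # Numerically stable softmax of beta * scores (beta > 0 sharpens the distribution).
    top = max(scores)
    exps = [math.exp(beta * (s - top)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def template_relevance(
    mol_vec: Sequence[float],
    template_vecs: Sequence[Sequence[float]],
    beta: float = 1.0,
) -> List[float]:
    # Dot-product similarity between the molecule encoding and each template encoding,
    # turned into a probability distribution over templates.
    scores = [sum(m * t for m, t in zip(mol_vec, tvec)) for tvec in template_vecs]
    return softmax(scores, beta)


# Made-up encodings: one molecule and three templates.
molecule = [1.0, 0.0]
templates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
relevance = template_relevance(molecule, templates, beta=1.0)
sharp_relevance = template_relevance(molecule, templates, beta=5.0)
```

Because templates are compared in a shared embedding space, a template with few training examples can still receive a high score if its encoding lies close to the molecule's, which is the generalization effect the abstract points to.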

5.
Chem Soc Rev ; 49(17): 6154-6168, 2020 Sep 01.
Article in English | MEDLINE | ID: mdl-32672294

ABSTRACT

Machine learning (ML) has emerged as a general, problem-solving paradigm with many applications in computer vision, natural language processing, digital safety, or medicine. By recognizing complex patterns in data, ML bears the potential to modernise the way many chemical challenges are approached. In this review, an introduction to ML is given from the perspective of synthetic chemistry: starting from the fundamentals regarding algorithms and best-practice workflows, the review covers different applications of machine learning in synthesis planning, property prediction, molecular design, and reactivity prediction. In particular, different approaches to representing and utilizing organic molecules will be discussed, providing synthetic chemists both with the understanding and the tools required to apply machine learning in the context of their research, and with pointers for further study.

6.
J Chem Inf Model ; 59(3): 1096-1108, 2019 03 25.
Article in English | MEDLINE | ID: mdl-30887799

ABSTRACT

De novo design seeks to generate molecules with required property profiles by virtual design-make-test cycles. With the emergence of deep learning and neural generative models in many application areas, models for molecular design based on neural networks appeared recently and show promising results. However, the new models have not been profiled on consistent tasks, and comparative studies to well-established algorithms have only seldom been performed. To standardize the assessment of both classical and neural models for de novo molecular design, we propose an evaluation framework, GuacaMol, based on a suite of standardized benchmarks. The benchmark tasks encompass measuring the fidelity of the models to reproduce the property distribution of the training sets, the ability to generate novel molecules, the exploration and exploitation of chemical space, and a variety of single and multiobjective optimization tasks. The benchmarking open-source Python code and a leaderboard can be found on https://benevolent.ai/guacamol .


Subject(s)
Benchmarking/methods; Deep Learning; Pharmaceutical Preparations/chemistry; Drug Design; Isomerism; Models, Molecular; Molecular Structure; Monte Carlo Method; Quantitative Structure-Activity Relationship
7.
Chemistry ; 23(25): 5966-5971, 2017 May 02.
Article in English | MEDLINE | ID: mdl-28134452

ABSTRACT

Reaction prediction and retrosynthesis are the cornerstones of organic chemistry. Rule-based expert systems have been the most widespread approach to computationally solve these two related challenges to date. However, reaction rules often fail because they ignore the molecular context, which leads to reactivity conflicts. Herein, we report that deep neural networks can learn to resolve reactivity conflicts and to prioritize the most suitable transformation rules. We show that by training our model on 3.5 million reactions taken from the collective published knowledge of the entire discipline of chemistry, our model exhibits a top10-accuracy of 95 % in retrosynthesis and 97 % for reaction prediction on a validation set of almost 1 million reactions.
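
The top-10 accuracy quoted in entry 7 counts a prediction as correct when the true transformation appears among the model's ten highest-ranked candidates. A minimal sketch of the metric follows; the function name and the toy rule labels are hypothetical and only illustrate the calculation.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    truths: Sequence[str],
    k: int = 10,
) -> float:
    """Fraction of examples whose true label is among the first k ranked predictions."""
    if len(ranked_predictions) != len(truths):
        raise ValueError("ranked_predictions and truths must have the same length")
    if not truths:
        raise ValueError("need at least one example")
    hits = sum(
        1 for preds, truth in zip(ranked_predictions, truths) if truth in preds[:k]
    )
    return hits / len(truths)


# Toy data: each inner list is a model's ranking, best candidate first.
rankings = [["r1", "r2", "r3"], ["r4", "r5", "r6"], ["r7", "r8", "r9"]]
true_rules = ["r2", "r6", "rX"]
acc_at_2 = top_k_accuracy(rankings, true_rules, k=2)
acc_at_3 = top_k_accuracy(rankings, true_rules, k=3)
```

Top-k accuracy can only stay the same or rise as k grows, so a top-10 figure is not directly comparable with a top-1 figure from another study.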

8.
Chemistry ; 23(25): 6118-6128, 2017 May 02.
Article in English | MEDLINE | ID: mdl-27862477

ABSTRACT

The ability to reason beyond established knowledge allows organic chemists to solve synthetic problems and invent novel transformations. Herein, we propose a model that mimics chemical reasoning, and formalises reaction prediction as finding missing links in a knowledge graph. We have constructed a knowledge graph containing 14.4 million molecules and 8.2 million binary reactions, which represents the bulk of all chemical reactions ever published in the scientific literature. Our model outperforms a rule-based expert system in the reaction prediction task for 180 000 randomly selected binary reactions. The data-driven model generalises even beyond known reaction types, and is thus capable of effectively (re-)discovering novel transformations (even including transition metal-catalysed reactions). Our model enables computers to infer hypotheses about reactivity and reactions by only considering the intrinsic local structure of the graph and because each single reaction prediction is typically achieved in a sub-second time frame, the model can be used as a high-throughput generator of reaction hypotheses for reaction discovery.

9.
Chemistry ; 21(34): 12053-60, 2015 Aug 17.
Article in English | MEDLINE | ID: mdl-26212677

ABSTRACT

N-carbamoyl nitrones represent an important class of reagents for the synthesis of a variety of natural and biologically active compounds. These compounds are generally converted into valuable 4-isoxazolines upon cyclization reaction with dipolarophiles. However, these types of N-protected nitrones are highly unstable, which limits their synthesis, storage and practical use, enforcing alternative lengthy or elaborated synthetic routes. In this work, a 2,2,6,6-tetramethylpiperidin-1-oxyl (TEMPO)-mediated formal "dehydrogenation" of N-protected benzyl-, allyl- and alkyl-substituted hydroxylamines followed by in situ trapping of the generated unstable nitrones into N-carbamoyl 4-isoxazolines is presented. A plausible mechanism is also proposed, in which the dipolarophile shows an important assistant role in the generation of the active nitrone intermediate. This simple protocol avoids the problematic isolation of N-carbamoyl protected nitrones, providing new synthetic possibilities in isoxazoline chemistry.

10.
Curr Opin Struct Biol ; 82: 102658, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37473637

ABSTRACT

Computational techniques, including virtual screening, de novo design, and generative models, play an increasing role in expediting DMTA cycles for modern molecular discovery. However, computationally proposed molecules must be synthetically feasible for laboratory testing. In this perspective, we offer a succinct introduction to the subject, and showcase typical workflows to integrate synthesis planning, synthesizability scoring, and molecule generation. Finally, we address limitations and opportunities for future research.

11.
Nat Commun ; 14(1): 6651, 2023 10 31.
Article in English | MEDLINE | ID: mdl-37907461

ABSTRACT

The lead optimization process in drug discovery campaigns is an arduous endeavour where the input of many medicinal chemists is weighed in order to reach a desired molecular property profile. Building the expertise to successfully drive such projects collaboratively is a very time-consuming process that typically spans many years within a chemist's career. In this work we aim to replicate this process by applying artificial intelligence learning-to-rank techniques on feedback that was obtained from 35 chemists at Novartis over the course of several months. We exemplify the usefulness of the learned proxies in routine tasks such as compound prioritization, motif rationalization, and biased de novo drug design. Annotated response data is provided, and developed models and code made available through a permissive open-source license.


Subject(s)
Artificial Intelligence; Chemistry, Pharmaceutical; Chemistry, Pharmaceutical/methods; Intuition; Drug Discovery/methods; Drug Design; Machine Learning
12.
Nat Rev Drug Discov ; 22(11): 895-916, 2023 11.
Article in English | MEDLINE | ID: mdl-37697042

ABSTRACT

Developments in computational omics technologies have provided new means to access the hidden diversity of natural products, unearthing new potential for drug discovery. In parallel, artificial intelligence approaches such as machine learning have led to exciting developments in the computational drug design field, facilitating biological activity prediction and de novo drug design for molecular targets of interest. Here, we describe current and future synergies between these developments to effectively identify drug candidates from the plethora of molecules produced by nature. We also discuss how to address key challenges in realizing the potential of these synergies, such as the need for high-quality datasets to train deep learning algorithms and appropriate strategies for algorithm validation.


Subject(s)
Artificial Intelligence; Biological Products; Humans; Algorithms; Machine Learning; Drug Discovery; Drug Design; Biological Products/pharmacology
13.
Nat Rev Chem ; 6(6): 428-442, 2022 Jun.
Article in English | MEDLINE | ID: mdl-37117429

ABSTRACT

Machine learning (ML) promises to tackle the grand challenges in chemistry and speed up the generation, improvement and/or ordering of research hypotheses. Despite the overarching applicability of ML workflows, one usually finds diverse evaluation study designs. The current heterogeneity in evaluation techniques and metrics leads to difficulty in (or the impossibility of) comparing and assessing the relevance of new algorithms. Ultimately, this may delay the digitalization of chemistry at scale and confuse method developers, experimentalists, reviewers and journal editors. In this Perspective, we critically discuss a set of method development and evaluation guidelines for different types of ML-based publications, emphasizing supervised learning. We provide a diverse collection of examples from various authors and disciplines in chemistry. While taking into account varying accessibility across research groups, our recommendations focus on reporting completeness and standardizing comparisons between tools. We aim to further contribute to improved ML transparency and credibility by suggesting a checklist of retro-/prospective tests and dissecting their importance. We envisage that the wide adoption and continuous update of best practices will encourage an informed use of ML on real-world problems related to the chemical sciences.

14.
J Org Chem ; 76(6): 1945-8, 2011 Mar 18.
Article in English | MEDLINE | ID: mdl-21332240

ABSTRACT

The Ag-catalyzed 1,3-dipolar cycloaddition of (E)-β-borylacrylates with azomethine ylides is described. The resulting 3-borylpyrrolidine derivatives were obtained in high yields and complete endo selectivities using AgOAc/dppe as catalyst system and B(dam) as boryl group. Transformation of the B(dam) group into pinacol borane and oxidation afforded 3-hydroxyproline derivatives in high yields.


Subject(s)
Acrylates/chemistry; Azo Compounds/chemistry; Silver/chemistry; Thiosemicarbazones/chemistry; Catalysis; Stereoisomerism; Substrate Specificity
15.
ACS Cent Sci ; 4(1): 120-131, 2018 Jan 24.
Article in English | MEDLINE | ID: mdl-29392184

ABSTRACT

In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. In order to enrich libraries with molecules active toward a given biological target, we propose to fine-tune the model with small sets of molecules, which are known to be active against that target. Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test molecules that medicinal chemists designed, whereas against Plasmodium falciparum (Malaria), it reproduced 28% of 1240 test molecules. When coupled with a scoring function, our model can perform the complete de novo drug design cycle to generate large sets of novel molecules for drug discovery.

16.
J R Soc Interface ; 15(141), 2018 04.
Article in English | MEDLINE | ID: mdl-29618526

ABSTRACT

Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.


Subject(s)
Biomedical Research/trends; Biomedical Technology/trends; Deep Learning/trends; Algorithms; Biomedical Research/methods; Decision Making; Delivery of Health Care/methods; Delivery of Health Care/trends; Disease/genetics; Drug Design; Electronic Health Records/trends; Humans; Terminology as Topic