Search | VHL Regional Portal

1.

Artificial Intelligence for Retrosynthetic Planning Needs Both Data and Expert Knowledge.

Strieth-Kalthoff, Felix; Szymkuc, Sara; Molga, Karol; Aspuru-Guzik, Alán; Glorius, Frank; Grzybowski, Bartosz A.

J Am Chem Soc ; 2024 Apr 10.

Article in English | MEDLINE | ID: mdl-38598363

ABSTRACT

Rapid advancements in artificial intelligence (AI) have enabled breakthroughs across many scientific disciplines. In organic chemistry, the challenge of planning complex multistep chemical syntheses should conceptually be well-suited for AI. Yet, the development of AI synthesis planners trained solely on reaction-example data has stagnated and is not on par with the performance of "hybrid" algorithms combining AI with expert knowledge. This Perspective examines possible causes of these shortcomings, extending beyond the established reasoning of insufficient quantities of reaction data. Drawing attention to the intricacies and data biases that are specific to the domain of synthetic chemistry, we advocate augmenting the unique capabilities of AI with the knowledge base and the reasoning strategies of domain experts. By actively involving synthetic chemists, who are the end users of any synthesis planning software, in the development process, we envision bridging the gap between computer algorithms and the intricate nature of chemical synthesis.

2.

Chemist Ex Machina: Advanced Synthesis Planning by Computers.

Molga, Karol; Szymkuc, Sara; Grzybowski, Bartosz A.

Acc Chem Res ; 54(5): 1094-1106, 2021 03 02.

Article in English | MEDLINE | ID: mdl-33423460

ABSTRACT

Teaching computers to plan multistep syntheses of arbitrary target molecules, including natural products, has been one of the oldest challenges in chemistry, dating back to the 1960s. This Account recapitulates two decades of our group's work on the software platform called Chematica, which very recently achieved this long-sought objective and has been shown capable of planning synthetic routes to complex natural products, several of which were validated in the laboratory. For the machine to plan syntheses at an expert level, it must know the rules describing chemical reactions and use these rules to expand and search the networks of synthetic options. The rules must be of high quality: they must delineate accurately the scope of admissible substituents, capture all relevant stereochemical information, and detect potential reactivity conflicts and protection requirements. They should yield only those synthons that are chemically stable and energetically allowed (e.g., not too strained) and should be able to extrapolate beyond examples already published in the literature. In parallel, the network-search algorithms must be able to assign meaningful scores to the sets of synthons they encounter, make judicious choices about which of the network's branches to expand, and decide when to withdraw from unpromising ones. They must be able to strategize over multiple steps to resolve intermittent reactivity conflicts, exchange functional groups, or overcome local maxima of molecular complexity. Meeting all these requirements makes the problem of computer-driven retrosynthesis very multifaceted, combining expert and AI approaches further supplemented by quantum-mechanical and molecular-mechanics calculations. Development of Chematica has been a very long and gradual process because all these components are needed. Any shortcuts (for example, reliance on only expert or only data-based approaches) yield chemically naïve and often erroneous syntheses, especially for complex targets. On the bright side, once all the requisite algorithms are implemented, as they now are, they not only streamline conventional synthetic planning but also enable completely new modalities that would challenge any human chemist, for example, synthesis with multiple constraints imposed simultaneously or library-wide syntheses in which the machine constructs "global plans" leading to multiple targets and benefiting from the use of common intermediates. These types of analyses will have a profound impact on the practice of the chemical industry, designing more economical, greener, and less hazardous pathways.

3.

Computer-generated "synthetic contingency" plans at times of logistics and supply problems: scenarios for hydroxychloroquine and remdesivir.

Szymkuc, Sara; Gajewska, Ewa P; Molga, Karol; Wolos, Agnieszka; Roszak, Rafal; Beker, Wiktor; Moskal, Martyna; Dittwald, Piotr; Grzybowski, Bartosz A.

Chem Sci ; 11(26): 6736-6744, 2020 Jul 14.

Article in English | MEDLINE | ID: mdl-33033595

ABSTRACT

A computer program for retrosynthetic planning helps develop multiple "synthetic contingency" plans for hydroxychloroquine and also routes leading to remdesivir, both promising but as yet unproven medications against COVID-19. These plans are designed to navigate, as much as possible, around known and patented routes and to commence from inexpensive and diverse starting materials, so as to ensure supply in case of anticipated market shortages of commonly used substrates. Looking beyond the current COVID-19 pandemic, development of similar contingency syntheses is advocated for other already-approved medications, in case such medications become urgently needed in mass quantities to face other public-health emergencies.

4.

Computational planning of the synthesis of complex natural products.

Mikulak-Klucznik, Barbara; Golebiowska, Patrycja; Bayly, Alison A; Popik, Oskar; Klucznik, Tomasz; Szymkuc, Sara; Gajewska, Ewa P; Dittwald, Piotr; Staszewska-Krajewska, Olga; Beker, Wiktor; Badowski, Tomasz; Scheidt, Karl A; Molga, Karol; Mlynarski, Jacek; Mrksich, Milan; Grzybowski, Bartosz A.

Nature ; 588(7836): 83-88, 2020 12.

Article in English | MEDLINE | ID: mdl-33049755

ABSTRACT

Training algorithms to computationally plan multistep organic syntheses has been a challenge for more than 50 years [1-7]. However, the field has progressed greatly since the development of early programs such as LHASA [1,7], for which reaction choices at each step were made by human operators. Multiple software platforms [6,8-14] are now capable of completely autonomous planning. But these programs 'think' only one step at a time and have so far been limited to relatively simple targets, the syntheses of which could arguably be designed by human chemists within minutes, without the help of a computer. Furthermore, no algorithm has yet been able to design plausible routes to complex natural products, for which much more far-sighted, multistep planning is necessary [15,16] and closely related literature precedents cannot be relied on. Here we demonstrate that such computational synthesis planning is possible, provided that the program's knowledge of organic chemistry and data-based artificial intelligence routines are augmented with causal relationships [17,18], allowing it to 'strategize' over multiple synthetic steps. Using a Turing-like test administered to synthesis experts, we show that the routes designed by such a program are largely indistinguishable from those designed by humans. We also successfully validated three computer-designed syntheses of natural products in the laboratory. Taken together, these results indicate that expert-level automated synthetic planning is feasible, pending continued improvements to the reaction knowledge base and further code optimization.

Subjects

Artificial Intelligence; Biological Products/chemical synthesis; Chemistry Techniques, Synthetic/methods; Chemistry, Organic/methods; Software; Artificial Intelligence/standards; Automation/methods; Automation/standards; Benzylisoquinolines/chemical synthesis; Benzylisoquinolines/chemistry; Chemistry Techniques, Synthetic/standards; Chemistry, Organic/standards; Indans/chemical synthesis; Indans/chemistry; Indole Alkaloids/chemical synthesis; Indole Alkaloids/chemistry; Knowledge Bases; Lactones/chemical synthesis; Lactones/chemistry; Macrolides/chemical synthesis; Macrolides/chemistry; Reproducibility of Results; Sesquiterpenes/chemical synthesis; Sesquiterpenes/chemistry; Software/standards; Tetrahydroisoquinolines/chemical synthesis; Tetrahydroisoquinolines/chemistry

5.

Synergy Between Expert and Machine-Learning Approaches Allows for Improved Retrosynthetic Planning.

Badowski, Tomasz; Gajewska, Ewa P; Molga, Karol; Grzybowski, Bartosz A.

Angew Chem Int Ed Engl ; 59(2): 725-730, 2020 01 07.

Article in English | MEDLINE | ID: mdl-31750610

ABSTRACT

When computers plan multistep syntheses, they can rely either on expert knowledge or on information machine-extracted from large reaction repositories. Both approaches suffer from imperfect functions evaluating reaction choices: expert functions are heuristics based on chemical intuition, whereas machine learning (ML) relies on neural networks (NNs) that can make meaningful predictions only about popular reaction types. This paper shows that expert and ML approaches can be synergistic: specifically, when NNs are trained on literature data matched onto high-quality, expert-coded reaction rules, they achieve higher synthetic accuracy than either of the methods alone and, importantly, can also handle rare/specialized reaction types.

6.

Rapid and Accurate Prediction of pK_a Values of C-H Acids Using Graph Convolutional Neural Networks.

Roszak, Rafal; Beker, Wiktor; Molga, Karol; Grzybowski, Bartosz A.

J Am Chem Soc ; 141(43): 17142-17149, 2019 10 30.

Article in English | MEDLINE | ID: mdl-31633925

ABSTRACT

The ability to estimate the acidity of C-H groups within organic molecules in non-aqueous solvents is important in synthetic planning to correctly predict which protons will be abstracted in reactions such as alkylations, Michael additions, or aldol condensations. This Article describes the use of the so-called graph convolutional neural networks (GCNNs) to perform such predictions on the time scales of milliseconds and with accuracy comparing favorably with state-of-the-art solutions, including commercial ones. The crux of the method is to train GCNNs using descriptors that reflect not only topological but also chemical properties of atomic environments. The model is validated against adversarial controls, supplemented by the discussion of realistic synthetic problems (on which it correctly predicts the most acidic protons in >90% of cases), and accompanied by a Web application intended to aid the community in everyday synthetic planning.

7.

Selection of cost-effective yet chemically diverse pathways from the networks of computer-generated retrosynthetic plans.

Badowski, Tomasz; Molga, Karol; Grzybowski, Bartosz A.

Chem Sci ; 10(17): 4640-4651, 2019 May 07.

Article in English | MEDLINE | ID: mdl-31123574

ABSTRACT

As the programs for computer-aided retrosynthetic design come of age, they are no longer identifying just one or a few synthetic routes but a multitude of chemically plausible syntheses, together forming large, directed graphs of solutions. An important problem then emerges: how to select from these graphs and present to the user manageable numbers of top-scoring pathways that are cost-effective, promote convergent vs. linear solutions, and are chemically diverse so that they do not repeat only minor variations on the same chemical theme. This paper describes a family of reaction network algorithms that address this problem by (i) using recursive formulae to assign realistic prices to individual pathways and (ii) applying penalties to chemically similar strategies so that they do not dominate the top-scoring routes. Synthetic examples are provided to illustrate how these algorithms can be implemented, on timescales of ~1 s even for large graphs, to rapidly query the space of synthetic solutions under the scenarios of different reaction yields and/or costs associated with performing reaction operations on different scales.
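
The two ingredients named in this abstract, a recursive price formula and a similarity penalty, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy molecules and prices, the cost rule (operation cost plus substrate costs, divided by yield), the Jaccard similarity over reaction sets, and the names cheapest and select_diverse.

```python
# Hypothetical toy network: purchasable prices and reactions per product.
PRICES = {"A": 1.0, "B": 2.0, "C": 5.0}
# product -> list of (reaction id, substrates, yield, operation cost)
REACTIONS = {
    "T": [("r1", ("X", "C"), 0.8, 1.0), ("r2", ("Y",), 0.5, 1.0)],
    "X": [("r3", ("A", "B"), 0.9, 1.0)],
    "Y": [("r4", ("A",), 0.9, 1.0)],
}

def cheapest(mol):
    """Return (cost, reaction ids) of the cheapest route to mol.

    Assumed recursion: cost = (operation cost + sum of substrate costs) / yield.
    """
    if mol in PRICES:
        return PRICES[mol], frozenset()
    best = None
    for rxn_id, substrates, yld, op_cost in REACTIONS[mol]:
        parts = [cheapest(s) for s in substrates]
        cost = (op_cost + sum(c for c, _ in parts)) / yld
        used = frozenset({rxn_id}).union(*(u for _, u in parts))
        if best is None or cost < best[0]:
            best = (cost, used)
    return best

def jaccard(a, b):
    """Similarity of two routes, measured on their sets of reaction ids."""
    return len(a & b) / len(a | b) if (a | b) else 1.0

def select_diverse(routes, k, penalty=1.0):
    """Greedily pick k routes, inflating the cost of routes that resemble
    those already chosen (an assumed multiplicative penalty)."""
    chosen, pool = [], list(routes)
    while pool and len(chosen) < k:
        def adjusted(route):
            cost, used = route
            sim = max((jaccard(used, c[1]) for c in chosen), default=0.0)
            return cost * (1.0 + penalty * sim)
        best = min(pool, key=adjusted)
        chosen.append(best)
        pool.remove(best)
    return chosen

if __name__ == "__main__":
    print(cheapest("T"))
    candidates = [
        (10.0, frozenset({"a", "b"})),
        (11.0, frozenset({"a", "c"})),
        (12.0, frozenset({"d", "e"})),
    ]
    print(select_diverse(candidates, k=2))
```

In the toy candidate list, the second-cheapest route shares a reaction with the cheapest one, so the penalty pushes the more expensive but unrelated route ahead of it. This is the kind of behavior the abstract describes, though the paper's actual formulae may differ.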

8.

Computational design of syntheses leading to compound libraries or isotopically labelled targets.

Molga, Karol; Dittwald, Piotr; Grzybowski, Bartosz A.

Chem Sci ; 10(40): 9219-9232, 2019 Oct 28.

Article in English | MEDLINE | ID: mdl-32055308

ABSTRACT

Although computer programs for retrosynthetic planning have shown improved and in some cases quite satisfactory performance in designing routes leading to specific, individual targets, no algorithms capable of planning syntheses of entire target libraries - important in modern drug discovery - have yet been reported. This study describes how network-search routines underlying existing retrosynthetic programs can be adapted and extended to multi-target design operating on one common search graph, benefitting from the use of common intermediates and reducing the overall synthetic cost. Implementation in the Chematica platform illustrates the usefulness of such algorithms in the syntheses of either (i) all members of a user-defined library, or (ii) the most synthetically accessible members of this library. In the latter case, algorithms are also readily adapted to the identification of the most facile syntheses of isotopically labelled targets. These examples are industrially relevant in the context of hit-to-lead optimization and syntheses of isotopomers of various bioactive molecules.
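
The benefit of planning a whole library on one common graph can be shown with a small Python sketch. It is a brute-force toy, not the network-search routines of the paper: the reaction costs, candidate routes, and the function name plan_library are hypothetical, and it assumes each shared reaction is paid for only once regardless of scale.

```python
from itertools import product

# Hypothetical cost of running each reaction once.
REACTION_COST = {"r1": 3.0, "r2": 2.0, "r3": 2.5, "r4": 4.0}

# Hypothetical candidate routes per target, each a set of reaction ids.
ROUTES = {
    "T1": [frozenset({"r1", "r2"}), frozenset({"r4"})],
    "T2": [frozenset({"r1", "r3"}), frozenset({"r4", "r3"})],
}

def cost(reactions):
    """Total cost of a set of reactions, each counted once."""
    return sum(REACTION_COST[r] for r in reactions)

def plan_library(routes):
    """Pick one route per target so that the union of reactions is cheapest.

    Exhaustive search over all combinations; only sensible for toy inputs.
    """
    targets = sorted(routes)
    best_plan, best_cost = None, float("inf")
    for combo in product(*(routes[t] for t in targets)):
        total = cost(frozenset().union(*combo))
        if total < best_cost:
            best_plan, best_cost = dict(zip(targets, combo)), total
    return best_plan, best_cost

if __name__ == "__main__":
    plan, total = plan_library(ROUTES)
    print(plan, total)
```

Planned alone, T2 would go through r1 and r3 (cost 5.5). Planned jointly, it switches to r4 and r3 so that r4 is shared with T1, and the library costs 6.5 instead of the 9.5 obtained by combining the individually cheapest routes.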

9.

Computer-Assisted Synthetic Planning: The End of the Beginning.

Szymkuc, Sara; Gajewska, Ewa P; Klucznik, Tomasz; Molga, Karol; Dittwald, Piotr; Startek, Michal; Bajczyk, Michal; Grzybowski, Bartosz A.

Angew Chem Int Ed Engl ; 55(20): 5904-37, 2016 05 10.

Article in English | MEDLINE | ID: mdl-27062365

ABSTRACT

Exactly half a century has passed since the launch of the first documented research project (1965 Dendral) on computer-assisted organic synthesis. Many more programs were created in the 1970s and 1980s but the enthusiasm of these pioneering days had largely dissipated by the 2000s, and the challenge of teaching the computer how to plan organic syntheses earned itself the reputation of a "mission impossible". This is quite curious given that, in the meantime, computers have "learned" many other skills that had been considered exclusive domains of human intellect and creativity; for example, machines can nowadays play chess better than human world champions and they can compose classical music pleasant to the human ear. Although there have been no similar feats in organic synthesis, this Review argues that to concede defeat would be premature. Indeed, by bringing together modern computational power and algorithms from graph/network theory, chemical rules (with full stereo- and regiochemistry) coded in appropriate formats, and the elements of quantum mechanics, the machine can finally be "taught" how to plan syntheses of non-trivial organic molecules in a matter of seconds to minutes. The Review begins with an overview of some basic theoretical concepts essential for the big-data analysis of chemical syntheses. It progresses to the problem of optimizing pathways involving known reactions. It culminates with a discussion of algorithms that allow for a completely de novo and fully automated design of syntheses leading to relatively complex targets, including those that have not been made before. Of course, there are still things to be improved, but computers are finally becoming relevant and helpful to the practice of organic-synthetic planning. Paraphrasing Churchill's famous words after the Allies' first major victory over the Axis forces in Africa, it is not the end, it is not even the beginning of the end, but it is the end of the beginning for computer-assisted synthesis planning. The machine is here to stay.

10.

A Priori Estimation of Organic Reaction Yields.

Emami, Fateme S; Vahid, Amir; Wylie, Elizabeth K; Szymkuc, Sara; Dittwald, Piotr; Molga, Karol; Grzybowski, Bartosz A.

Angew Chem Int Ed Engl ; 54(37): 10797-801, 2015 Sep 07.

Article in English | MEDLINE | ID: mdl-26215084

ABSTRACT

A thermodynamically guided calculation of free energies of substrate and product molecules allows for the estimation of the yields of organic reactions. The non-ideality of the system and the solvent effects are taken into account through the activity coefficients calculated at the molecular level by perturbed-chain statistical associating fluid theory (PC-SAFT). The model is iteratively trained using a diverse set of reactions with yields that have been reported previously. This trained model can then estimate a priori the yields of reactions not included in the training set with an accuracy of ca. ±15%. This ability has the potential to translate into significant economic savings through the selection and then execution of only those reactions that can proceed in good yields.
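
The link between a reaction free energy and a yield can be sketched in Python for the simplest possible case. This is a textbook idealization and not the trained model of the paper: it assumes an ideal, unimolecular equilibrium A ⇌ B with all activity coefficients equal to 1 (so no PC-SAFT correction), and the function name equilibrium_yield is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)

def equilibrium_yield(delta_g, temperature=298.15):
    """Equilibrium fraction of B for an ideal A <=> B reaction.

    delta_g is the reaction free energy in J/mol. With K = exp(-delta_g / RT),
    the yield is K / (1 + K), written here as 1 / (1 + exp(delta_g / RT)).
    """
    return 1.0 / (1.0 + math.exp(delta_g / (R * temperature)))

if __name__ == "__main__":
    for dg in (-20000.0, -10000.0, 0.0, 10000.0):
        print(f"dG = {dg:>9.0f} J/mol -> yield = {equilibrium_yield(dg):.3f}")
```

In a non-ideal treatment, the concentrations in the equilibrium expression would be weighted by activity coefficients, which is where the abstract says PC-SAFT enters.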
