Results 1 - 17 of 17
1.
Nature ; 625(7995): 508-515, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37967579

ABSTRACT

Recent years have seen revived interest in computer-assisted organic synthesis [1,2]. The use of reaction- and neural-network algorithms that can plan multistep synthetic pathways has revolutionized this field [1,3-7], including examples leading to advanced natural products [6,7]. Such methods typically operate on full, literature-derived 'substrate(s)-to-product' reaction rules and cannot be easily extended to the analysis of reaction mechanisms. Here we show that computers equipped with a comprehensive knowledge-base of mechanistic steps augmented by physical-organic chemistry rules, as well as quantum mechanical and kinetic calculations, can use a reaction-network approach to analyse the mechanisms of some of the most complex organic transformations: namely, cationic rearrangements. Such rearrangements are a cornerstone of organic chemistry textbooks and entail notable changes in the molecule's carbon skeleton [8-12]. The algorithm we describe and deploy at https://HopCat.allchemy.net/ generates, within minutes, networks of possible mechanistic steps, traces plausible step sequences and calculates expected product distributions. We validate this algorithm by three sets of experiments whose analysis would probably prove challenging even to highly trained chemists: (1) predicting the outcomes of tail-to-head terpene (THT) cyclizations in which substantially different outcomes are encoded in modular precursors differing in minute structural details; (2) comparing the outcome of THT cyclizations in solution or in a supramolecular capsule; and (3) analysing complex reaction mixtures. Our results support a vision in which computers no longer just manipulate known reaction types [1-7] but will help rationalize and discover new, mechanistically complex transformations.


Subject(s)
Algorithms , Chemistry Techniques, Synthetic , Cyclization , Neural Networks, Computer , Terpenes , Cations/chemistry , Knowledge Bases , Terpenes/chemistry , Chemistry Techniques, Synthetic/methods , Biological Products/chemical synthesis , Biological Products/chemistry , Reproducibility of Results , Solutions
2.
Nature ; 588(7836): 83-88, 2020 12.
Article in English | MEDLINE | ID: mdl-33049755

ABSTRACT

Training algorithms to computationally plan multistep organic syntheses has been a challenge for more than 50 years [1-7]. However, the field has progressed greatly since the development of early programs such as LHASA [1,7], for which reaction choices at each step were made by human operators. Multiple software platforms [6,8-14] are now capable of completely autonomous planning. But these programs 'think' only one step at a time and have so far been limited to relatively simple targets, the syntheses of which could arguably be designed by human chemists within minutes, without the help of a computer. Furthermore, no algorithm has yet been able to design plausible routes to complex natural products, for which much more far-sighted, multistep planning is necessary [15,16] and closely related literature precedents cannot be relied on. Here we demonstrate that such computational synthesis planning is possible, provided that the program's knowledge of organic chemistry and data-based artificial intelligence routines are augmented with causal relationships [17,18], allowing it to 'strategize' over multiple synthetic steps. Using a Turing-like test administered to synthesis experts, we show that the routes designed by such a program are largely indistinguishable from those designed by humans. We also successfully validated three computer-designed syntheses of natural products in the laboratory. Taken together, these results indicate that expert-level automated synthetic planning is feasible, pending continued improvements to the reaction knowledge base and further code optimization.


Subject(s)
Artificial Intelligence , Biological Products/chemical synthesis , Chemistry Techniques, Synthetic/methods , Chemistry, Organic/methods , Software , Artificial Intelligence/standards , Automation/methods , Automation/standards , Benzylisoquinolines/chemical synthesis , Benzylisoquinolines/chemistry , Chemistry Techniques, Synthetic/standards , Chemistry, Organic/standards , Indans/chemical synthesis , Indans/chemistry , Indole Alkaloids/chemical synthesis , Indole Alkaloids/chemistry , Knowledge Bases , Lactones/chemical synthesis , Lactones/chemistry , Macrolides/chemical synthesis , Macrolides/chemistry , Reproducibility of Results , Sesquiterpenes/chemical synthesis , Sesquiterpenes/chemistry , Software/standards , Tetrahydroisoquinolines/chemical synthesis , Tetrahydroisoquinolines/chemistry
3.
Nat Mater ; 23(1): 108-115, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37919351

ABSTRACT

Multi-metal oxides in general and perovskite oxides in particular have attracted considerable attention as oxygen evolution electrocatalysts. Although numerous theoretical studies have been undertaken, the most promising perovskite-based catalysts continue to emerge from human-driven experimental campaigns rather than data-driven machine learning protocols, which are often limited by the scarcity of experimental data on which to train the models. This work promises to break this impasse by demonstrating that active learning on even small datasets (but supplemented by informative structural-characterization data and coupled with closed-loop experimentation) can yield materials of outstanding performance. The model we develop not only reproduces several non-obvious and actively studied experimental trends but also identifies a composition of a perovskite oxide electrocatalyst exhibiting an intrinsic overpotential at 10 mA cm⁻²(oxide) of 391 mV, which is among the lowest known of four-metal perovskite oxides.

4.
J Am Chem Soc ; 144(11): 4819-4827, 2022 03 23.
Article in English | MEDLINE | ID: mdl-35258973

ABSTRACT

Applications of machine learning (ML) to synthetic chemistry rely on the assumption that large numbers of literature-reported examples should enable construction of accurate and predictive models of chemical reactivity. This paper demonstrates that abundance of carefully curated literature data may be insufficient for this purpose. Using an example of Suzuki-Miyaura coupling with heterocyclic building blocks, and a carefully selected database of >10,000 literature examples, we show that ML models cannot offer any meaningful predictions of optimum reaction conditions, even if the search space is restricted to only solvents and bases. This result holds irrespective of the ML model applied (from simple feed-forward to state-of-the-art graph-convolution neural networks) or the representation to describe the reaction partners (various fingerprints, chemical descriptors, latent representations, etc.). In all cases, the ML methods fail to perform significantly better than naive assignments based on the sheer frequency of certain reaction conditions reported in the literature. These unsatisfactory results likely reflect subjective preferences of various chemists to use certain protocols, other biasing factors as mundane as availability of certain solvents/reagents, and/or a lack of negative data. These findings highlight the likely importance of systematically generating reliable and standardized data sets for algorithm training.


Subject(s)
Machine Learning , Neural Networks, Computer , Algorithms , Solvents
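
The "naive assignments based on the sheer frequency" mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy (solvent, base) data are hypothetical, and the only point is to show what a frequency-only baseline looks like, so that an ML model's accuracy can be compared against it.

```python
from collections import Counter


def fit_frequency_baseline(conditions):
    """Rank reaction conditions by how often they occur in the training set."""
    counts = Counter(conditions)
    return [cond for cond, _ in counts.most_common()]


def top_k_accuracy(ranked, true_conditions, k=1):
    """Fraction of test reactions whose reported condition is among the k most frequent."""
    top = set(ranked[:k])
    hits = sum(1 for cond in true_conditions if cond in top)
    return hits / len(true_conditions)


# Hypothetical toy data (solvent, base); not taken from the paper.
train = (
    [("dioxane", "K2CO3")] * 5
    + [("toluene", "K3PO4")] * 3
    + [("DMF", "Cs2CO3")] * 2
)
test = [
    ("dioxane", "K2CO3"),
    ("toluene", "K3PO4"),
    ("dioxane", "K2CO3"),
    ("DMF", "Cs2CO3"),
]

ranked = fit_frequency_baseline(train)
baseline_top1 = top_k_accuracy(ranked, test, k=1)
baseline_top2 = top_k_accuracy(ranked, test, k=2)
print(ranked[0], baseline_top1, baseline_top2)
```

The baseline ignores the substrates entirely. A model that cannot beat these numbers on held-out data has, in effect, learned only the popularity of conditions.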
5.
J Am Chem Soc ; 143(4): 1807-1815, 2021 02 03.
Article in English | MEDLINE | ID: mdl-33471520

ABSTRACT

When an organometallic catalyst is tethered onto a nanoparticle and is embedded in a monolayer of longer ligands terminated in "gating" end-groups, these groups can control the access and orientation of the incoming substrates. In this way, a nonspecific catalyst can become enzyme-like: it can select only certain substrates from substrate mixtures and, quite remarkably, can also preorganize these substrates such that only some of their otherwise equivalent sites react. For a simple, copper-based click reaction catalyst and for gating ligands terminated in charged groups, both substrate- and site-selectivities are on the order of 100, which is all the more notable given the relative simplicity of the on-particle monolayers compared to the intricacy of enzymes' active sites. The strategy of self-assembling macromolecular, on-nanoparticle environments to enhance selectivities of "ordinary" catalysts presented here is extendable to other types of catalysts and gating based on electrostatics, hydrophobicity, and chirality, or the combinations of these effects. Rational design of such systems should be guided by theoretical models we also describe.

6.
Angew Chem Int Ed Engl ; 60(28): 15230-15235, 2021 07 05.
Article in English | MEDLINE | ID: mdl-33876554

ABSTRACT

This work describes a method to vectorize and machine-learn (ML) the non-covalent interactions responsible for scaffold-directed reactions important in synthetic chemistry. Models trained on this representation predict the correct face of approach in ca. 90 % of Michael additions or Diels-Alder cycloadditions. These accuracies are significantly higher than those based on traditional ML descriptors, energetic calculations, or the intuition of experienced synthetic chemists. Our results also emphasize the importance of ML models being provided with relevant mechanistic knowledge; without such knowledge, these models cannot easily "transfer-learn" and extrapolate to previously unseen reaction mechanisms.

7.
J Am Chem Soc ; 141(43): 17142-17149, 2019 10 30.
Article in English | MEDLINE | ID: mdl-31633925

ABSTRACT

The ability to estimate the acidity of C-H groups within organic molecules in non-aqueous solvents is important in synthetic planning to correctly predict which protons will be abstracted in reactions such as alkylations, Michael additions, or aldol condensations. This Article describes the use of the so-called graph convolutional neural networks (GCNNs) to perform such predictions on the time scales of milliseconds and with accuracy comparing favorably with state-of-the-art solutions, including commercial ones. The crux of the method is to train GCNNs using descriptors that reflect not only topological but also chemical properties of atomic environments. The model is validated against adversarial controls, supplemented by the discussion of realistic synthetic problems (on which it correctly predicts the most acidic protons in >90% of cases), and accompanied by a Web application intended to aid the community in everyday synthetic planning.

8.
Angew Chem Int Ed Engl ; 58(14): 4515-4519, 2019 Mar 26.
Article in English | MEDLINE | ID: mdl-30398688

ABSTRACT

Machine learning can predict the major regio-, site-, and diastereoselective outcomes of Diels-Alder reactions better than standard quantum-mechanical methods and with accuracies exceeding 90 % provided that i) the diene/dienophile substrates are represented by "physical-organic" descriptors reflecting the electronic and steric characteristics of their substituents and ii) the positions of such substituents relative to the reaction core are encoded ("vectorized") in an informative way.

9.
J Comput Chem ; 34(21): 1797-9, 2013 Aug 05.
Article in English | MEDLINE | ID: mdl-23696072

ABSTRACT

The relative stability of biologically relevant, hydrogen bonded complexes with shortened distances can be assessed at low cost by the electrostatic multipole term alone more successfully than by ab initio methods. These results imply that atomic multipole moments may help improve ligand-receptor ranking predictions, particularly in cases where accurate structural data are not available.


Subject(s)
Coordination Complexes/chemistry , Quantum Theory , Catalytic Domain , Dimerization , Drug Stability , Hydrogen Bonding , Ligands , Models, Molecular , Receptors, Cell Surface/chemistry
10.
J Phys Chem A ; 117(7): 1596-600, 2013 Feb 21.
Article in English | MEDLINE | ID: mdl-23327161

ABSTRACT

The concept of polarization-justified Fukui functions has been tested for a set of model molecules: imidazole, oxazole, and thiazole. Calculations of the Fukui functions have been based on the molecular polarizability analysis, which makes them a potentially more sensitive analytical tool than the classical density functional theory proposals, typically built on electron density only. The three selected molecules show distinct differences in their reactivity patterns, despite very similar geometry and electronic structure. The maps of the polarization-justified Fukui functions on the molecular plane correctly identify important features of the molecules: the site for the preferential electrophilic attack in imidazole (-NH, see the TOC image) and oxazole (5-C), as well as the uniquely aromatic character of the thiazole molecule and the acidic forms XH(+) of all three species.

11.
Science ; 378(6618): 399-405, 2022 10 28.
Article in English | MEDLINE | ID: mdl-36302014

ABSTRACT

General conditions for organic reactions are important but rare, and efforts to identify them usually consider only narrow regions of chemical space. Discovering more general reaction conditions requires considering vast regions of chemical space derived from a large matrix of substrates crossed with a high-dimensional matrix of reaction conditions, rendering exhaustive experimentation impractical. Here, we report a simple closed-loop workflow that leverages data-guided matrix down-selection, uncertainty-minimizing machine learning, and robotic experimentation to discover general reaction conditions. Application to the challenging and consequential problem of heteroaryl Suzuki-Miyaura cross-coupling identified conditions that double the average yield relative to a widely used benchmark that was previously developed using traditional approaches. This study provides a practical road map for solving multidimensional chemical optimization problems with large search spaces.

12.
J Chem Theory Comput ; 16(5): 3420-3429, 2020 May 12.
Article in English | MEDLINE | ID: mdl-32282205

ABSTRACT

Currently developed protocols of theozyme design still lead to biocatalysts with much lower catalytic activity than enzymes existing in nature, and, so far, the only avenue of improvement has been in vitro laboratory-directed evolution (LDE) experiments. In this paper, we propose a different strategy based on a "reversed" methodology of mutation prediction. Instead of the common "top-down" approach, requiring numerous assumptions and vast computational effort, we argue for a "bottom-up" approach that is based on the catalytic fields derived directly from transition state and reactant complex wave functions. This enables direct one-step determination of the general quantitative angular characteristics of the optimal catalytic site and simultaneously encompasses both the transition-state stabilization (TSS) and ground-state destabilization (GSD) effects. We further extend the static catalytic field approach by introducing a library of atomic multipoles for amino acid side-chain rotamers, which, together with the catalytic field, allow one to determine the optimal side-chain orientations of charged amino acids constituting the elusive structure of a preorganized catalytic environment. The qualitative agreement obtained with experimental LDE data for Kemp eliminase KE07 mutants validates the proposed procedure, yielding, in addition, a detailed insight into possible dynamic and epistatic effects.


Subject(s)
Lyases/metabolism , Amino Acids/chemistry , Amino Acids/metabolism , Biocatalysis , Catalytic Domain , Lyases/genetics , Oxidation-Reduction , Thermodynamics
13.
Chem Sci ; 11(26): 6736-6744, 2020 Jul 14.
Article in English | MEDLINE | ID: mdl-33033595

ABSTRACT

A computer program for retrosynthetic planning helps develop multiple "synthetic contingency" plans for hydroxychloroquine and also routes leading to remdesivir, both promising but yet unproven medications against COVID-19. These plans are designed to navigate, as much as possible, around known and patented routes and to commence from inexpensive and diverse starting materials, so as to ensure supply in case of anticipated market shortages of commonly used substrates. Looking beyond the current COVID-19 pandemic, development of similar contingency syntheses is advocated for other already-approved medications, in case such medications become urgently needed in mass quantities to face other public-health emergencies.

14.
Science ; 369(6511), 2020 09 25.
Article in English | MEDLINE | ID: mdl-32973002

ABSTRACT

The challenge of prebiotic chemistry is to trace the syntheses of life's key building blocks from a handful of primordial substrates. Here we report a forward-synthesis algorithm that generates a full network of prebiotic chemical reactions accessible from these substrates under generally accepted conditions. This network contains both reported and previously unidentified routes to biotic targets, as well as plausible syntheses of abiotic molecules. It also exhibits three forms of nontrivial chemical emergence, as the molecules within the network can act as catalysts of downstream reaction types; form functional chemical systems, including self-regenerating cycles; and produce surfactants relevant to primitive forms of biological compartmentalization. To support these claims, computer-predicted, prebiotic syntheses of several biotic molecules as well as a multistep, self-regenerative cycle of iminodiacetic acid were validated by experiment.


Subject(s)
Organic Chemicals/chemical synthesis , Origin of Life , Computer Simulation
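
The idea of a forward-synthesis algorithm that grows a reaction network generation by generation from a few substrates can be sketched as follows. This is a toy illustration under loud assumptions, not the published algorithm: molecules are plain strings, the single `couple` rule is hypothetical, and a real system would apply curated reaction rules to molecular graphs with condition filters.

```python
def expand_network(substrates, rules, generations):
    """Apply every rule to every pair of known molecules, one generation at a time."""
    known = set(substrates)
    frontier = set(substrates)  # molecules first seen in the previous generation
    edges = []                  # (reactant_a, reactant_b, rule_name, product)
    for _ in range(generations):
        new = set()
        for rule_name, rule in rules.items():
            for a in sorted(known):
                for b in sorted(known):
                    # Require at least one newly made reactant so old edges are not re-derived.
                    if a not in frontier and b not in frontier:
                        continue
                    product = rule(a, b)
                    if product is None:
                        continue
                    edges.append((a, b, rule_name, product))
                    if product not in known:
                        new.add(product)
        if not new:
            break  # network has stopped growing
        known |= new
        frontier = new
    return known, edges


def couple(a, b):
    """Hypothetical toy rule: join two fragments if the result has at most 3 units."""
    if len(a) + len(b) > 3:
        return None
    return "".join(sorted(a + b))


molecules, reactions = expand_network({"A", "B"}, {"couple": couple}, generations=3)
print(sorted(molecules))
```

Tracking the frontier keeps each generation's work proportional to what is new. The recorded edges form the network in which routes to targets, or cycles, can then be searched.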
15.
J Mol Model ; 24(1): 28, 2017 Dec 22.
Article in English | MEDLINE | ID: mdl-29274012

ABSTRACT

Catalytic fields illustrate the topology of the optimal charge distribution of a molecular environment that reduces the activation energy for any process involving barrier crossing, such as a chemical reaction or bond rotation. Until now, this technique has been successfully applied to predict catalytic effects resulting from intermolecular interactions with individual water molecules constituting the first hydration shell, amino acid mutations in enzymes, or Si→Al substitutions in zeolites. In this contribution, hydrogen to fluorine (H→F) substitution effects for two model reactions have been examined, indicating qualitative applicability of the catalytic field concept in the case of systems involving intramolecular interactions. Graphical abstract: hydrogen to fluorine (H→F) substitution effects on activation energy (kcal/mol).

16.
J Chem Theory Comput ; 13(2): 945-955, 2017 Feb 14.
Article in English | MEDLINE | ID: mdl-28103023

ABSTRACT

We propose a simple atomic multipole electrostatic model to rapidly evaluate the effects of mutation on enzyme activity and test its performance on wild-type and mutant ketosteroid isomerase. The predictions of our atomic multipole model are similar to those obtained with symmetry-adapted perturbation theory at a fraction of the computational cost. We further show that this approach is relatively insensitive to the precise amino acid side chain conformation in mutants and may thus be useful in computational enzyme (re)design.


Subject(s)
Biocatalysis , Molecular Dynamics Simulation , Mutant Proteins/chemistry , Mutant Proteins/metabolism , Mutation , Steroid Isomerases/chemistry , Steroid Isomerases/metabolism , Androstenedione/chemistry , Androstenedione/metabolism , Catalytic Domain , Isomerism , Mutant Proteins/genetics , Static Electricity , Steroid Isomerases/genetics , Thermodynamics
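
To make the notion of an atomic multipole electrostatic model concrete, here is a minimal sketch of the two lowest-order terms, charge-charge and charge-dipole, in atomic units. It is an assumption-laden illustration rather than the model from the paper: the function names are hypothetical, the expansion is truncated at dipoles, and a practical model would include higher multipoles and sum over all atom pairs.

```python
import math


def charge_charge(q1, r1, q2, r2):
    """Coulomb energy of two point charges (atomic units)."""
    return q1 * q2 / math.dist(r1, r2)


def charge_dipole(q, r_q, mu, r_mu):
    """Energy of a point dipole mu at r_mu in the field of a charge q at r_q.

    E = -mu . F, where F = q * R / |R|^3 and R points from the charge to the dipole.
    """
    R = [b - a for a, b in zip(r_q, r_mu)]
    dist = math.sqrt(sum(x * x for x in R))
    return -q * sum(m * x for m, x in zip(mu, R)) / dist**3


# Opposite unit charges 2 bohr apart attract.
e_qq = charge_charge(1.0, (0.0, 0.0, 0.0), -1.0, (0.0, 0.0, 2.0))

# A dipole aligned with the field of a positive charge is stabilized.
e_qmu = charge_dipole(1.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0))
print(e_qq, e_qmu)
```

Each term is a closed-form expression in the interatomic vector, which is why such models cost a small fraction of a perturbation-theory calculation.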
17.
J Phys Chem B ; 118(51): 14727-36, 2014 Dec 26.
Article in English | MEDLINE | ID: mdl-25420234

ABSTRACT

Fatty acid amide hydrolase (FAAH) is an enzyme responsible for the deactivating hydrolysis of fatty acid ethanolamide neuromodulators. FAAH inhibitors have gained considerable interest due to their possible application in the treatment of anxiety, inflammation, and pain. In the context of inhibitor design, developing reliable computational tools for predicting binding affinity remains challenging, and it is now well understood that empirical scoring functions have several limitations that in principle could be overcome by quantum mechanics. Herein, systematic ab initio analyses of FAAH interactions with a series of inhibitors belonging to the class of the N-alkylcarbamic acid aryl esters have been performed. In contrast to our earlier studies of other classes of enzyme-inhibitor complexes, reasonable correlation with experimental results required us to consider correlation effects along with the electrostatic term. Therefore, the simplest comprehensive nonempirical model allowing for qualitative predictions of binding affinities for FAAH ligands consists of electrostatic multipole and second-order dispersion terms. Such a model has been validated against the relative stabilities of the benchmark S66 set of biomolecular complexes. As it does not involve parameters fitted to experimentally derived data, this model offers a unique opportunity for generally applicable inhibitor design and virtual screening.


Subject(s)
Amidohydrolases/chemistry , Enzyme Inhibitors/chemistry , Models, Chemical , Ligands