Do Chemformers Dream of Organic Matter? Evaluating a Transformer Model for Multistep Retrosynthesis.
Westerlund, Annie M; Manohar Koki, Siva; Kancharla, Supriya; Tibo, Alessandro; Saigiridharan, Lakshidaa; Kabeshov, Mikhail; Mercado, Rocío; Genheden, Samuel.
Affiliation
  • Westerlund AM; Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden.
  • Manohar Koki S; Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden.
  • Kancharla S; Department of Computer Science and Engineering, Chalmers University of Technology, 412 96 Göteborg, Sweden.
  • Tibo A; Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden.
  • Saigiridharan L; Department of Computer Science and Engineering, Chalmers University of Technology, 412 96 Göteborg, Sweden.
  • Kabeshov M; Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden.
  • Mercado R; Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden.
  • Genheden S; Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden.
J Chem Inf Model; 64(8): 3021-3033, 2024-04-22.
Article in En | MEDLINE | ID: mdl-38602390
ABSTRACT
Synthesis planning of new pharmaceutical compounds is a well-known bottleneck in modern drug design. Template-free methods, such as transformers, have recently been proposed as an alternative to template-based methods for single-step retrosynthetic predictions. Here, we trained and evaluated a transformer model, called the Chemformer, for retrosynthesis predictions within drug discovery. The proprietary data set used for training comprised ∼18 M reactions from literature, patents, and electronic lab notebooks. Chemformer was evaluated for both single-step and multistep retrosynthesis. We found that the single-step performance of Chemformer was especially good on reaction classes common in drug discovery, with most reaction classes showing a top-10 round-trip accuracy above 0.97. Moreover, Chemformer reached a higher round-trip accuracy than a template-based model. By analyzing multistep retrosynthesis experiments on a proprietary compound data set, we observed that Chemformer found synthetic routes leading to commercial starting materials for 95% of the target compounds, an increase of more than 20% compared to the template-based model. In addition, we discovered that Chemformer suggested novel disconnections corresponding to reaction templates that are not included in the template-based model. These findings were further supported by a publicly available ChEMBL compound data set. The conclusions drawn from this work allow for the design of a synthesis planning tool where template-based and template-free models work in harmony to optimize retrosynthetic recommendations.

Full text: 1 Databases: MEDLINE Main subject: Drug Discovery Language: En Journal: J Chem Inf Model Journal subject: MEDICAL INFORMATICS / CHEMISTRY Year of publication: 2024 Document type: Article Country of affiliation: Sweden
