Extension of multi-site analogue series with potent compounds using a bidirectional transformer-based chemical language model.
RSC Med Chem; 15(7): 2527-2537, 2024 Jul 17.
Article in En | MEDLINE | ID: mdl-39026633
ABSTRACT
Generating potent compounds for evolving analogue series (AS) is a key challenge in medicinal chemistry. The versatility of chemical language models (CLMs) makes it possible to formulate this challenge as an off-the-beaten-path prediction task. In this work, we have devised a coding and tokenization scheme for evolving AS with multiple substitution sites (multi-site AS) and implemented a bidirectional transformer to predict new potent analogues for such series. Scientific foundations of this approach are discussed and, as a benchmark, the transformer model is compared to a recurrent neural network (RNN) for the prediction of analogues of AS with single substitution sites. Furthermore, the transformer is shown to successfully predict potent analogues with varying R-group combinations for multi-site AS having activity against many different targets. Prediction of R-group combinations for extending AS with potent compounds represents a novel approach for compound optimization.
Full text: 1
Collections: 01-international
Database: MEDLINE
Language: En
Journal: RSC Med Chem
Year of publication: 2024
Document type: Article
Country of publication: United Kingdom