ABSTRACT
Ligand-based drug design has recently benefited from the development of deep generative models. These models enable extensive exploration of chemical space and provide a platform for molecular optimization. However, the vast majority of current methods do not leverage the structure of the binding target, which potentiates the binding of small molecules and plays a key role in the interaction. We propose an optimization pipeline that combines complementary structure-based and ligand-based methods. Instead of performing docking on a fixed chemical library, we iteratively select promising compounds from the full chemical space using a ligand-centered generative model. Molecular docking is then used as an oracle to guide compound optimization. This allows iterative generation of compounds that fit the target structure progressively better, without prior knowledge of bioactive compounds. For this purpose, we introduce a new graph-to-SELFIES Variational Autoencoder (VAE) that decodes 18 times faster than the graph-to-graph state of the art while achieving similar performance. We then successfully optimize the generation of molecules toward high docking scores, enabling a 10-fold enrichment of high-scoring compounds at a fixed computational cost.
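The generate, score, and refine loop described above can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration: a one-dimensional Gaussian stands in for the generative model, the hypothetical function `toy_docking_score` stands in for the docking oracle, and the update rule (refitting the model on the top-scoring samples) is a generic elite-selection scheme, not necessarily the procedure used in this work.

```python
import random
import statistics


def toy_docking_score(x):
    """Hypothetical stand-in for a docking oracle (higher is better, peak at x = 3)."""
    return -(x - 3.0) ** 2


def optimize(n_rounds=20, n_samples=200, top_fraction=0.1, seed=0):
    """Toy oracle-guided loop: sample, score, keep the best, refit the sampler."""
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # toy "generative model": a 1-D Gaussian (assumption)
    history = []
    for _ in range(n_rounds):
        # 1. Generate candidates from the current model.
        candidates = [rng.gauss(mean, std) for _ in range(n_samples)]
        # 2. Score every candidate with the oracle and rank them.
        ranked = sorted(candidates, key=toy_docking_score, reverse=True)
        # 3. Keep only the best-scoring fraction.
        elite = ranked[: max(2, int(top_fraction * n_samples))]
        # 4. Refit the model on the elite; floor the spread to keep exploring.
        mean = statistics.fmean(elite)
        std = max(statistics.stdev(elite), 0.05)
        # Track the average score of this round's candidates.
        history.append(statistics.fmean([toy_docking_score(c) for c in candidates]))
    return mean, history
```

Calling `optimize()` returns the final sampler mean and the per-round average score. In this toy setting the sampler drifts toward the oracle's optimum, so later rounds spend the same number of oracle calls on better-scoring candidates, which is the intuition behind enrichment at a fixed computational cost.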