Improving VAE based molecular representations for compound property prediction.
Tevosyan, Ani; Khondkaryan, Lusine; Khachatrian, Hrant; Tadevosyan, Gohar; Apresyan, Lilit; Babayan, Nelly; Stopper, Helga; Navoyan, Zaven.
Affiliation
  • Tevosyan A; YerevaNN, Charents str. 20, 0025, Yerevan, Armenia.
  • Khondkaryan L; Laboratory of Cell Technologies, Institute of Molecular Biology, National Academy of Sciences of RA, Hasratyan str. 7, 0014, Yerevan, Armenia.
  • Khachatrian H; YerevaNN, Charents str. 20, 0025, Yerevan, Armenia.
  • Tadevosyan G; Yerevan State University, Alex Manoogian str. 1, 0025, Yerevan, Armenia.
  • Apresyan L; Laboratory of Cell Technologies, Institute of Molecular Biology, National Academy of Sciences of RA, Hasratyan str. 7, 0014, Yerevan, Armenia.
  • Babayan N; Laboratory of Cell Technologies, Institute of Molecular Biology, National Academy of Sciences of RA, Hasratyan str. 7, 0014, Yerevan, Armenia.
  • Stopper H; Laboratory of Cell Technologies, Institute of Molecular Biology, National Academy of Sciences of RA, Hasratyan str. 7, 0014, Yerevan, Armenia.
  • Navoyan Z; Toxometris.ai, Sarmen str. 7, 0009, Yerevan, Armenia.
J Cheminform; 14(1): 69, 2022 Oct 14.
Article in English | MEDLINE | ID: mdl-36242073
ABSTRACT
Collecting labeled data for many important tasks in chemoinformatics is time-consuming and requires expensive experiments. In recent years, machine learning has been used to learn rich representations of molecules from large-scale unlabeled molecular datasets and to transfer that knowledge to more challenging tasks with limited data. Variational autoencoders are one of the tools that have been proposed to perform this transfer for both chemical property prediction and molecular generation tasks. In this work, we propose a simple method to improve the chemical property prediction performance of machine learning models by incorporating additional information on correlated molecular descriptors in the representations learned by variational autoencoders. We verify the method on three property prediction tasks. We explore the impact of the number of incorporated descriptors, the correlation between the descriptors and the target properties, and the sizes of the datasets. Finally, we show the relation between the performance of property prediction models and the distance between the property prediction dataset and the larger unlabeled dataset in the representation space.
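One way to read "incorporating additional information on correlated molecular descriptors" is as an extra descriptor-regression term added to the usual variational autoencoder objective. The abstract does not spell out the objective, so the sketch below is an assumption and not the authors' implementation: the function names, the weights `beta` and `gamma`, and the use of mean squared error for the descriptor term are all hypothetical.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Per-sample KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)

def joint_vae_loss(recon_nll, mu, log_var, desc_pred, desc_true,
                   beta=1.0, gamma=1.0):
    """Hypothetical joint objective, averaged over the batch.

    recon_nll : (batch,) reconstruction negative log-likelihood per molecule
    mu, log_var : (batch, latent_dim) encoder outputs
    desc_pred, desc_true : (batch, n_descriptors) predicted and reference
        descriptor values; the predictions are assumed to come from a small
        head on top of the latent vector
    beta, gamma : assumed weights for the KL and descriptor terms
    """
    kl = kl_to_standard_normal(mu, log_var)
    desc_mse = np.mean((desc_pred - desc_true) ** 2, axis=1)
    return float(np.mean(recon_nll + beta * kl + gamma * desc_mse))

# Toy batch of two molecules with a 3-dimensional latent space
# and 2 descriptors; all values are made up for illustration.
mu = np.zeros((2, 3))
log_var = np.zeros((2, 3))
recon_nll = np.array([1.0, 3.0])
desc_true = np.array([[0.5, 1.5], [2.0, -1.0]])
desc_pred = desc_true + 1.0
loss = joint_vae_loss(recon_nll, mu, log_var, desc_pred, desc_true, gamma=0.5)
```

Under this reading, the descriptor term pushes the latent space to encode quantities that correlate with the downstream target, which is the effect the abstract attributes to the method.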
Full text: 1 Collections: 01-international Database: MEDLINE Study type: Prognostic_studies / Risk_factors_studies Language: En Journal: J Cheminform Publication year: 2022 Document type: Article Country of affiliation: Armenia
