An effective self-supervised framework for learning expressive molecular global representations to drug discovery.
Brief Bioinform; 22(6), 2021 11 05.
Article in En | MEDLINE | ID: mdl-33940598
How to produce expressive molecular representations is a fundamental challenge in artificial intelligence-driven drug discovery. Graph neural networks (GNNs) have emerged as a powerful technique for modeling molecular data. However, previous supervised approaches usually suffer from the scarcity of labeled data and poor generalization capability. Here, we propose a novel graph-based deep learning framework for molecular pre-training, named MPG, that learns molecular representations from large-scale unlabeled molecules. In MPG, we propose a powerful GNN for modeling molecular graphs, named MolGNet, and design an effective self-supervised strategy for pre-training the model at both the node and graph levels. After pre-training on 11 million unlabeled molecules, we show that MolGNet can capture valuable chemical insights to produce interpretable representations. The pre-trained MolGNet can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of drug discovery tasks, including molecular property prediction, drug-drug interaction and drug-target interaction, on 14 benchmark datasets. The pre-trained MolGNet in MPG has the potential to become an advanced molecular encoder in the drug discovery pipeline.
Full text: 1
Collection: 01-international
Database: MEDLINE
Main subject: Models, Molecular / Neural Networks, Computer / Drug Delivery Systems / Drug Discovery / Databases, Chemical
Type of study: Prognostic_studies
Language: En
Journal: Brief Bioinform
Journal subject: BIOLOGY / MEDICAL INFORMATICS
Year: 2021
Document type: Article
Affiliation country: China
Country of publication: United Kingdom