DGCL: dual-graph neural networks contrastive learning for molecular property prediction.
Jiang, Xiuyu; Tan, Liqin; Zou, Qingsong.
Affiliation
  • Jiang X; School of Computer Science and Engineering, Sun Yat-sen University, Waihuan East Street, Guangzhou 510006, China.
  • Tan L; School of Computer Science and Engineering, Sun Yat-sen University, Waihuan East Street, Guangzhou 510006, China.
  • Zou Q; School of Computer Science and Engineering, Sun Yat-sen University, Waihuan East Street, Guangzhou 510006, China.
Brief Bioinform; 25(6), 2024 Sep 23.
Article in En | MEDLINE | ID: mdl-39331017
ABSTRACT
In this paper, we propose DGCL, a contrastive learning (CL) method based on dual graph neural networks (GNNs) and integrated with mixed molecular fingerprints (MFPs) for molecular property prediction. The DGCL-MFP method has two stages. In the first stage, pretraining, we use two different GNNs as encoders to construct the contrastive task, rather than generating augmented graphs as previous methods do. Specifically, DGCL aggregates and enhances the features of each molecule with a Graph Isomorphism Network and a Graph Attention Network; the two representations extracted from the same molecule serve as a positive pair, and representations of other molecules serve as negatives. In the second stage, downstream task training, the features extracted by the two pretrained graph networks are concatenated with carefully selected MFPs to predict molecular properties. Our experiments show that DGCL improves the performance of existing GNNs, matching or surpassing state-of-the-art self-supervised learning models on multiple benchmark datasets. Specifically, DGCL increases the average performance on classification tasks by 3.73% and improves performance on the regression task Lipo by 0.126. Through ablation studies, we validate the impact of network fusion strategies and MFPs on model performance. In addition, DGCL's predictive performance is further enhanced by weighting different molecular features based on the Extended Connectivity Fingerprint. The code and datasets of DGCL will be made publicly available.
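The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss function, so the example below assumes a standard InfoNCE-style objective with cosine similarity and a temperature; the function names (`cross_encoder_info_nce`, `fuse`), the temperature value, and the toy embeddings are hypothetical and are not taken from the DGCL code. Here `z_a[i]` and `z_b[i]` stand in for the embeddings of molecule `i` produced by the two encoders (for example, a GIN and a GAT).

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_encoder_info_nce(z_a, z_b, temperature=0.5):
    """InfoNCE-style loss across two encoders (an assumed objective).

    The pair (z_a[i], z_b[i]) comes from the same molecule and is the
    positive; every (z_a[i], z_b[j]) with j != i is a negative.
    Only the a -> b direction is computed here.
    """
    n = len(z_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(z_a[i], z_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n


def fuse(h_a, h_b, fingerprint):
    """Downstream stage: concatenate both encoder features with a fingerprint."""
    return list(h_a) + list(h_b) + list(fingerprint)


# Toy embeddings for two molecules (hypothetical values).
z_gin = [[1.0, 0.0], [0.0, 1.0]]
z_gat_aligned = [[1.0, 0.0], [0.0, 1.0]]
z_gat_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = cross_encoder_info_nce(z_gin, z_gat_aligned)
loss_swapped = cross_encoder_info_nce(z_gin, z_gat_swapped)
fused = fuse(z_gin[0], z_gat_aligned[0], [0, 1, 1])
```

In this sketch the loss is lower when both encoders agree on each molecule (`loss_aligned`) than when their outputs are mismatched (`loss_swapped`), which is the behavior a contrastive objective of this kind is meant to reward. A practical implementation would use learned GNN encoders and batched tensor operations instead of Python lists.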

Full text: 1; Collection: 01-internacional; Database: MEDLINE; Main subject: Neural Networks, Computer; Language: En; Journal: Brief Bioinform; Journal subject: Biology / Medical Informatics; Year: 2024; Document type: Article; Affiliation country: China; Country of publication: United Kingdom