Rethinking the Masking Strategy for Pretraining Molecular Graphs from a Data-Centric View.

Lin, Wei; Fung, Chi Chung Alan

Affiliation

Lin W; Department of Neuroscience, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Kowloon 999077, Hong Kong, China.
Fung CCA; Department of Neuroscience, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Kowloon 999077, Hong Kong, China.

ACS Omega ; 9(19): 20832-20838, 2024 May 14.

Article in English | MEDLINE | ID: mdl-38764692

ABSTRACT

Node-level self-supervised learning is widely used to pretrain molecular graphs. Attribute Masking (AttrMask) is a pioneering method in this field, and its improved variants focus on enhancing the capacity of the backbone models by incorporating additional modules. However, these methods overlook the imbalanced atom distribution because they select the atoms to mask for pretraining purely at random. Drawing on the properties of molecules, we propose a weighted masking strategy that enhances the capacity of pretrained models by using molecular information more effectively during pretraining. Our experimental results demonstrate that AttrMask combined with the proposed weighted masking strategy outperforms AttrMask with random masking, and even surpasses the model-centric improvement methods without increasing the number of parameters. Additionally, the weighted masking strategy can be extended to other pretraining methods to improve their performance.
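The abstract does not state how the masking weights are computed, so the following is only a minimal sketch of what a weighted masking step could look like, not the authors' method. It assumes, purely for illustration, that each atom's sampling weight is the inverse of its element's frequency in the pretraining corpus, so that rare elements are masked more often than carbon. The function names, the toy element counts, and the 15% mask rate are all hypothetical choices made for this example.

```python
import random


def inverse_frequency_weights(atom_symbols, corpus_counts):
    """Assumed weighting (not from the paper): rarer elements get larger weights."""
    total = sum(corpus_counts.values())
    # Guard against elements missing from the counts by treating them as seen once.
    return [total / max(corpus_counts.get(s, 0), 1) for s in atom_symbols]


def choose_masked_atoms(atom_symbols, corpus_counts, mask_rate=0.15, seed=None):
    """Pick atom indices to mask, sampling without replacement by weight."""
    rng = random.Random(seed)
    n_mask = max(1, round(mask_rate * len(atom_symbols)))
    weights = inverse_frequency_weights(atom_symbols, corpus_counts)
    # Efraimidis-Spirakis sampling: draw a key u ** (1 / w) for every atom
    # and keep the n_mask atoms with the largest keys.
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    keys.sort(reverse=True)
    return sorted(i for _, i in keys[:n_mask])


# Toy molecule given as a list of element symbols (hypothetical data).
atoms = ["C", "C", "C", "C", "C", "C", "O", "N", "C", "C"]
# Toy element counts standing in for corpus-level statistics (hypothetical data).
corpus_counts = {"C": 7000, "O": 1500, "N": 1200, "F": 300}

masked = choose_masked_atoms(atoms, corpus_counts, mask_rate=0.15, seed=0)
print("masked atom indices:", masked)
```

Under uniform random masking every atom in this toy molecule would be equally likely to be selected, so most masked atoms would be carbons. With the assumed inverse-frequency weights, the single nitrogen and oxygen atoms are selected far more often than any individual carbon, which is one way to counteract an imbalanced atom distribution.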

Full text: 1 Collection: 01-international Database: MEDLINE Language: English Journal: ACS Omega Year: 2024 Document type: Article Affiliation country: China