MolFeSCue: enhancing molecular property prediction in data-limited and imbalanced contexts using few-shot and contrastive learning.

Zhang, Ruochi; Wu, Chao; Yang, Qian; Liu, Chang; Wang, Yan; Li, Kewei; Huang, Lan; Zhou, Fengfeng

Affiliation

Zhang R; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China.
Wu C; School of Artificial Intelligence, Jilin University, Changchun 130012, China.
Yang Q; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China.
Liu C; College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China.
Wang Y; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China.
Li K; College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China.
Huang L; Beijing Life Science Academy, Beijing 102209, China.
Zhou F; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China.

Bioinformatics; 40(4), 2024 Mar 29.

Article in English | MEDLINE | ID: mdl-38426310

ABSTRACT

MOTIVATION:

Predicting molecular properties is a pivotal task in various scientific domains, including drug discovery, material science, and computational chemistry. This problem is often hindered by the lack of annotated data and imbalanced class distributions, which pose significant challenges in developing accurate and robust predictive models.

RESULTS:

This study tackles these issues by employing pretrained molecular models within a few-shot learning framework. A novel dynamic contrastive loss function is used to further improve model performance under class imbalance. The proposed MolFeSCue framework not only facilitates rapid generalization from minimal samples, but also employs the contrastive loss to extract meaningful molecular representations from imbalanced datasets. Extensive evaluations and comparisons of MolFeSCue against state-of-the-art algorithms were conducted on multiple benchmark datasets, and the experimental results demonstrate the algorithm's effectiveness in learning molecular representations and its broad applicability across various pretrained models. Our findings underscore MolFeSCue's potential to accelerate advancements in drug discovery.

AVAILABILITY AND IMPLEMENTATION:

We have made all the source code used in this study publicly accessible at http://www.healthinformaticslab.org/supp/ or via GitHub at https://github.com/zhangruochi/MolFeSCue. The code (MolFeSCue-v1-00) is also available as a supplementary file of this paper.

Subjects

Algorithms; Benchmarking; Drug Discovery; Models, Molecular; Software

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Algorithms / Benchmarking Language: En Journal: Bioinformatics Journal subject: MEDICAL INFORMATICS Year of publication: 2024 Document type: Article Country of affiliation: China
