Collaborative analysis for drug discovery by federated learning on non-IID data.

Huang, Dong; Ye, Xiucai; Zhang, Ying; Sakurai, Tetsuya


Affiliation

Huang D; Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan.
Ye X; Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan. Electronic address: yexiucai@cs.tsukuba.ac.jp.
Zhang Y; Beidahuang Industry Group General Hospital, Harbin, China. Electronic address: zhangybiggh@yeah.net.
Sakurai T; Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan.

Methods; 219: 1-7, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37689121

ABSTRACT

With the increasing availability of large-scale QSAR (Quantitative Structure-Activity Relationship) datasets, collaborative analysis has become a promising approach for drug discovery. Traditional centralized analysis, which typically concentrates data on a central server for training, faces challenges such as data privacy and security. Distributed analysis such as federated learning offers a solution by enabling collaborative model training without sharing raw data. However, it may fail when the training data on the local devices are not independent and identically distributed (non-IID). In this paper, we propose a novel framework for collaborative drug discovery using federated learning on non-IID datasets. We address the difficulty of training on non-IID data by globally sharing a small subset of data among all institutions. Our framework allows multiple institutions to jointly train a robust predictive model while preserving the privacy of their individual data. We leverage the federated learning paradigm to distribute the model training process across local devices, eliminating the need for data exchange. The experimental results on 15 benchmark datasets demonstrate that the proposed method achieves predictive accuracy competitive with centralized analysis while respecting data privacy. Moreover, our framework offers benefits such as reduced data transmission and enhanced scalability, making it suitable for large-scale collaborative drug discovery efforts.
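
The idea described in the abstract, federated training in which every institution also sees a small globally shared subset, can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data, the linear model, the use of plain federated averaging (FedAvg), the 5% sharing fraction, and all function names (`make_clients`, `build_shared_pool`, `local_update`, `federated_train`) are assumptions chosen only for illustration.

```python
import numpy as np

def make_clients(n_clients=3, n_per_client=200, n_features=5, seed=0):
    """Synthetic stand-in for QSAR data (assumption): each client draws its
    features from a differently shifted distribution, so the data are non-IID."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=n_features)
    clients = []
    for _ in range(n_clients):
        shift = rng.normal(size=n_features)  # client-specific feature shift
        X = rng.normal(loc=shift, size=(n_per_client, n_features))
        y = X @ true_w + rng.normal(scale=0.1, size=n_per_client)
        clients.append((X, y))
    return clients, true_w

def build_shared_pool(clients, fraction=0.05, seed=1):
    """Each client contributes a small random fraction of its samples to a
    pool visible to everyone (the fraction is a hypothetical choice)."""
    rng = np.random.default_rng(seed)
    Xs, ys = [], []
    for X, y in clients:
        n_share = max(1, int(fraction * len(y)))
        idx = rng.choice(len(y), size=n_share, replace=False)
        Xs.append(X[idx])
        ys.append(y[idx])
    return np.vstack(Xs), np.concatenate(ys)

def local_update(w, X, y, lr=0.01, steps=10):
    """A few gradient-descent steps on mean squared error, starting from w."""
    w = w.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_train(clients, shared, rounds=200, lr=0.01, steps=10):
    """FedAvg: clients train on local data plus the shared pool; the server
    averages the returned weights, weighted by training-set size."""
    X_shared, y_shared = shared
    w = np.zeros(clients[0][0].shape[1])
    sizes = np.array([len(y) + len(y_shared) for _, y in clients], dtype=float)
    for _ in range(rounds):
        local_ws = []
        for X, y in clients:
            # A client's own contributed samples appear twice here; this
            # simplification is kept for brevity.
            X_aug = np.vstack([X, X_shared])
            y_aug = np.concatenate([y, y_shared])
            local_ws.append(local_update(w, X_aug, y_aug, lr, steps))
        w = np.average(local_ws, axis=0, weights=sizes)
    return w

clients, true_w = make_clients()
shared = build_shared_pool(clients)
w_fed = federated_train(clients, shared)
print("max abs weight error:", np.max(np.abs(w_fed - true_w)))
```

In this sketch only model weights and the small shared pool leave a client; the remaining local samples stay where they are. A real QSAR setting would use molecular descriptors and a nonlinear model, where the effect of the shared subset on non-IID drift would be more pronounced than in this linear toy example.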

Subjects

Benchmarking; Drug Discovery

Keywords

Collaborative analysis; Drug discovery; Federated learning; Non-IID dataset; Quantitative structure-activity relationship

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Benchmarking / Drug Discovery Type of study: Prognostic_studies Language: En Publication year: 2023 Document type: Article
