Toward Robust Self-Training Paradigm for Molecular Prediction Tasks.
J Comput Biol; 31(3): 213-228, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38531049
ABSTRACT
Molecular prediction tasks typically demand a series of professional experiments to label the target molecule, so they suffer from limited labeled data. One semisupervised learning paradigm, known as self-training, utilizes both labeled and unlabeled data: a teacher model is trained on the labeled data and produces pseudo labels for the unlabeled data, and the labeled and pseudo-labeled data are then jointly used to train a student model. However, the pseudo labels generated by the teacher model are generally not sufficiently accurate. We therefore propose a robust self-training strategy that uses robust loss functions to handle such noisy labels, in two paradigms: generic and adaptive. We conducted experiments on three molecular biology prediction tasks with four backbone models to progressively evaluate the proposed robust self-training strategy. The results demonstrate that the proposed method improves prediction performance across all tasks, most notably in molecular regression tasks, with an average improvement of 41.5%. Furthermore, visualization analysis confirms the superiority of our method. Our proposed robust self-training is a simple yet effective strategy that efficiently improves molecular biology prediction performance. It tackles the insufficient labeled data issue in molecular biology by taking advantage of both labeled and unlabeled data. Moreover, it can easily be embedded into any prediction task, serving as a universal approach for the bioinformatics community.
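The record does not give the paper's exact models or loss functions, so the following is only a minimal sketch of the generic teacher-student self-training loop the abstract describes, using a hypothetical linear regressor and the Huber loss as one example of a robust loss that damps the influence of noisy pseudo labels. The function names (`fit_linear`, `self_train`) and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def huber_grad(residual, delta=1.0):
    # Robust-loss gradient: quadratic near zero, linear in the tails,
    # so large pseudo-label errors contribute only bounded gradients.
    return np.where(np.abs(residual) <= delta, residual, delta * np.sign(residual))

def fit_linear(X, y, lr=0.1, epochs=500, delta=None):
    # Tiny linear regressor trained by gradient descent. If delta is set,
    # the Huber (robust) gradient replaces the squared-error gradient.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        r = X @ w - y
        g = huber_grad(r, delta) if delta is not None else r
        w -= lr * (X.T @ g) / len(y)
    return w

def self_train(X_lab, y_lab, X_unlab, delta=1.0):
    # 1) Teacher is trained on the labeled data only.
    w_teacher = fit_linear(X_lab, y_lab)
    # 2) Teacher produces pseudo labels for the unlabeled pool.
    y_pseudo = X_unlab @ w_teacher
    # 3) Student is trained on labeled + pseudo-labeled data with a
    #    robust loss, limiting the impact of noisy pseudo labels.
    X_all = np.vstack([X_lab, X_unlab])
    y_all = np.concatenate([y_lab, y_pseudo])
    return fit_linear(X_all, y_all, delta=delta)
```

In this sketch the robust loss is fixed (the "generic" paradigm); the abstract's "adaptive" paradigm would instead adjust the loss during training, a detail not recoverable from this record.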
Full text:
1
Database:
MEDLINE
Main subject:
Computational Biology
/
Molecular Biology
Limits:
Humans
Language:
English
Journal:
J Comput Biol (Journal of Computational Biology)
Journal subject:
Molecular Biology
/
Medical Informatics
Publication year:
2024
Document type:
Article
Affiliation country:
United States