Positive-unlabeled learning in bioinformatics and computational biology: a brief review.
Li, Fuyi; Dong, Shuangyu; Leier, André; Han, Meiya; Guo, Xudong; Xu, Jing; Wang, Xiaoyu; Pan, Shirui; Jia, Cangzhi; Zhang, Yang; Webb, Geoffrey I; Coin, Lachlan J M; Li, Chen; Song, Jiangning.
Affiliation
  • Li F; Monash University, Australia.
  • Dong S; Monash University, Australia.
  • Leier A; Department of Genetics, UAB School of Medicine, USA.
  • Han M; Department of Biochemistry and Molecular Biology, Monash University, Australia.
  • Guo X; Ningxia University, China.
  • Xu J; Computer Science and Technology, Nankai University, China.
  • Wang X; Department of Biochemistry and Molecular Biology and Biomedicine Discovery Institute, Monash University, Australia.
  • Pan S; University of Technology Sydney (UTS), Ultimo, NSW, Australia.
  • Jia C; College of Science, Dalian Maritime University, China.
  • Zhang Y; Northwestern Polytechnical University, China.
  • Webb GI; Faculty of Information Technology at Monash University, Australia.
  • Coin LJM; Department of Clinical Pathology, University of Melbourne, Australia.
  • Li C; Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Australia.
  • Song J; Monash Biomedicine Discovery Institute, Monash University, Melbourne, Australia.
Brief Bioinform; 23(1), 2022 Jan 17.
Article in En | MEDLINE | ID: mdl-34729589
ABSTRACT
Conventional supervised binary classification algorithms have been widely applied to address significant research questions using biological and biomedical data. This classification scheme requires two fully labeled classes of data (e.g. positive and negative samples) to train a classification model. However, in many bioinformatics applications, labeling data is laborious, and negative samples may be mislabeled because of the limited sensitivity of the experimental equipment. The positive-unlabeled (PU) learning scheme was therefore proposed to enable the classifier to learn directly from a limited number of positive samples and a large number of unlabeled samples (i.e. a mixture of positive and negative samples). To date, several PU learning algorithms have been developed to address various biological questions, such as sequence identification, functional site characterization and interaction prediction. In this paper, we revisit a collection of 29 state-of-the-art PU learning applications in bioinformatics. Several important aspects are discussed in detail, including PU learning methodology, biological application, classifier design and evaluation strategy. We also comment on the existing issues of PU learning and offer our perspectives on the future development of PU learning applications. We anticipate that our work will serve as an instrumental guideline for better understanding the PU learning framework in bioinformatics and for developing next-generation PU learning frameworks for critical biological applications.
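As an illustration of the PU setting described in the abstract, the following is a minimal sketch, not a method taken from the review itself. It assumes that labeled positives are selected completely at random from all positives and applies an Elkan-Noto-style correction: a classifier g(x) is trained to separate labeled from unlabeled samples, the label frequency c is estimated on held-out labeled positives, and g(x)/c is used as the estimate of P(positive | x). The one-dimensional toy data, the function names and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_1d(xs, ss, lr=0.5, epochs=500):
    """Fit g(x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, s in zip(xs, ss):
            err = sigmoid(w * x + b) - s
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def make_toy_pu_data(rng, n_pos=300, n_neg=300, label_frac=0.3):
    """Hypothetical data: s=1 marks a labeled positive, s=0 marks 'unlabeled'.

    Unlabeled samples are a mixture of hidden positives and true negatives.
    """
    data = []
    for _ in range(n_pos):
        x = rng.gauss(2.0, 1.0)
        data.append((x, 1 if rng.random() < label_frac else 0))
    for _ in range(n_neg):
        data.append((rng.gauss(-2.0, 1.0), 0))
    rng.shuffle(data)
    return data


rng = random.Random(0)
data = make_toy_pu_data(rng)
split = int(0.8 * len(data))
train, valid = data[:split], data[split:]

# Step 1: train the "labeled vs. unlabeled" classifier g(x) ~ P(s=1 | x).
w, b = fit_logistic_1d([x for x, _ in train], [s for _, s in train])

# Step 2: estimate c = P(s=1 | y=1) as the mean of g over held-out labeled positives.
held_out_pos = [x for x, s in valid if s == 1]
if not held_out_pos:
    raise ValueError("no labeled positives in the validation split")
c = sum(sigmoid(w * x + b) for x in held_out_pos) / len(held_out_pos)


def pu_posterior(x):
    # Step 3: corrected estimate of P(y=1 | x), clipped to a valid probability.
    return min(1.0, sigmoid(w * x + b) / c)


print("estimated label frequency c:", round(c, 3))
print("corrected score at x=+3:", round(pu_posterior(3.0), 3))
print("corrected score at x=-3:", round(pu_posterior(-3.0), 3))
```

Dividing by c compensates for the positives hidden in the unlabeled set. If the random-selection assumption does not hold, this correction is not expected to be reliable.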
Full text: 1 Database: MEDLINE Main subject: Algorithms / Computational Biology Type of study: Prognostic_studies Language: En Journal: Brief Bioinform Journal subject: BIOLOGY / MEDICAL INFORMATICS Year of publication: 2022 Document type: Article Country of affiliation: Australia
