Predicting protein functions using positive-unlabeled ranking with ontology-based priors.
Zhapa-Camacho, Fernando; Tang, Zhenwei; Kulmanov, Maxat; Hoehndorf, Robert.
Affiliation
  • Zhapa-Camacho F; Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia.
  • Tang Z; Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia.
  • Kulmanov M; Department of Computer Science, University of Toronto, Toronto, ON M5S 1A1, Canada.
  • Hoehndorf R; Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia.
Bioinformatics; 40(Supplement_1): i401-i409, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940168
ABSTRACT
Automated protein function prediction is a crucial and widely studied problem in bioinformatics. Computationally, protein function is a multilabel classification problem where only positive samples are defined and there is a large number of unlabeled annotations. Most existing methods rely on the assumption that unlabeled protein function annotations are negatives, which induces the false-negative issue: potential positive samples are trained as negatives. We introduce a novel approach named PU-GO, wherein we address function prediction as a positive-unlabeled ranking problem. We apply empirical risk minimization, i.e. we minimize the classification risk of a classifier where class priors are obtained from the Gene Ontology hierarchical structure. We show that our approach is more robust than other state-of-the-art methods on similarity-based and time-based benchmark datasets.
AVAILABILITY AND IMPLEMENTATION
Data and code are available at https://github.com/bio-ontology-research-group/PU-GO.
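The abstract's idea of minimizing a classification risk from positive and unlabeled data with a class prior can be illustrated with a generic non-negative PU risk estimator from the positive-unlabeled learning literature. This is only a sketch under stated assumptions, not PU-GO's actual loss: the function names, the sigmoid surrogate loss, and the `prior` argument are hypothetical, and the abstract does not say how priors are derived from the Gene Ontology, so the prior is simply passed in.

```python
import math

def sigmoid_loss(score, label):
    """Sigmoid surrogate loss: small when the sign of score agrees with label (+1 or -1)."""
    return 1.0 / (1.0 + math.exp(label * score))

def mean(values):
    return sum(values) / len(values)

def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk estimate for a single function class (illustrative).

    pos_scores: classifier scores for proteins annotated with the class
    unl_scores: scores for proteins with no annotation for the class
    prior:      assumed fraction of true positives, 0 < prior < 1
                (hypothetical input; its derivation is not shown here)
    """
    # Risk of scoring known positives as negative, weighted by the prior.
    risk_pos = prior * mean([sigmoid_loss(s, +1) for s in pos_scores])
    # Negative-class risk estimated from unlabeled data, corrected for
    # the positives assumed to be hidden inside the unlabeled set.
    risk_neg = (mean([sigmoid_loss(s, -1) for s in unl_scores])
                - prior * mean([sigmoid_loss(s, -1) for s in pos_scores]))
    # Clamp at zero so the corrected term cannot become negative.
    return risk_pos + max(0.0, risk_neg)
```

Treating every unlabeled protein as a negative corresponds to dropping the correction term; the estimator above instead discounts the unlabeled loss by the share of positives expected among the unlabeled samples.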

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Proteins / Computational Biology / Gene Ontology Language: En Journal: Bioinformatics Journal subject: MEDICAL INFORMATICS Year of publication: 2024 Document type: Article Country of affiliation: Saudi Arabia
