Partial order relation-based gene ontology embedding improves protein function prediction.
Li, Wenjing; Wang, Bin; Dai, Jin; Kou, Yan; Chen, Xiaojun; Pan, Yi; Hu, Shuangwei; Xu, Zhenjiang Zech.
Affiliation
  • Li W; College of Computer Science and Software, Shenzhen University, Shenzhen, China.
  • Wang B; School of Mathematics and Computer Sciences, Nanchang University, Nanchang, China.
  • Dai J; Center for Quantum Technology Research and School of Physics, Beijing Institute of Technology, Beijing, China.
  • Kou Y; Xbiome, Scientific Research Building, Tsinghua High-Tech Park, Shenzhen, China.
  • Chen X; College of Computer Science and Software, Shenzhen University, Shenzhen, China.
  • Pan Y; Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, China.
  • Hu S; Xbiome, Scientific Research Building, Tsinghua High-Tech Park, Shenzhen, China.
  • Xu ZZ; School of Mathematics and Computer Sciences, Nanchang University, Nanchang, China.
Brief Bioinform; 25(2), 2024 Jan 22.
Article in En | MEDLINE | ID: mdl-38446740
ABSTRACT
Protein annotation has long been a challenging task in computational biology. Gene Ontology (GO) has become one of the most popular frameworks to describe protein functions and their relationships. Prediction of a protein annotation with proper GO terms demands high-quality GO term representation learning, which aims to learn a low-dimensional dense vector representation with accompanying semantic meaning for each functional label, also known as embedding. However, existing GO term embedding methods, which mainly take into account ancestral co-occurrence information, have yet to capture the full topological information in the GO directed acyclic graph (DAG). In this study, we propose a novel GO term representation learning method, PO2Vec, to utilize the partial order relationships to improve the GO term representations. Extensive evaluations show that PO2Vec achieves better outcomes than existing embedding methods in a variety of downstream biological tasks. Based on PO2Vec, we further developed a new protein function prediction method, PO2GO, which demonstrates superior performance measured in multiple metrics and annotation specificity as well as few-shot prediction capability in the benchmarks. These results suggest that the high-quality representation of GO structure is critical for diverse biological tasks including computational protein annotation.

Full text: 1 Database: MEDLINE Main subject: Computational Biology / Benchmarking Language: En Publication year: 2024 Document type: Article