Learning representations for gene ontology terms by jointly encoding graph structure and textual node descriptors.
Zhao, Lingling; Sun, Huiting; Cao, Xinyi; Wen, Naifeng; Wang, Junjie; Wang, Chunyu.
Affiliation
  • Zhao L; Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China.
  • Sun H; Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China.
  • Cao X; Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China.
  • Wen N; College of Electromechanical and Information Engineering, Dalian Minzu University, Dalian 116600, China.
  • Wang J; Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China.
  • Wang C; Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China.
Brief Bioinform; 23(5), 2022 Sep 20.
Article in English | MEDLINE | ID: mdl-35901452
ABSTRACT
Measuring the semantic similarity between Gene Ontology (GO) terms is a fundamental step in numerous functional bioinformatics applications. To fully exploit the metadata of GO terms, word embedding-based methods have recently been proposed to map GO terms to low-dimensional feature vectors. However, these representation methods commonly overlook the key information hidden in the overall GO structure and in the relationships between GO terms. In this paper, we propose a novel representation model for GO terms, named GT2Vec, which jointly considers the GO graph structure, captured by graph contrastive learning, and the semantic descriptions of GO terms, encoded by BERT encoders. We evaluate our method on a protein similarity task across a collection of benchmark datasets. The experimental results demonstrate the effectiveness of jointly encoding graph structure and textual node descriptors to learn vector representations for GO terms.
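The idea of combining a structural view and a textual view of each GO term, then scoring proteins by the similarity of their annotated terms, can be illustrated with a minimal sketch. The abstract does not state how GT2Vec fuses the two encodings or how protein similarity is aggregated, so everything below is an assumption made for illustration: the fusion is a plain concatenation of L2-normalized vectors, the protein score is a best-match average of cosine similarities, and the term identifiers (`GO:A`, `GO:B`, `GO:C`) and toy vectors are hypothetical placeholders standing in for the outputs of a graph encoder and a text encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def joint_embedding(graph_vec, text_vec):
    """Assumed fusion: concatenate the normalized graph and text vectors.

    This is an illustrative choice, not the fusion used by GT2Vec.
    """
    return l2_normalize(graph_vec) + l2_normalize(text_vec)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def protein_similarity(terms_a, terms_b, embeddings):
    """Assumed aggregation: best-match average over two sets of GO terms."""
    if not terms_a or not terms_b:
        return 0.0
    # For every term of one protein, keep its best match in the other protein.
    best_a = [max(cosine(embeddings[a], embeddings[b]) for b in terms_b)
              for a in terms_a]
    best_b = [max(cosine(embeddings[b], embeddings[a]) for a in terms_a)
              for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


# Hypothetical toy vectors; real ones would come from trained encoders.
graph_vecs = {"GO:A": [1.0, 0.0], "GO:B": [0.9, 0.1], "GO:C": [0.0, 1.0]}
text_vecs = {
    "GO:A": [0.2, 0.8, 0.0],
    "GO:B": [0.3, 0.7, 0.1],
    "GO:C": [0.9, 0.0, 0.4],
}
embeddings = {
    term: joint_embedding(graph_vecs[term], text_vecs[term])
    for term in graph_vecs
}

# Two hypothetical proteins annotated with overlapping term sets.
score = protein_similarity(["GO:A", "GO:B"], ["GO:B", "GO:C"], embeddings)
```

Normalizing each view before concatenation keeps either encoder from dominating the cosine score simply because its vectors have a larger magnitude; with this choice, the cosine of two joint vectors equals the mean of the per-view cosines.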

Full text: 1 Database: MEDLINE Main subject: Semantics / Computational Biology Language: En Year of publication: 2022 Document type: Article
