VirGrapher: a graph-based viral identifier for long sequences from metagenomes.
Miao, Yan; Sun, Zhenyuan; Ma, Chenjing; Lin, Chen; Wang, Guohua; Yang, Chunxue.
Affiliation
  • Miao Y; College of Computer and Control Engineering, Northeast Forestry University, Hexing Road, 150040, Heilongjiang Province, China.
  • Sun Z; College of Computer and Control Engineering, Northeast Forestry University, Hexing Road, 150040, Heilongjiang Province, China.
  • Ma C; College of Computer and Control Engineering, Northeast Forestry University, Hexing Road, 150040, Heilongjiang Province, China.
  • Lin C; National Institute for Data Science in Health and Medicine, Xiamen University, Xiangannan Road, 361104, Fujian Province, China.
  • Wang G; College of Computer and Control Engineering, Northeast Forestry University, Hexing Road, 150040, Heilongjiang Province, China.
  • Yang C; College of Landscape Architecture, Northeast Forestry University, Hexing Road, 150040, Heilongjiang Province, China.
Brief Bioinform; 25(2), 2024 Jan 22.
Article in En | MEDLINE | ID: mdl-38343326
ABSTRACT
Viruses are the most abundant biological entities on Earth and are important components of microbial communities. A metagenome contains sequences from all microorganisms in an environmental sample. Correctly identifying viruses among these mixed sequences is critical in viral analyses. It is common to identify long viral sequences that have already passed through assembly and binning pipelines. Existing deep learning-based methods divide these long sequences into short subsequences and identify them separately. As a result, the relationships between the subsequences are lost, which leads to poor performance in identifying long viral sequences. In this paper, VirGrapher is proposed to improve the identification of long viral sequences by constructing relationships among the short subsequences derived from each long one. VirGrapher treats a long sequence as a graph and, following a GCN-based node embedding model, uses a Graph Convolutional Network (GCN) model to learn multilayer connections between the nodes of each sequence. VirGrapher achieves a higher AUC value and accuracy on the validation set than three benchmark methods.
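The graph view described in the abstract can be illustrated with a minimal sketch. It is not VirGrapher's implementation: the abstract does not state how nodes are featurized or how edges are chosen, so the k-mer frequency features, the rule that links neighbouring chunks, the chunk length, and the fixed placeholder weights below are all assumptions made for illustration. The only standard piece is the propagation rule of a single GCN layer, ReLU(D^-1/2 (A + I) D^-1/2 H W).

```python
import math
from itertools import product


def split_into_chunks(seq, chunk_len):
    """Cut a long sequence into consecutive, non-overlapping chunks (the graph nodes)."""
    return [seq[i:i + chunk_len]
            for i in range(0, len(seq) - chunk_len + 1, chunk_len)]


def kmer_features(chunk, k=2):
    """Assumed node features: normalized k-mer frequencies over the alphabet ACGT."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {km: 0 for km in kmers}
    for i in range(len(chunk) - k + 1):
        km = chunk[i:i + k]
        if km in counts:  # skip k-mers containing ambiguous bases such as N
            counts[km] += 1
    total = sum(counts.values()) or 1
    return [counts[km] / total for km in kmers]


def chain_adjacency(n):
    """Assumed edge rule: link each chunk to its neighbour in the original sequence."""
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        adj[i][i + 1] = adj[i + 1][i] = 1.0
    return adj


def gcn_layer(adj, feats, weights):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weights), len(weights[0])
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][m] * feats[m][f] for m in range(n)) for f in range(n_in)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(n_in)))
             for o in range(n_out)]
            for i in range(n)]


# Toy input: a 24-base sequence cut into three 8-base chunks.
seq = "ACGTACGTGGCCAATTACGTTGCA"
chunks = split_into_chunks(seq, 8)
feats = [kmer_features(c) for c in chunks]
adj = chain_adjacency(len(chunks))
# Fixed placeholder weights (16 inputs, 4 outputs); a real model would learn these.
weights = [[0.1 * ((f + o) % 3 + 1) for o in range(4)] for f in range(16)]
hidden = gcn_layer(adj, feats, weights)
```

After this step each chunk's hidden vector mixes in information from its neighbours, which is the kind of relationship that is lost when subsequences are classified separately.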

Full text: 1 | Database: MEDLINE | Main subject: Metagenome / Microbiota | Language: En | Publication year: 2024 | Document type: Article
