VirRep: a hybrid language representation learning framework for identifying viruses from human gut metagenomes.
Genome Biol; 25(1): 177, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38965579
ABSTRACT
Identifying viruses in metagenomes is a common step in exploring the viral composition of the human gut. Here, we introduce VirRep, a hybrid language representation learning framework for identifying viruses from human gut metagenomes. VirRep combines a context-aware encoder and an evolution-aware encoder to improve sequence representation by incorporating k-mer patterns and sequence homologies. Benchmarking on both simulated and real datasets with varying viral proportions demonstrates that VirRep outperforms state-of-the-art methods. When applied to fecal metagenomes from a colorectal cancer cohort, VirRep identifies 39 high-quality viral species associated with the disease, many of which cannot be detected by existing methods.
Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Metagenome / Gastrointestinal Microbiome
Limits: Humans
Language: English
Journal: Genome Biol
Journal subject: Molecular Biology / Genetics
Year of publication: 2024
Document type: Article
Country of affiliation: China
Country of publication: United Kingdom