Metagenome comparison (MC): A new framework for detecting unique/enriched OMUs (operational metagenomic units) derived from whole-genome sequencing reads.
Ma, Zhanshan Sam.
Affiliation
  • Ma ZS; Faculty of Arts and Sciences, Harvard University, Cambridge, MA 02138, USA; Microbiome Medicine and Advanced AI Lab, Cambridge, MA 02138, USA; Computational Biology and Medical Ecology Lab, Kunming Institute of Zoology, Chinese Academy of Sciences, China. Electronic address: zhanshanma@fas.harvard.edu.
Comput Biol Med; 180: 108852, 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39137667
ABSTRACT

BACKGROUND:

Current methods for comparing metagenomes derived from whole-genome sequencing reads include top-down metrics or parametric models, such as metagenome diversity, and bottom-up, non-parametric, model-free machine learning approaches, such as Naïve Bayes for k-mer profiling. However, both types are limited in their ability to effectively and comprehensively identify and catalogue unique or enriched metagenomic genes, a critical task in comparative metagenomics. The task is hard because the underlying problem is NP-hard: the running time of known exact algorithms grows exponentially, or even faster, with the problem size, which makes it impractical for even the fastest supercomputers without heuristic approximation algorithms.

METHOD:

In this study, we introduce a new framework, MC (Metagenome-Comparison), designed to exhaustively detect and catalogue unique or enriched metagenomic genes (MGs) and their derivatives, including metagenome functional gene clusters (MFGC), or more generally, the operational metagenomic unit (OMU) that can be considered the counterpart of the OTU (operational taxonomic unit) from amplicon sequencing reads. The MC is essentially a heuristic search algorithm guided by pairs of new metrics (termed MG-specificity or OMU-specificity, MG-specificity diversity or OMU-specificity diversity). It is further constrained by statistical significance (P-value) implemented as a pair of statistical tests.
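
The abstract does not define the specificity metrics or name the two statistical tests, so the following is only a minimal sketch of the general idea of flagging unique or enriched genes under a significance constraint. It is not the MC algorithm. As stated assumptions, it substitutes per-sample presence/absence counts for the specificity metrics and a one-sided Fisher's exact test for the paper's tests; the function names `fisher_exact_greater` and `classify_genes`, the labels, and the `alpha` threshold are all hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: group-1 samples that carry / lack the gene.
    c, d: group-2 samples that carry / lack the gene.
    Returns P(at least `a` carriers fall in group 1) under the
    hypergeometric null of no association.
    """
    n1, n2 = a + b, c + d
    carriers = a + c
    total = comb(n1 + n2, carriers)
    p = 0.0
    for x in range(a, min(n1, carriers) + 1):
        p += comb(n1, x) * comb(n2, carriers - x) / total
    return p


def classify_genes(presence_1, presence_2, alpha=0.05):
    """Label each gene relative to group 1 (hypothetical stand-in, not MC).

    presence_1 / presence_2 map gene id -> list of 0/1 flags, one per sample.
    'unique'   : significant and absent from every group-2 sample.
    'enriched' : significant but present in some group-2 samples.
    'shared'   : not significantly over-represented in group 1.
    """
    labels = {}
    for gene, flags_1 in presence_1.items():
        flags_2 = presence_2[gene]
        a, c = sum(flags_1), sum(flags_2)
        b, d = len(flags_1) - a, len(flags_2) - c
        p = fisher_exact_greater(a, b, c, d)
        if p >= alpha:
            labels[gene] = "shared"
        elif c == 0:
            labels[gene] = "unique"
        else:
            labels[gene] = "enriched"
    return labels


# Toy data: 8 samples per group, three genes (made up for illustration).
group_1 = {"g1": [1] * 8, "g2": [1] * 7 + [0], "g3": [1] * 4 + [0] * 4}
group_2 = {"g1": [0] * 8, "g2": [1] + [0] * 7, "g3": [1] * 4 + [0] * 4}
labels = classify_genes(group_1, group_2)
```

In a real comparison over millions of genes, the P-values would also need a multiple-testing correction, which this sketch omits.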

RESULTS:

We evaluated the MC using large metagenomic datasets related to obesity, diabetes, and inflammatory bowel disease (IBD), and found that the proportions of unique and enriched metagenomic genes ranged from 0.001% to 0.08% and from 0.08% to 0.82%, respectively, and were less than 10% for the MFGC.

CONCLUSION:

The MC provides a robust method for comparing metagenomes at various scales, from baseline MGs to various function/pathway clusters of metagenomes, collectively termed OMUs.

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: Comput Biol Med Publication year: 2024 Document type: Article
...