CBEA: Competitive balances for taxonomic enrichment analysis.
Nguyen, Quang P; Hoen, Anne G; Frost, H Robert.
Affiliation
  • Nguyen QP; Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth College, Hanover, New Hampshire, United States of America.
  • Hoen AG; Department of Epidemiology, Geisel School of Medicine at Dartmouth College, Hanover, New Hampshire, United States of America.
  • Frost HR; Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth College, Hanover, New Hampshire, United States of America.
PLoS Comput Biol; 18(5): e1010091, 2022 May.
Article in En | MEDLINE | ID: mdl-35584140
ABSTRACT
Research in human-associated microbiomes often involves the analysis of taxonomic count tables generated via high-throughput sequencing. It is difficult to apply statistical tools as the data is high-dimensional, sparse, and compositional. An approachable way to alleviate high-dimensionality and sparsity is to aggregate variables into pre-defined sets. Set-based analysis is ubiquitous in the genomics literature and has demonstrable impact on improving interpretability and power of downstream analysis. Unfortunately, there is a lack of sophisticated set-based analysis methods specific to microbiome taxonomic data, where current practice often employs abundance summation as a technique for aggregation. This approach prevents comparison across sets of different sizes, does not preserve inter-sample distances, and amplifies protocol bias. Here, we attempt to fill this gap with a new single-sample taxon enrichment method that uses a novel log-ratio formulation based on the competitive null hypothesis commonly used in the enrichment analysis literature. Our approach, titled competitive balances for taxonomic enrichment analysis (CBEA), generates sample-specific enrichment scores as the scaled log-ratio of the subcomposition defined by taxa within a set and the subcomposition defined by its complement. We provide sample-level significance testing by estimating an empirical null distribution of our test statistic with valid p-values. Herein, we demonstrate, using both real data applications and simulations, that CBEA controls for type I error, even under high sparsity and high inter-taxa correlation scenarios. Additionally, CBEA provides informative scores that can be inputs to downstream analyses such as prediction tasks.
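The enrichment score described in the abstract (a scaled log-ratio between the taxa inside a set and the taxa in its complement) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each subcomposition is summarized by its geometric mean and that the scaling is the usual isometric log-ratio balance factor, sqrt(p*q/(p+q)), for set sizes p and q. The function name `cbea_like_score` and the `pseudocount` used to handle zeros are hypothetical choices made for illustration only.

```python
import numpy as np


def cbea_like_score(counts, set_idx, pseudocount=1.0):
    """Per-sample scaled log-ratio of a taxon set versus its complement.

    counts      : (n_samples, n_taxa) array of non-negative counts
    set_idx     : column indices of the taxa belonging to the set
    pseudocount : ASSUMPTION, added to every count so that log(0) never occurs
    """
    counts = np.asarray(counts, dtype=float)
    in_set = np.zeros(counts.shape[1], dtype=bool)
    in_set[list(set_idx)] = True

    n_in = int(in_set.sum())
    n_out = int((~in_set).sum())
    if n_in == 0 or n_out == 0:
        raise ValueError("The set and its complement must both be non-empty.")

    log_x = np.log(counts + pseudocount)

    # The mean of logs is the log of the geometric mean (ASSUMED summary).
    log_gm_in = log_x[:, in_set].mean(axis=1)
    log_gm_out = log_x[:, ~in_set].mean(axis=1)

    # ASSUMED scaling: the balance normalization sqrt(p*q / (p + q)).
    scale = np.sqrt(n_in * n_out / (n_in + n_out))
    return scale * (log_gm_in - log_gm_out)


if __name__ == "__main__":
    toy_counts = np.array([[10, 10, 1, 1],
                           [1, 1, 1, 1]])
    # One score per sample for the set made of the first two taxa.
    print(cbea_like_score(toy_counts, set_idx=[0, 1]))
```

Because the score is a ratio of geometric means, multiplying every count in a sample by a constant leaves it unchanged when no pseudocount is added, which is the property that makes log-ratio scores suitable for compositional data. The sample-level significance test mentioned in the abstract relies on an empirical null distribution and is not sketched here.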
Subjects

Full text: 1 | Database: MEDLINE | Main subject: Microbiota | Type of study: Guideline / Prognostic_studies | Limit: Humans | Language: En | Publication year: 2022 | Document type: Article
