Microbiome Preprocessing Machine Learning Pipeline.
Jasner, Yoel; Belogolovski, Anna; Ben-Itzhak, Meirav; Koren, Omry; Louzoun, Yoram.
Affiliations
  • Jasner Y; Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel.
  • Belogolovski A; Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel.
  • Ben-Itzhak M; Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel.
  • Koren O; Azrieli Faculty of Medicine, Bar-Ilan University, Ramat Gan, Israel.
  • Louzoun Y; Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel.
Front Immunol; 12: 677870, 2021.
Article in English | MEDLINE | ID: mdl-34220823
ABSTRACT

Background:

16S sequencing results are often used for Machine Learning (ML) tasks. 16S gene sequences are represented as feature counts, which are associated with taxonomic representation. Raw feature counts may not be the optimal representation for ML.

Methods:

We checked multiple preprocessing steps and tested the optimal combination for 16S sequencing-based classification tasks. We computed the contribution of each step to the accuracy as measured by the Area Under Curve (AUC) of the classification.
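As an illustration of how the contribution of a single step could be measured, the sketch below computes the AUC directly from its rank definition (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one) and compares two sets of classifier scores. The function name `roc_auc` and the toy labels and scores are made up for this example; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical numbers, for illustration only).
labels = [0, 0, 1, 1]
scores_without_step = [0.1, 0.4, 0.35, 0.8]  # classifier trained without the step
scores_with_step = [0.1, 0.3, 0.35, 0.8]     # classifier trained with the step

auc_without = roc_auc(labels, scores_without_step)
auc_with = roc_auc(labels, scores_with_step)
contribution = auc_with - auc_without
print(auc_without, auc_with, contribution)
```

In practice the comparison would be repeated over cross-validation splits and datasets rather than read off a single pair of score vectors.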

Results:

We show that the log of the feature counts is much more informative than the relative counts. We further show that merging features associated with the same taxonomy at a given level, through a dimension reduction step for each group of bacteria, improves the AUC. Finally, we show that z-scoring has a very limited effect on the results.
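The three steps named above (log transform, merging features that share a taxonomy at a given level through a per-group dimension reduction, and z-scoring) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the MIPMLP code: the pseudocount `epsilon`, the semicolon-delimited taxonomy strings, the use of the first principal component as the dimension reduction, and all function names are choices made for this example.

```python
import math
from collections import defaultdict


def log_transform(counts, epsilon=0.1):
    """Return log10(count + epsilon) per cell; epsilon is an assumed pseudocount."""
    return [[math.log10(c + epsilon) for c in row] for row in counts]


def first_pc_scores(rows, n_iter=200):
    """Project samples (rows) onto the first principal component.

    Uses power iteration on the (unnormalized) covariance matrix.
    The sign of the returned scores is arbitrary, as in any PCA.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(r[a] * r[b] for r in centered) for b in range(p)]
           for a in range(p)]
    # Non-uniform start vector, to avoid starting orthogonal to the
    # leading eigenvector in simple symmetric cases.
    v = [1.0 + j for j in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all features constant: nothing to project
            break
        v = [x / norm for x in w]
    return [sum(r[j] * v[j] for j in range(p)) for r in centered]


def merge_by_taxonomy(table, taxonomies, level):
    """Merge feature columns sharing the first `level` taxonomy ranks.

    Each group of columns is replaced by one column: its first
    principal component scores.
    """
    groups = defaultdict(list)
    for j, tax in enumerate(taxonomies):
        groups[";".join(tax.split(";")[:level])].append(j)
    return {name: first_pc_scores([[row[j] for j in idx] for row in table])
            for name, idx in groups.items()}


def z_score(values):
    """Standardize one feature to zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [0.0] * n if sd == 0.0 else [(v - mean) / sd for v in values]


# Toy table: 4 samples x 4 features (hypothetical counts and taxonomy labels).
counts = [
    [120, 30, 0, 5],
    [80, 45, 2, 0],
    [10, 5, 40, 60],
    [0, 8, 55, 70],
]
taxonomies = [
    "k__Bacteria;p__Firmicutes;c__Clostridia",
    "k__Bacteria;p__Firmicutes;c__Bacilli",
    "k__Bacteria;p__Bacteroidetes;c__Bacteroidia",
    "k__Bacteria;p__Bacteroidetes;c__Flavobacteriia",
]

logged = log_transform(counts)
merged = merge_by_taxonomy(logged, taxonomies, level=2)  # merge at phylum level
scaled = {name: z_score(col) for name, col in merged.items()}
print(sorted(scaled))
```

Here the four features collapse into two merged features, one per phylum. Replacing `first_pc_scores` with a per-group mean or sum would give simpler merges for comparison.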

Conclusions:

The preprocessing of microbiome 16S data is crucial for optimal microbiome-based Machine Learning. These preprocessing steps are integrated into the MIPMLP (Microbiome Preprocessing Machine Learning Pipeline), which is available as a stand-alone version at https://github.com/louzounlab/microbiome/tree/master/Preprocess or as a service at http://mip-mlp.math.biu.ac.il/Home. Both contain the code and standard test sets.

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Bacteria / Colitis / Mucositis / Gastrointestinal Microbiome / Machine Learning | Type of study: Prognostic_studies | Limits: Adult / Animals / Female / Humans / Pregnancy | Language: En | Journal: Front Immunol | Year: 2021 | Document type: Article | Affiliation country: Israel