Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment.

Marcos-Zambrano, Laura Judith; Karaduzovic-Hadziabdic, Kanita; Loncar Turukalo, Tatjana; Przymus, Piotr; Trajkovik, Vladimir; Aasmets, Oliver; Berland, Magali; Gruca, Aleksandra; Hasic, Jasminka; Hron, Karel; Klammsteiner, Thomas; Kolev, Mikhail; Lahti, Leo; Lopes, Marta B; Moreno, Victor; Naskinova, Irina; Org, Elin; Paciência, Inês; Papoutsoglou, Georgios; Shigdel, Rajesh; Stres, Blaz; Vilne, Baiba; Yousef, Malik; Zdravevski, Eftim; Tsamardinos, Ioannis; Carrillo de Santa Pau, Enrique; Claesson, Marcus J; Moreno-Indias, Isabel; Truu, Jaak

Affiliation

Marcos-Zambrano LJ; Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain.
Karaduzovic-Hadziabdic K; Faculty of Engineering and Natural Sciences, International University of Sarajevo, Sarajevo, Bosnia and Herzegovina.
Loncar Turukalo T; Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia.
Przymus P; Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Torun, Poland.
Trajkovik V; Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, North Macedonia.
Aasmets O; Institute of Genomics, Estonian Genome Centre, University of Tartu, Tartu, Estonia.
Berland M; Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.
Gruca A; Université Paris-Saclay, INRAE, MGP, Jouy-en-Josas, France.
Hasic J; Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, Poland.
Hron K; University Sarajevo School of Science and Technology, Sarajevo, Bosnia and Herzegovina.
Klammsteiner T; Department of Mathematical Analysis and Applications of Mathematics, Palacký University, Olomouc, Czechia.
Kolev M; Department of Microbiology, University of Innsbruck, Innsbruck, Austria.
Lahti L; South West University "Neofit Rilski", Blagoevgrad, Bulgaria.
Lopes MB; Department of Computing, University of Turku, Turku, Finland.
Moreno V; NOVA Laboratory for Computer Science and Informatics (NOVA LINCS), FCT, UNL, Caparica, Portugal.
Naskinova I; Centro de Matemática e Aplicações (CMA), FCT, UNL, Caparica, Portugal.
Org E; Oncology Data Analytics Program, Catalan Institute of Oncology (ICO) Barcelona, Spain.
Paciência I; Colorectal Cancer Group, Institut de Recerca Biomedica de Bellvitge (IDIBELL), Barcelona, Spain.
Papoutsoglou G; Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Barcelona, Spain.
Shigdel R; Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain.
Stres B; South West University "Neofit Rilski", Blagoevgrad, Bulgaria.
Vilne B; Institute of Genomics, Estonian Genome Centre, University of Tartu, Tartu, Estonia.
Yousef M; EPIUnit - Instituto de Saúde Pública da Universidade do Porto, Porto, Portugal.
Zdravevski E; Department of Computer Science, University of Crete, Heraklion, Greece.
Tsamardinos I; Department of Clinical Science, University of Bergen, Bergen, Norway.
Carrillo de Santa Pau E; Group for Microbiology and Microbial Biotechnology, Department of Animal Science, University of Ljubljana, Ljubljana, Slovenia.
Claesson MJ; Bioinformatics Research Unit, Riga Stradins University, Riga, Latvia.
Moreno-Indias I; Department of Information Systems, Zefat Academic College, Zefat, Israel.
Truu J; Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat, Israel.

Front Microbiol; 12: 634511, 2021.

Article in En | MEDLINE | ID: mdl-33737920

ABSTRACT

The growing number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to explore in depth host-microbiome associations and their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed to exploit all the information in these biological datasets, taking into account the peculiarities of microbiome data, i.e., the compositional, heterogeneous, and sparse nature of these datasets. Predicting host phenotypes from taxonomy-informed feature selection, in order to establish associations between the microbiome and disease states and to predict those states, is beneficial for personalized medicine. In this regard, machine learning (ML) provides new insights into the development of models that can be used to predict outputs, such as classification and prediction in microbiology, to infer host phenotypes in order to predict diseases, and to use microbial communities to stratify patients by characterizing state-specific microbial signatures. Here we review the state-of-the-art ML methods and respective software applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on the application of ML in microbiome studies related to association and clinical use for diagnostics, prognostics, and therapeutics. Although the data presented here relate mostly to the bacterial community, many algorithms could be applied in general, regardless of the feature type. This literature and software review, which covers a broad topic, follows the scoping review methodology.
The manual identification of data sources was complemented with: (1) an automated publication search through the digital libraries of the three major publishers using a natural language processing (NLP) toolkit, and (2) an automated identification of relevant software repositories on GitHub and a ranking of the related research papers relying on a learning-to-rank approach.
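
As an illustration of what the compositional and sparse nature of microbiome data can mean in practice, the following minimal sketch applies a centered log-ratio (CLR) transform to one sample of taxon counts. The abstract does not name this transform; CLR, the function name `clr_transform`, the pseudocount of 0.5, and the example counts are all assumptions chosen for illustration, not the authors' method.

```python
import math


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio (CLR) transform of one sample's taxon counts.

    The pseudocount is added to every count so that zeros, which are
    common in sparse microbiome tables, do not lead to log(0).
    The default of 0.5 is an illustrative assumption.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log expresses each taxon relative to the
    # geometric mean of the sample.
    return [value - mean_log for value in logs]


# Hypothetical read counts for five taxa in a single sample.
sample = [10, 0, 40, 50, 0]
clr_values = clr_transform(sample)
print([round(v, 3) for v in clr_values])
```

CLR values of a sample sum to zero, and for samples without zeros (pseudocount set to 0) the result does not change when all counts are multiplied by the same sequencing-depth factor. This is one common way to make relative-abundance data usable by general-purpose ML methods.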

Keywords

biomarker identification; disease prediction; feature selection; machine learning; microbiome

Full text: 1 Collections: 01-international Database: MEDLINE Study type: Diagnostic_studies / Guideline / Prognostic_studies / Risk_factors_studies Language: En Publication year: 2021 Document type: Article
