Compare the performance of multiple binary classification models in microbial high-throughput sequencing datasets.

Xu, Nuohan; Zhang, Zhenyan; Shen, Yechao; Zhang, Qi; Liu, Zhen; Yu, Yitian; Wang, Yan; Lei, Chaotang; Ke, Mingjing; Qiu, Danyan; Lu, Tao; Chen, Yiling; Xiong, Juntao; Qian, Haifeng

Affiliation

Xu N; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Zhang Z; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Shen Y; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Zhang Q; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Liu Z; College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510642, PR China.
Yu Y; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Wang Y; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Lei C; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Ke M; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Qiu D; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Lu T; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China.
Chen Y; Institute of Environmental and Ecological Engineering, Guangdong University of Technology, Guangzhou, 510006, PR China.
Xiong J; College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510642, PR China.
Qian H; College of Environment, Zhejiang University of Technology, Hangzhou, Zhejiang 310032, PR China. Electronic address: hfqian@zjut.edu.cn.

Sci Total Environ ; 837: 155807, 2022 Sep 01.

Article in English | MEDLINE | ID: mdl-35537509

ABSTRACT

The development of machine learning and deep learning has provided solutions for predicting microbiota responses to environmental change from microbial high-throughput sequencing data. However, few studies have specifically compared the performance and practicality of these two types of binary classification models to identify a better algorithm for microbiota data analysis. Here, for the first time, we evaluated the performance, accuracy, and running time of binary classification models built with three machine learning methods, namely random forest (RF), support vector machine (SVM), and logistic regression (LR), and one deep learning method, the back propagation neural network (BPNN). The models were built on microbiota datasets from which low-quality variables had been removed and in which the class imbalance problem had been resolved. Additionally, we optimized the models by hyperparameter tuning. Our study demonstrated that dataset pre-processing was a necessary step in model construction. Among the four binary classification models, BPNN and RF were the most suitable for constructing microbiota binary classification models. When the four models were used to predict multiple microbial datasets, BPNN showed the highest accuracy and the most robust performance, while RF ranked second. We also constructed the optimal models by adjusting the epochs of BPNN and the n_estimators of RF six times. This evaluation of model performance provides a road map for applying artificial intelligence to the assessment of microbial ecology.
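The abstract names two pre-processing steps, removing low-quality variables and resolving class imbalance, without saying how they were carried out. The sketch below is only one plausible reading, not the authors' procedure: it assumes "low-quality" means taxa present in too few samples, and it balances classes by random oversampling. The function names, the `min_prevalence` threshold, and the toy abundance table are all hypothetical.

```python
import random
from collections import Counter


def filter_low_prevalence(samples, min_prevalence=0.2):
    """Drop feature columns that are non-zero in fewer than
    min_prevalence of the samples (assumed meaning of 'low-quality')."""
    n_samples = len(samples)
    n_features = len(samples[0])
    keep = [
        j for j in range(n_features)
        if sum(1 for row in samples if row[j] > 0) / n_samples >= min_prevalence
    ]
    filtered = [[row[j] for j in keep] for row in samples]
    return filtered, keep


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (assumed balancing strategy)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [row for row, y in zip(samples, labels) if y == cls]
        for _ in range(target - count):
            out_x.append(list(rng.choice(pool)))
            out_y.append(cls)
    return out_x, out_y


# Hypothetical abundance table: 6 samples x 4 taxa, with imbalanced labels.
X = [
    [10, 0, 3, 0],
    [8, 0, 5, 0],
    [12, 1, 0, 0],
    [9, 0, 4, 0],
    [2, 0, 7, 0],
    [1, 0, 9, 5],
]
y = [0, 0, 0, 0, 1, 1]

X_filtered, kept = filter_low_prevalence(X, min_prevalence=0.2)
X_balanced, y_balanced = random_oversample(X_filtered, y)
```

In this toy table, the second and fourth taxa each appear in only one of six samples and are dropped, and the minority class is then resampled up to the size of the majority class. Oversampling of this kind is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.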

Subjects

Artificial Intelligence; Neural Networks, Computer; Algorithms; High-Throughput Nucleotide Sequencing; Machine Learning; Support Vector Machine

Keywords

Deep learning; Ecotoxicology; Machine learning; Metadata analysis; Microbiota


Full text: 1 | Database: MEDLINE | Main subject: Artificial Intelligence / Neural Networks, Computer | Language: English | Year of publication: 2022 | Document type: Article
