Lightweight Visual Transformers Outperform Convolutional Neural Networks for Gram-Stained Image Classification: An Empirical Study.

Kim, Hee E; Maros, Mate E; Miethke, Thomas; Kittel, Maximilian; Siegel, Fabian; Ganslandt, Thomas

Affiliation

Kim HE; Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany.
Maros ME; Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany.
Miethke T; Institute of Medical Microbiology and Hygiene, Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany.
Kittel M; Institute for Clinical Chemistry, Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany.
Siegel F; Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany.
Ganslandt T; Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany.

Biomedicines; 11(5), 2023 Apr 30.

Article in English | MEDLINE | ID: mdl-37239004

ABSTRACT

We aimed to automate Gram-stain analysis to speed up the detection of bacterial strains in patients suffering from infections. We compared visual transformers (VT) across several configurations: model size (small vs. large), training epochs (1 vs. 100), and quantization scheme (tensor- or channel-wise), using float32 or int8, on a publicly available dataset (DIBaS, n = 660) and a locally compiled dataset (n = 8500). Six VT models (BEiT, DeiT, MobileViT, PoolFormer, Swin and ViT) were evaluated and compared to two convolutional neural networks (CNN), ResNet and ConvNeXT. An overview of performance, including accuracy, inference time and model size, was also visualized. Frames per second (FPS) of small models consistently surpassed their large counterparts by a factor of 1-2×. DeiT small was the fastest VT in int8 configuration (6.0 FPS). In conclusion, VTs consistently outperformed CNNs for Gram-stain classification in most settings, even on smaller datasets.
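
The abstract names tensor-wise and channel-wise int8 quantization but does not describe how either was implemented. As a rough illustration only, the sketch below assumes symmetric quantization with a scale chosen so that the largest magnitude maps to 127; the function names and the toy weight matrix are hypothetical and are not taken from the study.

```python
def scale_for(values):
    """Symmetric scale so that the largest magnitude maps to 127 (assumed scheme)."""
    peak = max(abs(v) for v in values)
    return peak / 127 if peak > 0 else 1.0


def quantize(values, scale):
    """Round floats to int8 codes, clamped to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def per_tensor(weights):
    """Tensor-wise: a single scale shared by every row (output channel)."""
    s = scale_for([v for row in weights for v in row])
    return [quantize(row, s) for row in weights], [s] * len(weights)


def per_channel(weights):
    """Channel-wise: each row (output channel) gets its own scale."""
    scales = [scale_for(row) for row in weights]
    codes = [quantize(row, s) for row, s in zip(weights, scales)]
    return codes, scales


def row_errors(weights, codes, scales):
    """Largest absolute round-trip error per row after dequantization."""
    return [
        max(abs(w - c * s) for w, c in zip(w_row, c_row))
        for w_row, c_row, s in zip(weights, codes, scales)
    ]


# Hypothetical 2-channel weight matrix whose channels differ in magnitude.
weights = [[0.01, -0.02, 0.013], [1.0, -2.0, 1.3]]

pt_codes, pt_scales = per_tensor(weights)
pc_codes, pc_scales = per_channel(weights)

pt_err = row_errors(weights, pt_codes, pt_scales)
pc_err = row_errors(weights, pc_codes, pc_scales)
```

In this toy case the small-magnitude channel is reconstructed more accurately with its own scale (`pc_err[0]` is smaller than `pt_err[0]`), which is the usual motivation for channel-wise schemes; whether that carries over to accuracy on Gram-stain images is an empirical question the study addresses.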

Keywords

Gram-stain analysis; classification; convolutional neural network; deep learning; quantization; vision transformer

Full text: 1 Database: MEDLINE Study type: Prognostic_studies Language: En Journal: Biomedicines Publication year: 2023 Document type: Article Affiliation country: Germany