Augmentation strategies for an imbalanced learning problem on a novel COVID-19 severity dataset.

Schaudt, Daniel; von Schwerin, Reinhold; Hafner, Alexander; Riedel, Pascal; Reichert, Manfred; von Schwerin, Marianne; Beer, Meinrad; Kloth, Christopher

Affiliations

Schaudt D; Department of Computer Science, Ulm University of Applied Science, Albert-Einstein-Allee 55, 89081, Ulm, Baden-Wurttemberg, Germany. daniel.schaudt@thu.de.
von Schwerin R; Department of Computer Science, Ulm University of Applied Science, Albert-Einstein-Allee 55, 89081, Ulm, Baden-Wurttemberg, Germany.
Hafner A; Department of Computer Science, Ulm University of Applied Science, Albert-Einstein-Allee 55, 89081, Ulm, Baden-Wurttemberg, Germany.
Riedel P; Department of Computer Science, Ulm University of Applied Science, Albert-Einstein-Allee 55, 89081, Ulm, Baden-Wurttemberg, Germany.
Reichert M; Institute of Databases and Information Systems, Ulm University, James-Franck-Ring, 89081, Ulm, Baden-Wurttemberg, Germany.
von Schwerin M; Department of Computer Science, Ulm University of Applied Science, Albert-Einstein-Allee 55, 89081, Ulm, Baden-Wurttemberg, Germany.
Beer M; Department of Radiology, University Hospital of Ulm, Albert-Einstein-Allee 23, 89081, Ulm, Baden-Wurttemberg, Germany.
Kloth C; Department of Radiology, University Hospital of Ulm, Albert-Einstein-Allee 23, 89081, Ulm, Baden-Wurttemberg, Germany.

Sci Rep; 13(1): 18299, 2023 Oct 25.

Article in English | MEDLINE | ID: mdl-37880333

ABSTRACT

Since the beginning of the COVID-19 pandemic, many different machine learning models have been developed to detect and verify COVID-19 pneumonia based on chest X-ray images. Although promising, binary models have only limited implications for medical treatment, whereas the prediction of disease severity suggests more suitable and specific treatment options. In this study, we publish severity scores for the 2358 COVID-19 positive images in the COVIDx8B dataset, creating one of the largest collections of publicly available COVID-19 severity data. Furthermore, we train and evaluate deep learning models on the newly created dataset to provide a first benchmark for the severity classification task. One of the main challenges of this dataset is the skewed class distribution, resulting in undesirable model performance for the most severe cases. We therefore propose and examine different augmentation strategies, specifically targeting majority and minority classes. Our augmentation strategies show significant improvements in precision and recall values for the rare and most severe cases. While the models might not yet fulfill medical requirements, they serve as an appropriate starting point for further research with the proposed dataset to optimize clinical resource allocation and treatment.
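As a rough illustration of what class-targeted augmentation can mean on a skewed label distribution, the sketch below computes how many augmented copies each original image would need so that every class approaches the size of the largest one. It is only one possible reading of the idea and is not the authors' method: the function name `augmentation_factors`, the class names, and the toy counts are all hypothetical.

```python
from collections import Counter
import math


def augmentation_factors(labels, target=None):
    """Return, per class, the number of augmented copies to create for each
    original sample so that the class reaches at least `target` samples.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    counts = Counter(labels)
    if target is None:
        # Default: bring every class up to the size of the largest class.
        target = max(counts.values())
    # A class with n samples needs ceil(target / n) - 1 extra copies per sample;
    # classes already at or above the target get 0.
    return {cls: max(0, math.ceil(target / n) - 1) for cls, n in counts.items()}


# Toy, made-up label distribution with a rare "severe" class.
labels = ["mild"] * 6 + ["moderate"] * 3 + ["severe"] * 1
factors = augmentation_factors(labels)
print(factors)
```

In this toy setup the rare class receives the most augmented copies per image while the largest class receives none. A real pipeline would additionally have to decide which image transformations are medically plausible for chest X-rays, which this sketch does not address.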

Subjects

COVID-19; Pandemics; Humans; Benchmarking; Machine Learning; Mental Recall

Full text: 1 | Database: MEDLINE | Main subject: Pandemics / COVID-19 | Language: English | Publication year: 2023 | Document type: Article