Re-Training of Convolutional Neural Networks for Glottis Segmentation in Endoscopic High-Speed Videos.

Döllinger, Michael; Schraut, Tobias; Henrich, Lea A; Chhetri, Dinesh; Echternach, Matthias; Johnson, Aaron M; Kunduk, Melda; Maryn, Youri; Patel, Rita R; Samlan, Robin; Semmler, Marion; Schützenberger, Anne

Affiliation

Döllinger M; Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany.
Schraut T; Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany.
Henrich LA; Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany.
Chhetri D; Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California, Los Angeles, Los Angeles, CA 90095, USA.
Echternach M; Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), 80331 Munich, Germany.
Johnson AM; NYU Voice Center, Department of Otolaryngology-Head and Neck Surgery, New York University, Grossman School of Medicine, New York, NY 10001, USA.
Kunduk M; Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge, LA 70801, USA.
Maryn Y; Department of Speech, Language and Hearing Sciences, University of Ghent, 9000 Ghent, Belgium.
Patel RR; Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN 47401, USA.
Samlan R; Department of Speech, Language, & Hearing Sciences, University of Arizona, Tucson, AZ 85641, USA.
Semmler M; Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany.
Schützenberger A; Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany.

Appl Sci (Basel); 12(19), 2022 Oct.

Article in English | MEDLINE | ID: mdl-37583544

ABSTRACT

Endoscopic high-speed video (HSV) systems for visualization and assessment of vocal fold dynamics in the larynx are diverse and technically advancing. To account for the resulting "concept shifts" in neural network (NN)-based image processing, NNs that are already trained and in use must be re-trained so that image processing remains sufficiently accurate for new recording modalities. We propose and discuss several re-training approaches for convolutional neural networks (CNNs) used for HSV image segmentation. Our baseline CNN was trained on the BAGLS data set (58,750 images). The new BAGLS-RT data set consists of an additional 21,050 images from previously unused HSV systems, light sources, and different spatial resolutions. Results showed that increasing data diversity by means of preprocessing already improves the segmentation accuracy (mIoU + 6.35%). Subsequent re-training further increases segmentation performance (mIoU + 2.81%). For re-training, finetuning with dynamic knowledge distillation showed the most promising results. Data variety for training and additional re-training is a helpful tool to boost HSV image segmentation quality. However, when performing re-training, the phenomenon of catastrophic forgetting should be kept in mind, i.e., adaptation to new data while forgetting already learned knowledge.
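
The segmentation gains above are reported as mIoU (mean intersection over union). As a minimal sketch of how such a score could be computed, the Python below assumes that mIoU is the average of per-image IoU values for the binary glottis mask; the record does not state the exact averaging scheme. The function names `iou` and `mean_iou` and the treatment of two empty masks as a perfect match are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence

# A 2D binary mask: 1 = glottis pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumption: if both masks are empty (e.g. closed glottis and
    # nothing predicted), count the frame as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_iou(preds: List[Mask], targets: List[Mask]) -> float:
    """Average the per-image IoU over a set of frames."""
    if not preds or len(preds) != len(targets):
        raise ValueError("need equally sized, non-empty lists of masks")
    return sum(iou(p, t) for p, t in zip(preds, targets)) / len(preds)


if __name__ == "__main__":
    # Toy 2x2 frames, purely illustrative.
    prediction = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    empty = [[0, 0], [0, 0]]
    print(mean_iou([prediction, empty], [ground_truth, empty]))
```

In the toy frames, the first pair overlaps on one of two marked pixels and the second pair is empty in both masks, so the two per-image scores are averaged.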

Keywords

catastrophic forgetting; concept shifts; convolutional neural networks; finetuning; glottis; high-speed imaging; medical image segmentation; re-training; voice

Full text: 1 Databases: MEDLINE Language: English Journal: Appl Sci (Basel) Year: 2022 Document type: Article Country of affiliation: Germany