Evaluating the Diagnostic Potential of Connected Speech for Benign Laryngeal Disease Using Deep Learning Analysis.

Lee, Jeong Hoon; Seok, Jungirl; Kim, Jae Yeong; Kim, Hee Chan; Kwon, Tack-Kyun

Affiliations

Lee JH; Department of Radiology, Stanford University School of Medicine, Stanford, California.
Seok J; Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea; Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul, Republic of Korea.
Kim JY; Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea.
Kim HC; Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Republic of Korea.
Kwon TK; Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea; Department of Otorhinolaryngology-Head and Neck Surgery, Boramae Medical Center, Seoul, Republic of Korea. Electronic address: kwontk@snu.ac.kr.

J Voice; 2024 Feb 12.

Article in English | MEDLINE | ID: mdl-38350806

ABSTRACT

OBJECTIVES:

This study aimed to evaluate the performance of artificial intelligence (AI) models using connected speech and vowel sounds in detecting benign laryngeal diseases.

STUDY DESIGN:

Retrospective.

METHODS:

Voice samples from 772 patients, including 502 with normal voices and 270 with vocal cord polyps, cysts, or nodules, were analyzed. We employed deep learning architectures, including convolutional neural networks (CNNs) and time series models, to process the speech data. The primary endpoint was the area under the receiver operating characteristic curve for binary classification.
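
For readers unfamiliar with the primary endpoint, the following is a minimal sketch of how an area under the receiver operating characteristic curve can be computed for a binary classifier. It uses the pairwise (Mann-Whitney) formulation: the fraction of disease/normal pairs in which the disease case receives the higher score, with ties counted as half. The function name `auroc` and the labels and scores are illustrative assumptions, not the study's code or data.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0 (normal) or 1 (disease).
    scores: model-assigned disease probabilities, same order as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # disease case ranked above normal case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical example: 3 disease cases, 4 normal cases (not study data).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(auroc(labels, scores))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.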

RESULTS:

CNN models analyzing speech segments significantly outperformed those using vowel sounds in distinguishing patients with and without benign laryngeal diseases. The best-performing CNN model achieved areas under the receiver operating characteristic curve of 0.895 and 0.845 for speech and vowel sounds, respectively. Correlations between AI-generated disease probabilities and perceptual assessments were more pronounced in the connected-speech analyses. However, the time series models performed worse than the CNNs.

CONCLUSION:

Connected speech analysis is more effective than traditional vowel sound analysis for the diagnosis of laryngeal voice disorders. This study highlights the potential of AI technologies to enhance the diagnostic capabilities of speech and supports further exploration and validation in this field.

Keywords

Artificial intelligence; Connected speech; Convolutional neural networks; Laryngeal voice disorders; Vowel sound

Full text: 1 Collections: 01-international Database: MEDLINE Study type: Diagnostic_studies / Prognostic_studies Language: En Journal: J Voice Journal subject: OTORHINOLARYNGOLOGY Publication year: 2024 Document type: Article
