Machine learning based estimation of hoarseness severity using sustained vowels.

Schraut, Tobias; Schützenberger, Anne; Arias-Vergara, Tomás; Kunduk, Melda; Echternach, Matthias; Döllinger, Michael

Affiliation

Schraut T; Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany.
Schützenberger A; Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany.
Arias-Vergara T; Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany.
Kunduk M; Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge, Louisiana 70803, USA.
Echternach M; Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Munich, Ludwig-Maximilians-Universität München, 81377 Munich, Germany.
Döllinger M; Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany.

J Acoust Soc Am; 155(1): 381-395, 2024 01 01.

Article in English | MEDLINE | ID: mdl-38240668

ABSTRACT

Auditory perceptual evaluation is considered the gold standard for assessing voice quality, but its reliability is limited by inter-rater variability and coarse rating scales. This study investigates a continuous, objective approach to evaluating hoarseness severity that combines machine learning (ML) and sustained phonation. For this purpose, 635 acoustic recordings of the sustained vowel /a/ and subjective ratings based on the roughness, breathiness, and hoarseness scale were collected from 595 subjects. A total of 50 temporal, spectral, and cepstral features were extracted from each recording and used to identify suitable ML algorithms. Using variance and correlation analysis followed by backward elimination, a subset of relevant features was selected. Recordings were classified into two levels of hoarseness, H<2 and H≥2, yielding a continuous probability score y ∈ [0,1]. An accuracy of 0.867 and a correlation of 0.805 between the model's predictions and subjective ratings were obtained using only five acoustic features and logistic regression (LR). Further examination of recordings pre- and post-treatment revealed high qualitative agreement with the change in subjectively determined hoarseness levels. Quantitatively, a moderate correlation of 0.567 was obtained. This quantitative approach to hoarseness severity estimation shows promising results and potential for improving the assessment of voice quality.
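The scoring idea in the abstract, a binary logistic regression (H<2 vs. H≥2) whose predicted probability serves as a continuous severity score y ∈ [0,1], can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the five features are unnamed placeholders, and the function names (`make_synthetic_data`, `fit_logistic_regression`, `predict_score`), the learning rate, and the epoch count are assumptions chosen only for illustration. Feature extraction and the feature selection steps are omitted.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function mapping any real z to (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_data(n=200, n_features=5, seed=0):
    # Hypothetical stand-in for five selected acoustic features per recording.
    # Label 1 plays the role of H>=2, label 0 the role of H<2.
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        mean = 1.0 if label else -1.0
        X.append([rng.gauss(mean, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


def fit_logistic_regression(X, y, lr=0.1, epochs=500):
    # Plain batch gradient descent on the mean logistic loss.
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_score(w, b, x):
    # Predicted probability of the higher hoarseness class, used as the
    # continuous severity score.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


X, y = make_synthetic_data()
w, b = fit_logistic_regression(X, y)
scores = [predict_score(w, b, x) for x in X]
# Thresholding the score at 0.5 recovers the two-class decision.
accuracy = sum((s >= 0.5) == bool(t) for s, t in zip(scores, y)) / len(y)
```

Thresholding the score gives the two-level classification, while the unthresholded value is what would be compared against the graded subjective ratings. The accuracy computed here is measured on the training data of a toy problem and says nothing about the figures reported in the abstract.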

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Hoarseness / Dysphonia Study type: Diagnostic_studies / Prognostic_studies / Qualitative_research Limits: Humans Language: En Journal: J Acoust Soc Am Year: 2024 Document type: Article Country of affiliation: Germany
