A Voice Disease Detection Method Based on MFCCs and Shallow CNN.

Xie, Xiaoping; Cai, Hao; Li, Can; Wu, Yu; Ding, Fei

Affiliation

Xie X; The State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China; Shenzhen Research Institute of Hunan University, Shenzhen, China.
Cai H; The State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China. Electronic address: haocai@hnu.edu.cn.
Li C; The State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China.
Wu Y; The Department of Otolaryngology Head and Neck Surgery, Key Laboratory of Otolaryngology for Major Diseases of Hunan Province, Changsha, China.
Ding F; The State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China.

J Voice; 2023 Oct 25.

Article in En | MEDLINE | ID: mdl-37891129

ABSTRACT

The incidence of voice diseases is increasing year by year. Software-based remote diagnosis is a growing technical trend with considerable practical value. Common voice diseases that cause hoarseness include spasmodic dysphonia, vocal cord paralysis, vocal nodules, and vocal cord polyps. This paper presents a voice disease detection method that can be applied in a wide range of clinical settings. In cooperation with Xiangya Hospital of Central South University, we collected voice samples from 352 different patients. Mel-frequency cepstral coefficients (MFCCs) are extracted as input features to give a numerical description of the voice. We propose a model that combines MFCC features with a convolutional neural network (CNN) containing a single convolution layer, for fast computation and classification. The highest accuracy achieved was 92%, which surpasses the results of earlier studies and is competitive with the international state of the art. We also used the Advanced Voice Function Assessment Database (AVFAD) to evaluate the generalization ability of the proposed method, which achieved an accuracy of 98% on that dataset. Experiments on clinical and standard datasets show that, for the pathological detection of voice diseases, our method substantially improves both accuracy and computational efficiency.
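
The MFCC extraction step mentioned in the abstract can be illustrated with a minimal sketch. It follows the textbook pipeline (framing, Hamming window, power spectrum, triangular mel filterbank, logarithm, DCT-II) and uses only the Python standard library so that it runs without dependencies. The abstract does not state the authors' settings, so the frame length, hop size, number of filters, number of coefficients, and the synthetic test tone are all assumptions chosen for illustration, not the paper's configuration.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Convert a mel value back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge, peak of 1 at centre
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values))
            for c in range(n_coeffs)]


def mfcc(signal, sample_rate, frame_len=256, hop=128,
         n_filters=20, n_coeffs=13):
    """Return one list of MFCCs per frame (all defaults are assumed values)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        # Small offset avoids log(0) for filters that capture no energy.
        log_energies = [math.log(e + 1e-10) for e in energies]
        features.append(dct2(log_energies, n_coeffs))
    return features


# Synthetic 200 Hz tone standing in for a sustained vowel (not patient data).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(1024)]
features = mfcc(signal, sample_rate)
print(len(features), len(features[0]))  # 7 frames, 13 coefficients each
```

The resulting frames-by-coefficients matrix is the kind of two-dimensional input that a convolution layer can consume. In practice an optimized FFT and an audio library would replace the naive DFT used here.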

Key words

Deep learning; Disease detection; MFCCs; Pathological voice disorder

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: J Voice Journal subject: Otolaryngology Year: 2023 Document type: Article Affiliation country: China
