MKELM based multi-classification model for foreign accent identification.

Kashif, Kaleem; Alwan, Abeer; Wu, Yizhi; De Nardis, Luca; Di Benedetto, Maria-Gabriella

Affiliations

Kashif K; Department of Information Engineering, Electronics and Telecommunication, Sapienza University Rome, Rome, 00184, Italy.
Alwan A; Electrical and Computer Engineering Department, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Wu Y; Information Science & Technology, Donghua University, Shanghai, 201620, PR China.
De Nardis L; Department of Information Engineering, Electronics and Telecommunication, Sapienza University Rome, Rome, 00184, Italy.
Di Benedetto MG; Department of Information Engineering, Electronics and Telecommunication, Sapienza University Rome, Rome, 00184, Italy.

Heliyon ; 10(16): e36460, 2024 Aug 30.

Article in English | MEDLINE | ID: mdl-39262941

ABSTRACT

Automatic identification of foreign accents can play a crucial role in speech systems such as speaker identification, e-learning, and telephone banking, and it can greatly improve the robustness of Automatic Speech Recognition (ASR) systems. Non-native accents in speech signals are characterized by the speaker's distinct pronunciation, prosody, and voice characteristics. However, identifying foreign accents automatically is challenging, particularly in multi-class modeling. Multi-classification models struggle to achieve high performance and face computational challenges when confronted with multi-dimensional and unbalanced datasets, such as those with more than two accents. The choice of features also remains a bottleneck for Foreign Accent Identification (FAID), further hindering performance. Consequently, the accuracy of current systems is typically low. To address these challenges, this paper proposes a framework based on the Multi-Kernel Extreme Learning Machine (MKELM) model for multi-class FAID. The MKELM model uses a novel weighting scheme to classify non-native English accents, including Arabic, Chinese, Korean, French, and Spanish. The model first combines Mel-frequency cepstral coefficients (MFCCs) and prosodic features as input, then trains pairwise binary classifiers independently, and finally applies a weighting scheme to distinguish between classes and identify accents. In experiments, the proposed model achieves an accuracy of 84.72% using the paired weighting scheme, whereas accuracy drops to 66.5% with the traditional non-weighted multi-classification scheme. A comparison with other models shows significant advantages of the proposed model in FAID multi-class classification: improved accuracy, reduced computational complexity (fewer computations, faster learning, and shorter training time), and enhanced stability compared to state-of-the-art classification methods.
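
The pipeline summarized in the abstract (combined feature vectors, independently trained pairwise binary classifiers, and a weighted combination of their outputs) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the kernel mix (RBF plus linear), the parameters `gamma`, `mix`, and `C`, the helper names, and above all the weighting rule, which simply weights each pairwise vote by the magnitude of its decision value. The synthetic vectors stand in for concatenated MFCC and prosodic features.

```python
import itertools
import numpy as np


def multi_kernel(A, B, gamma=0.5, mix=0.5):
    """Convex combination of an RBF and a linear kernel (assumed kernel choice)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    rbf = np.exp(-gamma * np.maximum(sq, 0.0))
    return mix * rbf + (1.0 - mix) * (A @ B.T)


class PairwiseKELM:
    """Kernel ELM for one class pair: beta = (I/C + K)^-1 t, targets in {-1, +1}."""

    def __init__(self, C=10.0):
        self.C = C

    def fit(self, X, t):
        self.X = X
        K = multi_kernel(X, X)
        self.beta = np.linalg.solve(np.eye(len(X)) / self.C + K, t)
        return self

    def decision(self, X):
        return multi_kernel(X, self.X) @ self.beta


def fit_pairwise(X, y, C=10.0):
    """Train one binary classifier per unordered class pair, independently."""
    models = {}
    for a, b in itertools.combinations(np.unique(y), 2):
        mask = (y == a) | (y == b)
        t = np.where(y[mask] == a, 1.0, -1.0)
        models[(a, b)] = PairwiseKELM(C).fit(X[mask], t)
    return models


def predict(models, X, classes, weighted=True):
    """Combine pairwise votes; the |decision value| weighting is hypothetical."""
    index = {c: i for i, c in enumerate(classes)}
    scores = np.zeros((len(X), len(classes)))
    for (a, b), model in models.items():
        d = model.decision(X)
        w = np.abs(d) if weighted else np.ones_like(d)
        a_wins = d >= 0
        scores[a_wins, index[a]] += w[a_wins]
        scores[~a_wins, index[b]] += w[~a_wins]
    return np.asarray(classes)[scores.argmax(1)]


# Synthetic stand-in for concatenated MFCC + prosodic feature vectors.
rng = np.random.default_rng(0)
centers = 4.0 * np.eye(3, 4)  # three synthetic "accent" classes in 4-D
y = np.repeat(np.arange(3), 30)
X = centers[y] + 0.5 * rng.standard_normal((90, 4))

models = fit_pairwise(X, y)
classes = np.unique(y)
pred_weighted = predict(models, X, classes, weighted=True)
pred_plain = predict(models, X, classes, weighted=False)
print("weighted training accuracy:", (pred_weighted == y).mean())
print("unweighted training accuracy:", (pred_plain == y).mean())
```

Setting `weighted=False` reduces the combination step to plain majority voting over the pairwise classifiers, which is the kind of non-weighted baseline the abstract contrasts against. On this easy synthetic data the two variants are not expected to differ much; the sketch only shows where a weighting rule plugs in, not the accuracy gap reported in the paper.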

Keywords

Foreign accent identification (FAID); Multi-kernel extreme learning machine (MKELM); Weighted classification scheme (WCS)

Full text: 1 | Collections: 01-internacional | Database: MEDLINE | Language: En | Publication year: 2024 | Document type: Article
