Enhancing speaker identification through reverberation modeling and cancelable techniques using ANNs.

Hassan, Emad S; Neyazi, Badawi; Seddeq, H S; Mahmoud, Adel Zaghloul; Oshaba, Ahmed S; El-Emary, Atef; Abd El-Samie, Fathi E

Hassan ES; Department of Electrical Engineering, College of Engineering, Jazan University, Jizan, Saudi Arabia.
Neyazi B; Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf, Egypt.
Seddeq HS; Productivity and Vocational Training Department, Ministry of Industry, Cairo, Egypt.
Mahmoud AZ; Acoustic Laboratory, Housing and Building National Research Center, Giza, Egypt.
Oshaba AS; Electronics and Communications Department, Faculty of Engineering, Zagazig University, Zagazig, Egypt.
El-Emary A; College of Engineering, Delta University for Science and Technology, Mansoura, Egypt.
Abd El-Samie FE; Department of Electrical Engineering, College of Engineering, Jazan University, Jizan, Saudi Arabia.

PLoS One; 19(2): e0294235, 2024.

Article in English | MEDLINE | ID: mdl-38354194

ABSTRACT

This paper introduces a method for improving the performance of speaker identification systems in challenging acoustic environments characterized by noise and reverberation. The method uses several feature extraction techniques, including Mel-Frequency Cepstral Coefficients (MFCCs) and discrete transforms, such as the Discrete Cosine Transform (DCT), Discrete Sine Transform (DST), and Discrete Wavelet Transform (DWT). An Artificial Neural Network (ANN) serves as the classifier. Reverberation is modeled using comb filters of varying length, and its impact on pitch frequency estimation is explored via the Auto Correlation Function (ACF). This paper also contributes to the field of cancelable speaker identification in both open and reverberant environments. The proposed method depends on comb filtering at the feature level, which deliberately distorts the MFCCs. This distortion, incorporated within a cancelable framework, serves to obscure speaker identities, rendering the system resilient to potential intruders. Three systems are presented in this work: a reverberation-affected speaker identification system, a system depending on cancelable features obtained through comb filtering, and a novel cancelable speaker identification system for reverberant environments. The findings revealed that, both with and without reverberation effects, the DWT-based features exhibited superior performance within the speaker identification system. Conversely, within the cancelable speaker identification system, the DCT-based features were the top-performing choice.
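The two signal-processing steps named in the abstract, comb-filter reverberation modeling and ACF-based pitch estimation, can be illustrated with a minimal sketch. It is not the authors' implementation: the feedforward form of the comb filter, the function names `comb_filter` and `acf_pitch`, the delay, gain, sampling rate, pitch search range, and the synthetic test tone are all assumptions chosen for illustration.

```python
import numpy as np


def comb_filter(x, delay, gain=0.5):
    """Feedforward comb filter (assumed form): y[n] = x[n] + gain * x[n - delay]."""
    x = np.asarray(x, dtype=float)
    if not 0 < delay < len(x):
        raise ValueError("delay must be between 1 and len(x) - 1")
    y = x.copy()
    # Add one delayed, attenuated copy of the input (a single echo).
    y[delay:] += gain * x[:-delay]
    return y


def acf_pitch(x, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch in Hz from the largest ACF value in the lag range
    that corresponds to [fmin, fmax] (search range is an assumption)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Keep only non-negative lags of the autocorrelation.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return fs / lag


# Synthetic voiced-like tone (hypothetical test signal): 200 Hz plus one harmonic.
fs = 8000
t = np.arange(2048) / fs
clean = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)

# Simulated reverberation: one echo 30 ms later (240 samples at 8 kHz).
reverberant = comb_filter(clean, delay=240, gain=0.6)

print("pitch estimate, clean:      ", acf_pitch(clean, fs))
print("pitch estimate, reverberant:", acf_pitch(reverberant, fs))
```

Varying `delay` gives comb filters of different lengths, in the spirit of the varying-length filters the abstract mentions. The same `comb_filter` function could in principle be applied to a vector of MFCCs rather than to the waveform to sketch the feature-level distortion used for cancelability, though the abstract does not specify how that step is parameterized.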

Subject(s)

Neural Networks, Computer; Noise; Acoustics; Wavelet Analysis

Full text: 1 | Database: MEDLINE | Main subject: Neural Networks, Computer / Noise | Type of study: Diagnostic_studies / Prognostic_studies | Language: En | Year: 2024 | Document type: Article
