Search | Portal Regional da BVS

Development of Supervised Speaker Diarization System Based on the PyAnnote Audio Processing Library.

Khoma, Volodymyr; Khoma, Yuriy; Brydinskyi, Vitalii; Konovalov, Alexander.

Sensors (Basel) ; 23(4)2023 Feb 13.

Article in English | MEDLINE | ID: mdl-36850680

ABSTRACT

Diarization is an important task in audio data processing, as it addresses the need to divide a single analyzed call recording into several speech recordings, each of which belongs to one speaker. Diarization systems segment audio recordings by defining the time boundaries of utterances, and typically use unsupervised methods to group utterances belonging to individual speakers, but do not answer the question "who is speaking?" On the other hand, there are biometric systems that identify individuals on the basis of their voices, but such systems are designed with the prerequisite that only one speaker is present in the analyzed audio recording. However, some applications need to identify multiple speakers who interact freely in an audio recording. This paper proposes two architectures of speaker identification systems based on a combination of diarization and identification methods, which operate on the basis of segment-level or group-level classification. The open-source PyAnnote framework was used to develop the system. The performance of the speaker identification system was verified using the open-source AMI Corpus audio database, which contains 100 h of annotated and transcribed audio and video data. The research method consisted of four experiments to select the best-performing supervised diarization algorithms on the basis of PyAnnote. The first experiment was designed to investigate how the choice of distance function between embedding vectors affects the reliability of identifying a speaker's utterance in a segment-level classification architecture. The second experiment examines the architecture of cluster-centroid (group-level) classification, i.e., the selection of the best clustering and classification methods. The third experiment investigates the impact of different segmentation algorithms on the accuracy of identifying speaker utterances, and the fourth examines embedding window sizes. Experimental results demonstrated that the group-level approach offered better identification results than the segment-level approach, while the latter had the advantage of real-time processing.
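The difference between the two architectures can be illustrated with a minimal sketch. It is not the authors' implementation and does not use PyAnnote: the two-dimensional embeddings, the enrolled speakers `alice` and `bob`, the precomputed `cluster_ids`, and the choice of cosine distance are all assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two embedding vectors (0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_speaker(embedding, enrolled):
    """Return the enrolled speaker whose reference embedding is closest."""
    return min(enrolled, key=lambda name: cosine_distance(embedding, enrolled[name]))


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def segment_level(segments, enrolled):
    """Classify every segment embedding independently."""
    return [nearest_speaker(e, enrolled) for e in segments]


def group_level(segments, cluster_ids, enrolled):
    """Classify each cluster centroid once, then propagate the label."""
    clusters = {}
    for e, cid in zip(segments, cluster_ids):
        clusters.setdefault(cid, []).append(e)
    labels = {cid: nearest_speaker(centroid(vs), enrolled)
              for cid, vs in clusters.items()}
    return [labels[cid] for cid in cluster_ids]


# Hypothetical toy data: reference embeddings and four segment embeddings.
enrolled = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
segments = [[0.9, 0.1], [0.45, 0.55], [0.1, 0.9], [0.8, 0.2]]
cluster_ids = [0, 0, 1, 0]  # assumed output of an unsupervised clustering step

print(segment_level(segments, enrolled))
print(group_level(segments, cluster_ids, enrolled))
```

In this toy data the second segment is ambiguous: classified on its own it lands closer to `bob`, while the centroid of its cluster lands closer to `alice`. The segment-level path needs no clustering pass over the whole recording, which is consistent with the real-time advantage the abstract mentions.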

Subjects

Algorithms; Biometry; Humans; Reproducibility of Results; Cluster Analysis; Databases, Factual

Advanced Computing Methods for Impedance Plethysmography Data Processing.

Khoma, Volodymyr; Kenyo, Halyna; Kawala-Sterniuk, Aleksandra.

Sensors (Basel) ; 22(6)2022 Mar 08.

Article in English | MEDLINE | ID: mdl-35336269

ABSTRACT

In this paper we introduce innovative solutions in impedance plethysmography that improve the characteristics of the rheograph and increase the efficiency of rheogram processing using computer methods. The described methods were developed to ensure the stability of parameters and to extend the functionality of the rheographic system based on digital signal processing, which covers compensation of the base resistance with a digital potentiometer, digital synthesis of quadrature excitation signals, and digital synchronous detection. Emphasis was put on methods for determining hemodynamic parameters by computer processing of the rheograms. As a result, three methods for eliminating respiratory artifacts have been proposed: based on the discrete cosine transform, the discrete wavelet transform, and approximation of the zero line with spline functions. Additionally, computer methods for determining physiological indicators, including those based on wavelet decomposition, are proposed and described. The efficiency of various rheogram compression algorithms was tested and evaluated.
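As a rough illustration of digital synchronous detection with quadrature reference signals, the sketch below recovers the amplitude and phase of a sinusoid at a known excitation frequency. It is a generic textbook formulation, not the rheograph design described in the paper; the function name, the 50 Hz excitation frequency, the 1 kHz sample rate, and the test signal are assumptions chosen so that the record spans a whole number of periods.

```python
import math


def synchronous_detect(samples, freq_hz, sample_rate_hz):
    """Estimate amplitude and phase of the freq_hz component of samples.

    The input is multiplied by in-phase (sin) and quadrature (cos)
    references and averaged. The record is assumed to cover a whole
    number of excitation periods, so a constant offset averages out.
    """
    n = len(samples)
    i_sum = 0.0
    q_sum = 0.0
    for k, x in enumerate(samples):
        theta = 2.0 * math.pi * freq_hz * k / sample_rate_hz
        i_sum += x * math.sin(theta)
        q_sum += x * math.cos(theta)
    i_comp = 2.0 * i_sum / n  # equals A * cos(phi)
    q_comp = 2.0 * q_sum / n  # equals A * sin(phi)
    return math.hypot(i_comp, q_comp), math.atan2(q_comp, i_comp)


# Hypothetical test signal: constant offset plus a phase-shifted sinusoid.
FREQ_HZ = 50.0
SAMPLE_RATE_HZ = 1000.0
signal = [
    2.0 + 0.5 * math.sin(2.0 * math.pi * FREQ_HZ * k / SAMPLE_RATE_HZ + 0.3)
    for k in range(1000)  # 50 full periods
]

amplitude, phase = synchronous_detect(signal, FREQ_HZ, SAMPLE_RATE_HZ)
print(round(amplitude, 6), round(phase, 6))
```

Using both references makes the amplitude estimate independent of the unknown phase shift between excitation and response, which a single-reference detector cannot offer.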

Subjects

Data Compression; Signal Processing, Computer-Assisted; Algorithms; Plethysmography, Impedance; Wavelet Analysis

ECG Signal as Robust and Reliable Biometric Marker: Datasets and Algorithms Comparison.

Pelc, Mariusz; Khoma, Yuriy; Khoma, Volodymyr.

Sensors (Basel) ; 19(10)2019 May 22.

Article in English | MEDLINE | ID: mdl-31121807

ABSTRACT

This paper examines the possibility of using the ECG signal as an unequivocal biometric marker for authentication and identification purposes. Furthermore, since the ECG signals were acquired from 4 sources that differ in measurement equipment, electrode positioning, number of patients, and duration of the ECG recording, we additionally estimate the extent of information available in the ECG record. To provide a more objective assessment of the credibility of the identification method, selected machine learning algorithms were used in two configurations: with and without compression. The results confirm that the ECG signal can be regarded as a valid biometric marker that is very robust to hardware variations, noise, and artifacts, that is stable over time, and that is scalable across a sizable number of users (~100). Our experiments indicate that the most promising algorithms for ECG identification are LDA, KNN, and MLP. Moreover, our results show that PCA compression, used as part of data preprocessing, not only brings no noticeable benefit but in some cases might even reduce accuracy.
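Of the algorithms named above, KNN is the simplest to sketch. The example below is a generic k-nearest-neighbours identifier over feature vectors, not the pipeline evaluated in the paper; the two-dimensional features, the user labels, and `k=3` are hypothetical, and real ECG features would first have to be extracted from the recordings.

```python
import math
from collections import Counter


def knn_identify(sample, templates, k=3):
    """Identify a user by majority vote among the k closest templates.

    templates is a list of (feature_vector, user_id) pairs.
    """
    ranked = sorted(templates, key=lambda t: math.dist(sample, t[0]))
    votes = Counter(user for _, user in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical enrolled feature vectors for two users.
templates = [
    ([0.0, 0.1], "user_a"),
    ([0.1, 0.0], "user_a"),
    ([0.1, 0.1], "user_a"),
    ([1.0, 1.1], "user_b"),
    ([1.1, 1.0], "user_b"),
    ([0.9, 1.0], "user_b"),
]

print(knn_identify([0.05, 0.05], templates))
print(knn_identify([1.0, 1.0], templates))
```

An odd `k` avoids tied votes when only two users are enrolled; with more users a tie-breaking rule would be needed.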

Subjects

Algorithms; Electrocardiography; Biomarkers/analysis; Discriminant Analysis; Humans; Logistic Models; Principal Component Analysis; Signal Processing, Computer-Assisted
