Results 1 - 10 of 10
1.
J Voice ; 27(1): 11-23, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23146720

ABSTRACT

OBJECTIVES: This article presents a comparative study of the spectral power distribution of normal and dysphonic voices, for both sustained vowels and running speech. The objective was to find robust cues of dysphonia in the spectral domain. For this purpose, recordings from two databases are processed, one of them including both sustained vowels and running speech. Additionally, a new measure of stability, the decorrelation time, is introduced, and its application to the power spectrum is tested as a cue of dysphonia. MATERIALS AND METHODS: The spectral analysis takes both an auditory model and the filterbank approach as references for the computation of discrete spectrograms. Results are obtained from three sets of recordings belonging to two different databases. RESULTS: The reported results indicate that only minor differences exist in the shape of the power spectrum of normal and dysphonic voices during sustained vowel phonation. However, the calculated band power decorrelation times indicate that power in bands between 2000 and 6400 Hz is significantly less stable in dysphonic voices. For running speech, the stability of spectral power is a weaker indicator of dysphonia, but there is a significant difference between normal and dysphonic voices in the power level of high-frequency bands (above 5300 Hz). By the Nyquist criterion, this means that sampling rates above 10.6 ksps are needed for assessing running speech in the spectral domain. The results involving decorrelation times also indicate that, for short-time spectral analysis, frame rates above 100 frames/s should be preferred.
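The sampling-rate figure in this abstract follows from doubling the highest frequency of interest (2 x 5300 Hz = 10.6 kHz). The abstract does not define the decorrelation time, so the sketch below is only one plausible reading, under a loudly stated assumption: it takes the decorrelation time to be the first lag at which the normalized autocorrelation of a band-power sequence falls below 1/e. The function names, the 1/e threshold, and the toy sequences are all hypothetical and are not taken from the article.

```python
import numpy as np

def min_sampling_rate(f_max_hz):
    """Nyquist criterion: the sampling rate must exceed twice the highest frequency."""
    return 2.0 * f_max_hz

def decorrelation_time(band_power, frame_rate_hz, threshold=1.0 / np.e):
    """ASSUMED definition (not from the article): time of the first lag at which
    the normalized autocorrelation of the band-power sequence drops below threshold."""
    x = np.asarray(band_power, dtype=float)
    x = x - x.mean()                              # remove the mean power level
    n = len(x)
    acf = np.correlate(x, x, mode="full")[n - 1:] # lags 0 .. n-1
    if acf[0] == 0.0:
        return float("inf")                       # constant sequence never decorrelates
    acf = acf / acf[0]                            # normalize so acf[0] == 1
    below = np.nonzero(acf < threshold)[0]
    if below.size == 0:
        return float("inf")
    return below[0] / frame_rate_hz               # convert lag (frames) to seconds

print(min_sampling_rate(5300.0))                  # lower bound in samples per second
```

Under this assumed definition, a band whose power fluctuates rapidly from frame to frame yields a short decorrelation time, which would be consistent with the abstract's description of "less stable" power in dysphonic voices.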


Subjects
Dysphonia/diagnosis; Speech Acoustics; Adult; Algorithms; Feasibility Studies; Female; Humans; Male; Middle Aged; Young Adult
2.
IEEE Trans Biomed Eng ; 58(2): 370-9, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21257362

ABSTRACT

This paper proposes a new approach to increase the amount of information extracted from speech, with the aim of improving the accuracy of a system developed for the automatic detection of pathological voices. The paper addresses the discrimination capabilities of 11 features extracted using nonlinear analysis of time series. Two of these features are based on conventional nonlinear statistics (largest Lyapunov exponent and correlation dimension), two are based on recurrence and fractal-scaling analysis, and the remaining seven are based on different estimations of the entropy. Moreover, this paper uses a classifier-combination strategy to fuse the nonlinear analysis with the information provided by classic parameterization approaches found in the literature (noise parameters and mel-frequency cepstral coefficients). The classification was carried out in two steps using, first, a generative and, later, a discriminative approach. Combining both classifiers, the best accuracy obtained is 98.23% ± 0.001.


Subjects
Artificial Intelligence; Signal Processing, Computer-Assisted; Voice Disorders/diagnosis; Algorithms; Humans; Markov Chains; Nonlinear Dynamics; Normal Distribution; Sound Spectrography/methods; Voice Disorders/physiopathology
3.
Logoped Phoniatr Vocol ; 36(2): 52-9, 2011 Jul.
Article in English | MEDLINE | ID: mdl-20849245

ABSTRACT

In this paper, the authors report on an experiment on the automatic labelling of perceived voice roughness (R) and breathiness (B), according to the GRBAS scale. The main objective of the experiment was not to correlate objective measures with perceived R and B, but to evaluate R and B automatically. For this purpose, a system was built that extracts the first mel-frequency cepstral coefficients (MFCC) of the available sustained vowel phonations. A classifier was then trained to estimate the corresponding degrees of roughness and breathiness. The results reveal a significant correlation between subjective and automatic labelling, indicating the feasibility of objective evaluation of voice quality by means of perceptually meaningful measures.


Subjects
Signal Processing, Computer-Assisted; Speech Production Measurement; Voice Disorders/diagnosis; Voice Quality; Adult; Algorithms; Automation; Databases as Topic; Feasibility Studies; Female; Fourier Analysis; Humans; Male; Middle Aged; Pattern Recognition, Automated; Phonation; Predictive Value of Tests; Sound Spectrography; Speech Acoustics; Voice Disorders/physiopathology
4.
J Voice ; 24(1): 47-56, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19135854

ABSTRACT

This paper evaluates the capabilities of the Glottal to Noise Excitation Ratio for the screening of voice disorders. Much effort has gone into using this parameter to evaluate voice quality, but no studies have evaluated its capability to discriminate between normal and pathological voices, nor have any previous studies reported normative values that could be used for screening purposes. A set of 226 speakers (53 normal and 173 pathological) taken from a voice disorders database was used to evaluate the usefulness of this parameter for discriminating normal and pathological voices. To evaluate this parameter, the effects of the bandwidth of the Hilbert envelopes and of the frequency shift were analyzed, concluding that good discrimination is obtained with a bandwidth of 1000 Hz and a frequency shift of 300 Hz. The results confirm that the Glottal to Noise Excitation Ratio provides reliable measurements for discriminating between normal and pathological voices, comparable to other classical long-term noise measurements found in the literature, such as Normalized Noise Energy or Harmonics to Noise Ratio, so this parameter can be considered a good choice for screening purposes.


Subjects
Glottis/physiopathology; Noise; Speech Acoustics; Voice Disorders/diagnosis; Voice Disorders/physiopathology; Adult; Algorithms; Area Under Curve; Databases as Topic; Female; Humans; Male; Middle Aged; ROC Curve; Sex Characteristics; Voice; Young Adult
5.
Article in English | MEDLINE | ID: mdl-19965158

ABSTRACT

In this work, an entropy-based nonlinear analysis of pathological voices is presented. The complexity analysis is carried out by means of six different entropies, including three measures derived from the entropy rate of Markov chains. The aim is to characterize the divergence of the trajectories and their directions in the state space of Markov chains. By employing these measures in conjunction with conventional entropy features, it is possible to improve the discrimination capabilities of the nonlinear analysis in the automatic detection of pathological voices.
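As background for the entropy rate mentioned in this abstract, the following minimal sketch computes the standard entropy rate of a stationary Markov chain, H = -sum_i pi_i sum_j P_ij log2 P_ij, where pi is the stationary distribution of the transition matrix P. This is the textbook quantity only; the three measures the authors derive from it are not described in the abstract and are not reproduced here. The function names and the example matrices are illustrative assumptions, and the chain is assumed to be irreducible so that the stationary distribution is unique.

```python
import numpy as np

def stationary_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalized to sum to 1.
    Assumes an irreducible chain (unique stationary distribution)."""
    P = np.asarray(P, dtype=float)
    vals, vecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(vals - 1.0))      # eigenvalue closest to 1
    pi = np.real(vecs[:, k])
    return pi / pi.sum()

def entropy_rate(P):
    """Entropy rate in bits per step: -sum_i pi_i sum_j P_ij log2 P_ij."""
    P = np.asarray(P, dtype=float)
    pi = stationary_distribution(P)
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log2(P), 0.0)   # treat 0 * log 0 as 0
    return float(-np.sum(pi[:, None] * P * logP))

# A chain that picks the next state uniformly at random is maximally unpredictable;
# a chain that alternates deterministically carries no new information per step.
print(entropy_rate([[0.5, 0.5], [0.5, 0.5]]))
print(entropy_rate([[0.0, 1.0], [1.0, 0.0]]))
```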


Subjects
Pattern Recognition, Automated/methods; Signal Processing, Computer-Assisted; Voice Disorders/physiopathology; Voice; Acoustics; Algorithms; Automation; Biomedical Engineering/methods; Entropy; Humans; Markov Chains; Models, Statistical; ROC Curve; Time Factors; Voice Disorders/diagnosis
6.
Comput Med Imaging Graph ; 32(3): 193-201, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18243657

ABSTRACT

The present work describes a new method for the automatic detection of the glottal space from laryngeal images obtained with either high-speed or conventional video cameras attached to a laryngoscope. The detection is based on the combination of several relevant techniques in the field of digital image processing. The image is segmented with a watershed transform followed by region merging, while the final decision is taken using a simple linear predictor. This scheme has successfully segmented the glottal space in all the test images used. The method can be considered a generalist approach to the segmentation of the glottal space because, in contrast with other methods found in the literature, it needs neither initialization nor strict environmental conditions extracted from the images to be processed. Therefore, the main advantage is that the user does not have to outline the region of interest with a mouse click. Some a priori knowledge about the glottal space is still needed, but this knowledge can be considered weak compared to the environmental conditions fixed in former works.


Subjects
Glottis/pathology; Image Processing, Computer-Assisted/methods; Laryngoscopes; Larynx/pathology; Video Recording; Algorithms; Automation; Humans
7.
Eur Arch Otorhinolaryngol ; 265(4): 465-76, 2008 Apr.
Article in English | MEDLINE | ID: mdl-17922287

ABSTRACT

In this study, two different tools developed for the parametric extraction and acoustic analysis of voice samples are compared. The main goal of the paper is to contrast the results obtained using the classical Multi Dimensional Voice Program (MDVP) with the results obtained with the novel WPCVox. The aim of this comparison was to find differences and similarities in the parameters extracted with both systems, in order to allow comparison of measurements and data transfer between the two systems. The study was carried out in two stages. In the first, a wide sample of healthy voices belonging to Spanish-speaking adults of both genders was used to carry out a direct comparison between the results given by MDVP and those obtained with WPCVox. In the second stage, a sample of 226 speakers (53 normal and 173 pathological) taken from a commercially available database of voice disorders was used to demonstrate the usefulness of WPCVox for the acoustic analysis and characterization of normal and pathological voices. The results show that WPCVox provides very reliable measurements, which are very similar to those obtained using MDVP, and very similar capabilities to discriminate between normal and pathological voices.


Subjects
Acoustics/instrumentation; Voice Disorders/diagnosis; Voice Quality/physiology; Adolescent; Adult; Aged; Child; Equipment Design; Female; Humans; Male; Middle Aged; Reproducibility of Results; Voice Disorders/physiopathology
8.
IEEE Trans Biomed Eng ; 55(12): 2831-5, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19126465

ABSTRACT

This paper investigates the performance of an automatic system for voice pathology detection when the voice samples have been compressed in MP3 format at different bit rates (160, 96, 64, 48, 24, and 8 kb/s). The detectors employ cepstral and noise measurements, along with their derivatives, to characterize the voice signals. The classification is performed using Gaussian mixture models and support vector machines. The results of the different proposed detectors are compared by means of detection error tradeoff (DET) and receiver operating characteristic (ROC) curves, concluding that there are no significant differences in the performance of the detector when the bit rates of the compressed data are above 64 kb/s. This has useful applications in telemedicine, such as reducing the storage space of voice recordings or transmitting them over narrow-band communications channels.
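The DET and ROC curves named in this abstract are both built from the same pair of error rates swept over a decision threshold: the ROC curve plots true-positive rate against false-positive rate, while the DET curve plots miss rate against false-alarm rate. The sketch below is a generic illustration with hypothetical function names and made-up scores; it does not reproduce the detectors or the data of the paper. It assumes label 1 means pathological and that a higher score means "more likely pathological".

```python
import numpy as np

def error_rates(scores, labels):
    """Sweep a threshold over the observed scores.
    Returns thresholds, false-positive rates and false-negative (miss) rates."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    thresholds = np.unique(scores)[::-1]          # from strictest to most lenient
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    fpr, fnr = [], []
    for t in thresholds:
        decided_pos = scores >= t
        fpr.append(np.sum(decided_pos & (labels == 0)) / n_neg)
        fnr.append(np.sum(~decided_pos & (labels == 1)) / n_pos)
    return thresholds, np.array(fpr), np.array(fnr)

def roc_auc(fpr, fnr):
    """Area under the ROC curve (TPR = 1 - FNR) by the trapezoidal rule."""
    x = np.concatenate(([0.0], fpr))              # start the curve at (0, 0)
    y = np.concatenate(([0.0], 1.0 - fnr))
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Hypothetical detector scores for two pathological and two normal voices.
_, fpr, fnr = error_rates([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
print(roc_auc(fpr, fnr))
```

Plotting `fnr` against `fpr` (conventionally on normal-deviate axes) gives the DET view of the same detector; plotting `1 - fnr` against `fpr` gives the ROC view.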


Subjects
Artifacts; Data Compression/methods; Sound Spectrography/methods; Speech Acoustics; Voice Disorders/diagnosis; Artificial Intelligence; Fourier Analysis; Humans; Multimedia; Normal Distribution; Pattern Recognition, Automated/methods; ROC Curve; Voice; Voice Disorders/physiopathology; Voice Quality
9.
Med Eng Phys ; 28(3): 276-89, 2006 Apr.
Article in English | MEDLINE | ID: mdl-15950513

ABSTRACT

A PC-based integrated aid tool has been developed for the analysis and screening of pathological voices. With it, the user can simultaneously record speech, electroglottographic (EGG), and videoendoscopic signals, and synchronously edit them to select the most significant segments. These multimedia data are stored in a relational database, together with the patient's personal information, anamnesis, diagnosis, visits, explorations and any other comment the specialist may wish to include. The speech and EGG waveforms are analysed by means of temporal representations, and quantitative measurements such as spectrograms, frequency and amplitude perturbation measurements, harmonic energy, noise, etc. are calculated using digital signal processing techniques, giving an idea of the degree of hoarseness and quality of the voice register. Within this framework, the system uses a standard protocol to evaluate and build complete databases of voice disorders. The target users of this system are speech and language therapists and ear, nose and throat (ENT) clinicians. The application can be easily configured to cover the needs of both groups of professionals. The software has a user-friendly Windows-style interface. The PC should be equipped with standard sound and video capture cards. Signals are captured using common transducers: a microphone, an electroglottograph and a fiberscope or telelaryngoscope. The clinical usefulness of the system is addressed in a comprehensive evaluation section.


Subjects
Diagnosis, Computer-Assisted/methods; Laryngoscopy/methods; Medical Records Systems, Computerized; Software; Sound Spectrography/methods; User-Computer Interface; Voice Disorders/diagnosis; Computer Graphics; Database Management Systems; Electroencephalography/methods; Information Storage and Retrieval/methods; Software Design; Systems Integration; Telemedicine/methods
10.
Conf Proc IEEE Eng Med Biol Soc ; 2006: 2478-81, 2006.
Article in English | MEDLINE | ID: mdl-17946516

ABSTRACT

Nowadays, the most widespread techniques for measuring voice quality are based on perceptual evaluation by well-trained professionals. The GRBAS scale is a widely used method for perceptual evaluation of voice quality; it is common in Japan, and there is increasing interest in both Europe and the United States. However, this technique needs well-trained experts and relies on the evaluator's expertise, depending heavily on the evaluator's own psycho-physical state. Furthermore, great variability is observed in the assessments from one evaluator to another. Therefore, an objective method to provide such a measurement of voice quality would be very valuable. In this paper, the automatic assessment of voice quality is addressed by means of short-term mel-frequency cepstral coefficients (MFCC) and learning vector quantization (LVQ) in a pattern recognition stage. Results show that this approach provides acceptable results for this purpose, with accuracy around 65% at best.
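To make the learning vector quantization (LVQ) stage mentioned in this abstract more concrete, here is a minimal sketch of the basic LVQ1 update rule: each training vector pulls its nearest prototype closer when their labels agree and pushes it away when they differ. The abstract does not say which LVQ variant, learning rate, or number of prototypes was used, so every such choice below is an assumption; the two-dimensional toy points merely stand in for MFCC feature vectors, and the function names are hypothetical.

```python
import numpy as np

def train_lvq1(X, y, prototypes, proto_labels, lr=0.1, epochs=20):
    """Basic LVQ1: move the nearest prototype toward (same label)
    or away from (different label) each training vector."""
    W = np.array(prototypes, dtype=float)         # copy so the input is not modified
    for _ in range(epochs):
        for x, label in zip(X, y):
            k = np.argmin(np.linalg.norm(W - x, axis=1))   # nearest prototype
            sign = 1.0 if proto_labels[k] == label else -1.0
            W[k] += sign * lr * (x - W[k])
    return W

def predict_lvq(X, W, proto_labels):
    """Assign each vector the label of its nearest prototype."""
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    return np.asarray(proto_labels)[np.argmin(d, axis=1)]

# Toy data: two well-separated clusters standing in for two perceptual grades.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
W = train_lvq1(X, y, prototypes=[[2, 2], [8, 8]], proto_labels=[0, 1])
print(predict_lvq(X, W, [0, 1]))
```

In a setting like the one the abstract describes, each row of `X` would be a short-term MFCC vector and each label a perceptual grade; real voice data will overlap far more than this toy example does.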


Subjects
Diagnosis, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Severity of Illness Index; Sound Spectrography/methods; Speech Production Measurement/methods; Voice Disorders/diagnosis; Voice Quality; Algorithms; Artificial Intelligence; Humans; Reproducibility of Results; Sensitivity and Specificity; Voice Disorders/classification