Search | BVS Violência e Saúde

Objective assessment of cleft lip and palate speech intelligibility using articulation and hypernasality measures.

Kalita, Sishir; Girish, K S; M, Pushpavathi; Mahadeva Prasanna, S R; Dandapat, S.

J Acoust Soc Am ; 146(2): 1164, 2019 08.

Article in English | MEDLINE | ID: mdl-31472592

ABSTRACT

Assessment of intelligibility is required to characterize the overall speech production capability and to measure the speech outcome of different interventions for individuals with cleft lip and palate (CLP). Researchers have found that articulation error and hypernasality have a significant effect on the degradation of CLP speech intelligibility. Motivated by this finding, the present work proposes an objective measure of sentence-level intelligibility that combines information on articulation deficits and hypernasality. These two speech disorders represent different aspects of CLP speech. Hence, it is expected that the composite measure based on them may utilize complementary clinical information. The objective scores of articulation and hypernasality are used as features to train a regression model, and the output of the model is taken as the predicted intelligibility score. Analysis based on Spearman's correlation coefficient shows a significant correlation between the predicted and perceptual intelligibility scores (ρ = 0.77, p < 0.001).
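The pipeline described in this abstract (two objective scores in, one predicted intelligibility score out, evaluated by rank correlation) can be sketched as follows. This is a minimal illustration only: the abstract does not state which regression model was used, so ordinary least-squares linear regression is an assumption here, and the function names `fit_linear`, `predict`, `ranks`, and `spearman` as well as all numbers in the demo are hypothetical.

```python
from typing import List, Sequence


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares with an intercept (ASSUMED model type).

    Solves the normal equations by Gauss-Jordan elimination and returns
    [intercept, w_1, ..., w_d].
    """
    rows = [[1.0, *x] for x in X]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Predicted score for one sentence from its feature vector."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def ranks(v: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank vectors."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / (va * vb) ** 0.5


if __name__ == "__main__":
    # Synthetic, made-up values: (articulation score, hypernasality score)
    features = [(0.9, 0.1), (0.7, 0.3), (0.5, 0.6), (0.3, 0.7), (0.2, 0.9)]
    perceptual = [4.5, 4.0, 2.5, 2.0, 1.0]  # made-up listener ratings
    w = fit_linear(features, perceptual)
    predicted = [predict(w, x) for x in features]
    print("rho =", spearman(predicted, perceptual))
```

The demo scores the same sentences the model was fitted on, which is only for brevity; a meaningful evaluation would correlate predictions on held-out speakers with their perceptual ratings.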

Subjects

Cleft Lip/physiopathology , Cleft Palate/physiopathology , Nasal Cavity/physiology , Speech Intelligibility , Voice/physiology , Child , Cleft Lip/complications , Cleft Palate/complications , Female , Humans , Male , Speech Acoustics

Importance of glottis landmarks for the assessment of cleft lip and palate speech intelligibility.

Kalita, Sishir; Mahadeva Prasanna, S R; Dandapat, S.

J Acoust Soc Am ; 144(5): 2656, 2018 11.

Article in English | MEDLINE | ID: mdl-30522275

ABSTRACT

The present work explores the acoustic characteristics of articulatory deviations near g(lottis) landmarks to derive the correlates of cleft lip and palate speech intelligibility. The speech region around the g landmark is used to compute two different acoustic features, namely, two-dimensional discrete cosine transform based joint spectro-temporal features and Mel-frequency cepstral coefficients. Sentence-specific acoustic models are built using these features extracted from the normal speakers' group. The mean log-likelihood score for each test utterance is computed and tested as an acoustic correlate of intelligibility. The derived intelligibility measure shows a significant correlation (ρ = 0.78, p < 0.001) with the perceptual ratings.
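The scoring step described here (fit a model on normal speakers' features, then average the per-frame log-likelihood of a test utterance) can be sketched as below. The abstract does not specify the form of the acoustic model, so a single diagonal-covariance Gaussian is used purely as a stand-in; `fit_diag_gaussian`, `frame_log_likelihood`, `mean_log_likelihood`, and the `var_floor` parameter are hypothetical names introduced for illustration.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def fit_diag_gaussian(
    frames: Sequence[Frame], var_floor: float = 1e-6
) -> Tuple[List[float], List[float]]:
    """Per-dimension mean and variance of the reference (normal-speaker) frames.

    A single diagonal Gaussian is an ASSUMED stand-in for the sentence-specific
    acoustic model; variances are floored to avoid division by zero.
    """
    n, d = len(frames), len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(d)]
    var = [
        max(sum((f[k] - mean[k]) ** 2 for f in frames) / n, var_floor)
        for k in range(d)
    ]
    return mean, var


def frame_log_likelihood(frame: Frame, mean: Frame, var: Frame) -> float:
    """Log density of one feature frame under the diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


def mean_log_likelihood(frames: Sequence[Frame], mean: Frame, var: Frame) -> float:
    """Utterance-level score: average log-likelihood over all frames."""
    return sum(frame_log_likelihood(f, mean, var) for f in frames) / len(frames)
```

Under this sketch, a test utterance whose frames lie close to the normal-speaker distribution receives a higher (less negative) score than one whose frames deviate from it, which is the property the abstract exploits as an intelligibility correlate.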

Subjects

Cleft Lip/physiopathology , Glottis/anatomy & histology , Palate/physiopathology , Speech Intelligibility/classification , Algorithms , Child , Cleft Lip/complications , Female , Fourier Analysis , Glottis/physiology , Humans , India/epidemiology , Male , Palate/abnormalities , Speech Acoustics , Speech Disorders/physiopathology , Speech Disorders/rehabilitation , Speech Intelligibility/physiology , Speech Perception/physiology , Speech Production Measurement/methods

Intelligibility assessment of cleft lip and palate speech using Gaussian posteriograms based on joint spectro-temporal features.

Kalita, Sishir; Mahadeva Prasanna, S R; Dandapat, S.

J Acoust Soc Am ; 144(4): 2413, 2018 10.

Article in English | MEDLINE | ID: mdl-30404473

ABSTRACT

Intelligibility is considered one of the primary measures for speech rehabilitation of individuals with a cleft lip and palate (CLP). Currently, speech processing and machine-learning-based objective methods are gaining research interest as a way to quantify speech intelligibility. In this work, joint spectro-temporal features computed from a time-frequency representation of speech are explored to derive speech representations based on Gaussian posteriograms. A comparative framework using dynamic time warping (DTW) is used to quantify the intelligibility of child CLP speech. The DTW distance is used to score sentence-level intelligibility and is tested for correlation with perceptual intelligibility ratings obtained from expert speech-language pathologists. A baseline DTW system using conventional Mel-frequency cepstral coefficients (MFCCs) is also developed for comparison with the proposed system. Spearman's rank correlation coefficient between the objective intelligibility scores and the perceptual intelligibility ratings is studied. A Williams significance test is conducted to assess the statistical significance of the correlation difference between the methods. The results show that the system based on joint spectro-temporal features significantly outperforms the MFCC-based system.
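The DTW comparison between a reference utterance and a test utterance can be sketched with the textbook dynamic-programming recurrence. The abstract does not give the local distance, step pattern, or normalization, so the Euclidean frame distance, the three standard step directions, and the division by the summed sequence lengths are all assumptions; `dtw_distance` and `euclidean` are hypothetical names.

```python
import math
from typing import Callable, Sequence

Frame = Sequence[float]


def euclidean(a: Frame, b: Frame) -> float:
    """Local frame-to-frame distance (ASSUMED; other choices are possible)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(
    ref: Sequence[Frame],
    test: Sequence[Frame],
    dist: Callable[[Frame, Frame], float] = euclidean,
) -> float:
    """Length-normalized DTW alignment cost between two frame sequences."""
    n, m = len(ref), len(test)
    inf = float("inf")
    # cost[i][j]: minimum accumulated cost aligning ref[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(ref[i - 1], test[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # ref advances, test frame repeated
                cost[i][j - 1],      # test advances, ref frame repeated
                cost[i - 1][j - 1],  # both advance
            )
    # Normalizing by n + m is an ASSUMED convention to compare sentences
    # of different lengths.
    return cost[n][m] / (n + m)
```

In this sketch each frame would be a posteriogram vector (or an MFCC vector for the baseline), and a larger distance to the reference would be read as lower intelligibility.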

Acoustic analysis of misarticulated trills in cleft lip and palate children.

Vikram, C M; Macha, Sashank Kumar; Kalita, Sishir; Mahadeva Prasanna, S R.

J Acoust Soc Am ; 143(6): EL474, 2018 06.

Article in English | MEDLINE | ID: mdl-29960457

ABSTRACT

In this paper, acoustic analysis of misarticulated trills in cleft lip and palate speakers is carried out using excitation source based features: strength of excitation and fundamental frequency, derived from the zero-frequency filtered signal, and vocal tract system features: first formant frequency (F1) and trill frequency, derived from linear prediction analysis and an autocorrelation approach, respectively. These features are found to be statistically significant in discriminating normal from misarticulated trills. Using these acoustic features, a dynamic time warping based trill misarticulation detection system is demonstrated. The performance of the proposed system in terms of the F1-score is 73.44%, whereas that for conventional Mel-frequency cepstral coefficients is 66.11%.
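An autocorrelation-based estimate of a repetition rate, such as the trill frequency mentioned above, can be sketched as follows. The abstract does not say which signal the autocorrelation is applied to, so the input here is assumed to be some slowly varying contour sampled at `frame_rate` (for instance a short-time energy envelope); the function name `dominant_frequency` and the search bounds `f_min` and `f_max` are hypothetical and must be chosen by the caller.

```python
from typing import Sequence


def dominant_frequency(
    envelope: Sequence[float], frame_rate: float, f_min: float, f_max: float
) -> float:
    """Estimate the dominant periodicity (Hz) of a contour by autocorrelation.

    Picks the lag with the largest autocorrelation inside the lag range
    implied by [f_min, f_max] and converts it back to a frequency.
    """
    n = len(envelope)
    mean = sum(envelope) / n
    x = [v - mean for v in envelope]  # remove DC so the peak reflects periodicity
    lag_min = max(1, int(frame_rate / f_max))
    lag_max = min(n - 1, int(frame_rate / f_min))
    if lag_min > lag_max:
        raise ValueError("contour too short for the requested frequency range")
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return frame_rate / best_lag
```

The search range matters: if it spans more than one period multiple, the estimator can lock onto a harmonic or subharmonic, so the bounds should bracket the plausible rates for the phenomenon being measured.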

Subjects

Acoustics , Cleft Lip/physiopathology , Cleft Palate/physiopathology , Speech Acoustics , Speech Production Measurement/methods , Voice Quality , Age Factors , Child , Cleft Lip/diagnosis , Cleft Palate/diagnosis , Female , Humans , Male , Signal Processing, Computer-Assisted , Sound Spectrography , Time Factors

Exploring different attributes of source information for speaker verification with limited test data.

Das, Rohan Kumar; Mahadeva Prasanna, S R.

J Acoust Soc Am ; 140(1): 184, 2016 07.

Article in English | MEDLINE | ID: mdl-27475144

ABSTRACT

This work explores mel power difference of spectrum in subband, residual mel frequency cepstral coefficient, and discrete cosine transform of the integrated linear prediction residual for speaker verification under limited test data conditions. These three source features are found to capture different attributes of source information, namely, periodicity, smoothed spectrum information, and shape of the glottal signal, respectively. On the NIST SRE 2003 database, the proposed combination of the three source features performs better [equal error rate (EER): 20.19%, decision cost function (DCF): 0.3759] than the mel frequency cepstral coefficient feature (EER: 22.31%, DCF: 0.4128) for test segments of 2 s duration.
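The equal error rate quoted above is the operating point at which the false acceptance rate and the false rejection rate coincide. A minimal sketch of estimating it from two lists of verification scores follows; it assumes higher scores mean "more likely the claimed speaker", sweeps only the observed score values as thresholds, and averages the two error rates at the closest crossing. The function name `equal_error_rate` is hypothetical, and this is not the NIST scoring tool.

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Approximate EER from genuine-trial and impostor-trial scores.

    A trial is accepted when score >= threshold (ASSUMED convention).
    """
    thresholds = sorted(set(genuine) | set(impostor))
    thresholds.append(float("inf"))  # threshold that rejects every trial
    best_gap, best_eer = float("inf"), 1.0
    for t in thresholds:
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptances
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejections
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

With few trials the two error curves may not cross exactly at an observed threshold, which is why the sketch reports the midpoint at the nearest crossing rather than interpolating.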
