Search | BVS Economia da Saúde

1.

Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition.

Ghorbani, Shahram; Hansen, John H L.

J Acoust Soc Am ; 155(6): 3848-3860, 2024 Jun 01.

Article in English | MEDLINE | ID: mdl-38884524

ABSTRACT

Accurately classifying accents and assessing accentedness in non-native speakers are challenging tasks, primarily because of the complexity and diversity of accent and dialect variations. In this study, embeddings from advanced pretrained language identification (LID) and speaker identification (SID) models are leveraged to improve the accuracy of accent classification and non-native accentedness assessment. Findings demonstrate that employing pretrained LID and SID models effectively encodes accent/dialect information in speech. Furthermore, the LID and SID encoded accent information complements an end-to-end (E2E) accent identification (AID) model trained from scratch. By incorporating all three embeddings, the proposed multi-embedding AID system achieves superior accuracy in AID. Next, leveraging automatic speech recognition (ASR) and AID models is investigated to explore accentedness estimation. The ASR model is an E2E connectionist temporal classification model trained exclusively with American English (en-US) utterances. The ASR error rate and en-US output of the AID model are leveraged as objective accentedness scores. Evaluation results demonstrate a strong correlation between scores estimated by the two models. Additionally, a robust correlation between objective accentedness scores and subjective scores based on human perception is demonstrated, providing evidence for the reliability and validity of using AID-based and ASR-based systems for accentedness assessment in non-native speech. Such advanced systems would benefit accent assessment in language learning, speech and speaker assessment for intelligibility and quality, and advances in speaker diarization and speech recognition.

Subjects

Speech Perception, Speech Recognition Software, Humans, Speech Perception/physiology, Speech Acoustics, Phonetics, Language, Speech Production Measurement/methods, Female, Male
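
Entry 1 uses the ASR error rate as an objective accentedness score. The abstract does not say how the error rate is computed, so the following is only a minimal sketch that assumes a word error rate (word-level edit distance divided by reference length). The function name and the sample transcripts are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Assumption: accentedness is proxied by WER against a known transcript;
    the cited study may define its ASR error rate differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only
    print(word_error_rate("the cat sat", "the hat sat"))
```

Under this assumption, a higher score means the en-US recognizer struggled more with the utterance, which the abstract treats as evidence of stronger accentedness.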

2.

Hierarchical Temporal Structuring of Speech: A Multiscale, Multimodal Framework to Inform the Assessment and Management of Neuromotor Speech Disorder.

Rong, Panying; Heidrick, Lindsey.

J Speech Lang Hear Res ; 67(1): 92-115, 2024 Jan 08.

Article in English | MEDLINE | ID: mdl-38099851

ABSTRACT

PURPOSE: Hierarchical temporal structuring of speech is the key to multiscale linguistic information transfer toward effective communication. This study investigated and linked the hierarchical temporal cues of the kinematic and acoustic modalities of natural, unscripted speech in neurologically healthy and impaired speakers. METHOD: Thirteen individuals with amyotrophic lateral sclerosis (ALS) and 10 age-matched healthy controls performed a story-telling task. The hierarchical temporal structure of the speech stimulus was measured by (a) 26 articulatory-kinematic features characterizing the depth, phase synchronization, and coherence of temporal modulation of the tongue tip, tongue body, lower lip, and jaw, at three hierarchically nested timescales corresponding to prosodic stress, syllables, and onset-rime/phonemes, and (b) 25 acoustic features characterizing the parallel aspects of temporal modulation of five critical-spectral-band envelopes. All features were compared between groups. For each aspect of temporal modulation, the contributions of all articulatory features to the parallel acoustic features were evaluated by group. RESULTS: Generally consistent disease impacts were identified on the articulatory and acoustic features, manifested by reduced modulation depths of most articulators and critical-spectral-band envelopes, primarily at the timescales of syllables and onset-rime/phonemes. For healthy speakers, the strongest articulatory-acoustic relationships were found for (a) jaw and lip, in modulating stress timing, and (b) tongue tip, in modulating the timing relation between onset-rime/phonemes and syllables. For speakers with ALS, the tongue body, tongue tip, and jaw all showed the greatest contributions to modulating syllable timing. CONCLUSIONS: The observed disease impacts likely reflect reduced entrainment of speech motor activities to finer-grained linguistic events, presumably due to the dynamic constraints of the neuromuscular system. 
To accommodate these restrictions, speakers with ALS appear to use their residual articulatory motor capacities to accentuate and convey the perceptually most salient temporal cues underpinned by the syllable-centric parsing mechanism. This adaptive strategy has potential implications in managing neuromotor speech disorders.

Subjects

Amyotrophic Lateral Sclerosis, Speech, Humans, Speech Intelligibility, Amyotrophic Lateral Sclerosis/complications, Jaw, Speech Disorders, Speech Production Measurement, Tongue, Speech Acoustics

3.

EEG-based assessment of temporal fine structure and envelope effect in Mandarin syllable and tone perception.

Ni, Guangjian; Xu, Zihao; Bai, Yanru; Zheng, Qi; Zhao, Ran; Wu, Yubo; Ming, Dong.

Cereb Cortex ; 33(23): 11287-11299, 2023 Nov 27.

Article in English | MEDLINE | ID: mdl-37804238

ABSTRACT

In recent years, speech perception research has benefited from low-frequency rhythm entrainment tracking of the speech envelope. However, the roles of the speech envelope and the temporal fine structure in speech perception remain controversial, especially in Mandarin. This study aimed to examine the dependence of Mandarin syllable and tone perception on the speech envelope and the temporal fine structure. We recorded the electroencephalogram (EEG) of the subjects under three acoustic conditions using the sound chimerism analysis, including (i) the original speech, (ii) the speech envelope and the sinusoidal modulation, and (iii) the temporal fine structure and the modulation of the non-speech (white noise) sound envelope. We found that syllable perception mainly depended on the speech envelope, while tone perception depended on the temporal fine structure. The delta bands were prominent, and the parietal and prefrontal lobes were the main activated brain areas, regardless of whether syllable or tone perception was involved. Finally, we decoded the spatiotemporal features of Mandarin perception from the microstate sequence. The spatiotemporal feature sequence of the EEG caused by speech material was found to be specific, suggesting a new perspective for the subsequent auditory brain-computer interface. These results provided a new scheme for the coding strategy of new hearing aids for native Mandarin speakers.

Subjects

Speech Perception, Humans, Noise, Timbre Perception, Speech Acoustics, Electroencephalography, Acoustic Stimulation

4.

Effects of different medical masks on acoustic and aerodynamic voice assessment during the COVID-19 pandemic.

Yu, Mingwen; Jin, Qianqian; Zhang, Weiming; Sun, Xin; Sun, Yuxin; Xie, Qing.

Medicine (Baltimore) ; 102(31): e34470, 2023 Aug 04.

Article in English | MEDLINE | ID: mdl-37543813

ABSTRACT

The purpose of the study was to investigate the effect of surgical masks and N95 masks on the acoustic and aerodynamic parameters of voice assessment during the coronavirus disease 2019 pandemic. The challenge of the study was to enable each inexperienced participant to perform a number of acoustic and aerodynamic voice assessments in a qualified and homogeneous manner without and with medical masks, and to minimize individual differences. Thirty-two healthy participants (16 males and 16 females) were recruited. The acoustic parameters analyzed included fundamental frequency, standard deviation of fundamental frequency, percentage of jitter (%), percentage of shimmer (%), glottal-to-noise excitation ratio (GNE), and the parameters of irregularity, noise and overall severity. The aerodynamic parameters included s time, z time, s/z ratio and maximum phonation time. When wearing surgical masks, the GNE ratio (P = .043) significantly increased, whereas noise (P = .039) and s time (P = .018) significantly decreased. When wearing N95 masks, the percentage of shimmer (P = .049), s time (P = .037) and s/z ratio (P = .048) significantly decreased. In general, performing voice assessment with a medical mask proved to be reliable for most of the acoustic and aerodynamic parameters. It is worth noting that shimmer (%) could be slightly affected when wearing N95 masks. Wearing surgical masks might slightly lower the measured noise and raise the GNE ratio. The s/z ratio could be affected when wearing N95 masks. The contribution of the study is to explore acoustic and aerodynamic parameters that might be easily affected by wearing masks during voice assessment, and to provide references for clinical evaluation of voice disorders during the coronavirus disease 2019 pandemic.

Subjects

COVID-19, Pandemics, Male, Female, Humans, Pandemics/prevention & control, Voice Quality, Speech Acoustics, COVID-19/prevention & control, Masks, Acoustics
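
Entry 4 reports jitter (%), shimmer (%), and the s/z ratio without defining them. As a rough illustration, the sketch below assumes the common "local" definitions: the mean absolute difference between consecutive glottal periods (jitter) or peak amplitudes (shimmer), divided by the mean value, expressed in percent. The function names and sample values are hypothetical, and the study's analysis software may use different variants.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean value, in percent.

    Pass glottal period durations for local jitter (%), or per-cycle peak
    amplitudes for local shimmer (%). Assumes the 'local' definition.
    """
    if len(values) < 2:
        raise ValueError("need at least two consecutive cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def s_z_ratio(s_time_s, z_time_s):
    """Ratio of the longest sustained /s/ to the longest sustained /z/, both in seconds."""
    if z_time_s <= 0:
        raise ValueError("z time must be positive")
    return s_time_s / z_time_s


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only
    periods_ms = [10.0, 10.1, 9.9, 10.0, 10.2]
    amplitudes = [0.80, 0.78, 0.82, 0.79, 0.81]
    print("jitter (%):", local_perturbation_percent(periods_ms))
    print("shimmer (%):", local_perturbation_percent(amplitudes))
    print("s/z ratio:", s_z_ratio(18.0, 20.0))
```

Because both perturbation measures depend on detecting individual cycles, anything that alters the recorded signal (a mask, a microphone, a software version) can shift them, which is the concern several entries in this list examine.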

5.

Are smartphones and low-cost external microphones comparable for measuring time-domain acoustic parameters?

Ceylan, M Enes; Cangi, M Emrah; Yilmaz, Göksu; Peru, Beyza Sena; Yigit, Özgür.

Eur Arch Otorhinolaryngol ; 280(12): 5433-5444, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37584753

ABSTRACT

PURPOSE: This study examined and compared the diagnostic accuracy and correlation levels of the acoustic parameters of the audio recordings obtained from smartphones on two operating systems and from dynamic and condenser types of external microphones. METHOD: The study included 87 adults: 57 with voice disorder and 30 with a healthy voice. Each participant was asked to perform a sustained vowel phonation (/a/). The recordings were taken simultaneously using five microphones (AKG-P220, Shure-SM58, Samson Go Mic, Apple iPhone 6, and Samsung Galaxy J7 Pro) in an acoustically insulated cabinet. Acoustic examinations were performed using Praat version 6.2.09. The data were examined using Pearson correlation and receiver-operating characteristic (ROC) analyses. RESULTS: The parameters with the highest area under curve (AUC) values among all microphone recordings in the time-domain analyses were the frequency perturbation parameters. Additionally, considering the correlation coefficients obtained by synchronizing the microphones with each other and the AUC values together, the parameter with the highest correlation coefficient and diagnostic accuracy values was the jitter-local parameter. CONCLUSION: Period-to-period perturbation parameters obtained from audio recordings made with smartphones show similar levels of diagnostic accuracy to external microphones used in clinical conditions.

Subjects

Smartphone, Speech Acoustics, Adult, Humans, Voice Quality, Reproducibility of Results, Acoustics, Speech Production Measurement
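
Entry 5 compares microphones by the area under the ROC curve (AUC) of each acoustic parameter. One way to read an AUC is as the probability that a randomly chosen disordered voice scores higher on the parameter than a randomly chosen healthy voice, with ties counted as one half. The sketch below computes it that way. The function name and the jitter values are hypothetical and are not data from the study.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Equivalent to the Mann-Whitney U statistic divided by the number of pairs;
    a tie between a positive and a negative score counts as half a win.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Hypothetical local-jitter values (%), for illustration only
    disordered = [1.4, 0.9, 2.1, 1.1]
    healthy = [0.4, 0.6, 1.0, 0.5]
    print(roc_auc(disordered, healthy))
```

The pairwise form is quadratic in the number of recordings, which is acceptable for a cohort the size of the one described (87 adults); a rank-based formulation would scale better for large datasets.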

6.

An Assessment of Different Praat Versions for Acoustic Measures Analyzed Automatically by VoiceEvalU8 and Manually by Two Raters.

Grillo, Elizabeth U; Wolfberg, Jeremy.

J Voice ; 37(1): 17-25, 2023 Jan.

Article in English | MEDLINE | ID: mdl-33384248

ABSTRACT

INTRODUCTION: The purpose of the study was to assess acoustic measures of fundamental frequency (fo), standard deviation of fo (SD of fo), jitter%, shimmer%, noise-to-harmonic ratio (NHR), smoothed cepstral peak prominence (CPPS), and acoustic voice quality index analyzed through multiple Praat versions automatically by VoiceEvalU8 or manually by two raters. In addition, default settings to calculate CPPS in two Praat versions manually analyzed by two raters were compared to Maryn and Weenink procedures for CPPS automatically analyzed by VoiceEvalU8. METHODS: Nineteen vocally healthy females used VoiceEvalU8 to record three 5-s sustained /a/ trials, the all-voiced phrase "we were away a year ago," and a 15-s speech sample twice a day for five consecutive days. Two raters manually completed acoustic analysis using different versions of Praat and compared that analysis to measures automatically generated through a version of Praat used by VoiceEvalU8. One-way analyses of variance were run for all acoustic measures with post-hoc testing by the Bonferroni method. For acoustic measures that demonstrated significant differences, intraclass correlation coefficients were calculated. RESULTS: Results showed no significant differences across automatic and manual analysis for different versions of Praat for all acoustic measures during /a/, for fo, jitter%, shimmer%, and NHR during the phrase, for jitter%, shimmer%, NHR, and CPPS during speech, and for acoustic voice quality index calculated from both sustained /a/ and the phrase. The default Praat settings for CPPS were not significantly different from the Maryn and Weenink procedures for sustained /a/ and speech. Significant differences were present for SD of fo and CPPS during the phrase and fo and SD of fo during speech. SD of fo and CPPS in the phrase were moderately correlated and fo and SD of fo during speech demonstrated good to excellent correlations across the different versions of Praat. 
CONCLUSIONS: Acoustic measures analyzed through sustained /a/ and some of the acoustic measures during the phrase and speech were not different across multiple versions of Praat. Automatic analysis by VoiceEvalU8 produced similar mean values as compared to manual analysis by two raters. Even though SD of fo and CPPS in the phrase and fo and SD of fo in speech were different across the versions of Praat, the measures demonstrated moderate to excellent reliability.

Subjects

Speech Acoustics, Voice, Female, Humans, Reproducibility of Results, Acoustics, Voice Quality, Speech Production Measurement/methods

7.

Towards the Objective Speech Assessment of Smoking Status based on Voice Features: A Review of the Literature.

Ma, Zhizhong; Bullen, Chris; Chu, Joanna Ting Wai; Wang, Ruili; Wang, Yingchun; Singh, Satwinder.

J Voice ; 37(2): 300.e11-300.e20, 2023 Mar.

Article in English | MEDLINE | ID: mdl-33495036

ABSTRACT

BACKGROUND AND OBJECTIVE: In smoking cessation clinical research and practice, objective validation of self-reported smoking status is crucial for ensuring the reliability of the primary outcome, that is, smoking abstinence. Speech signals convey important information about a speaker, such as age, gender, body size, emotional state, and health state. We investigated (1) if smoking could measurably alter voice features, (2) if smoking cessation could lead to changes in voice, and therefore (3) if the voice-based smoking status assessment has the potential to be used as an objective smoking cessation validation method. METHODS: A systematic review of the scientific literature was conducted to compile studies on smoking status assessment based on voice features. We searched nine scientific databases for original studies involving the effects of smoking on voice features and the effects of smoking cessation on voice features. RESULTS: A total of 34 studies were identified for review. We found that fundamental frequency, jitter, shimmer, harmonics-to-noise ratio, and other voice features are affected by smoking and could be used to assess smoking status. CONCLUSION: Speech assessment of smoking status based on voice features has potential as a smoking status validation method, as it is simple, reliable, and less time-consuming. Furthermore, this study provides recommendations for future research on the objective speech assessment of smoking status based on voice features.

Subjects

Speech, Voice Quality, Humans, Smoking, Reproducibility of Results, Speech Acoustics

8.

Effects of Medical Masks on Voice Assessment During the COVID-19 Pandemic.

Lin, Yuhong; Cheng, Liyu; Wang, Qingcui; Xu, Wen.

J Voice ; 37(5): 802.e25-802.e29, 2023 Sep.

Article in English | MEDLINE | ID: mdl-34116888

ABSTRACT

OBJECTIVE: Voice assessment is of great significance to the evaluation of voice quality. Our study aims to explore the effects of medical masks on healthy people in acoustic, aerodynamic and formant parameters during the COVID-19 pandemic. In addition, we also attempted to verify the differences between different sexes and ages. METHODS: Fifty-three healthy participants (25 males and 28 females) were involved in our study. The acoustic parameters, including fundamental frequency (F0), sound pressure level (SPL), percentage of jitter (%), percentage of shimmer (%), noise-to-harmonic ratio (NHR) and cepstral peak prominence (CPP), aerodynamic parameter (maximum phonation time, MPT) and formant parameters (formant frequency, F1, F2, F3) without and with wearing medical masks were included. We further investigated the potential differences in the impact on different sexes and ages (≤45 years old and >45 years old). RESULTS: While wearing medical masks, the SPL significantly increased (71.22±4.25 dB, 72.42±3.96 dB, P = 0.021). Jitter and shimmer significantly decreased (jitter 1.19±0.83, 0.87±0.67, P = 0.005; shimmer 4.49±2.20, 3.66±2.02, P = 0.002), as did F3 (2855±323.34 Hz, 2781.89±353.42 Hz, P = 0.004). F0, MPT, F1 and F2 showed increasing trends without statistical significance, and NHR as well as CPP showed little change without and with wearing medical masks. There were no significant differences seen between males and females. Regarding age, a significant difference in MPT was seen (>45-year-old 16.15±6.98 s, 15.38±7.02 s; ≤45-year-old 20.26±6.47 s, 21.44±6.98 s, P = 0.032). CONCLUSION: Healthy participants showed a significantly higher SPL, a smaller perturbation and an evident decrease in F3 after wearing medical masks. These changes may result from the adjustment of the vocal tract and the filtration function of medical masks, leading to the stability of voices we recorded being overstated. 
The impacts of medical masks on sex were not evident, while the MPT in the >45-year-old group was influenced more than that in the ≤45-year-old group.

Subjects

COVID-19, Voice, Male, Female, Humans, Middle Aged, Phonation, Speech Acoustics, Masks/adverse effects, Pandemics/prevention & control, COVID-19/prevention & control

9.

Production and Perception Evidence of a Merger: [l] and [n] in Fuzhou Min.

Cheng, Ruoqian; Jongman, Allard; Sereno, Joan A.

Lang Speech ; 66(3): 533-563, 2023 Sep.

Article in English | MEDLINE | ID: mdl-36000389

ABSTRACT

The current study investigated the merger-in-progress between word-initial nasal and lateral consonants in Fuzhou Min, examining the linguistic and social factors that modulate the merger. First, the acoustic cues to the l-n distinction were examined in Fuzhou Min. Acoustic analyses suggested a collapse of phonemic contrast between prescriptive L and N (phonemes in the unmerged system), with none of the six acoustic cues showing any difference across L and N. Linear discriminant analysis did identify acoustically distinct [l] and [n] tokens, although the mapping onto the phonetic space of prescriptive L and N substantially overlapped. Speakers of all ages and both genders tended to produce [l], and low vowels correlated with more [n]-like classification. In perception, AX discrimination data showed Fuzhou Min listeners confused both prescriptive L and N and acoustic [l] and [n]. Greater sensitivity to the acoustic differences occurred in the context of low vowels and a nasal coda, supported by the acoustics of the stimuli, and younger listeners were more sensitive to the difference between [l] and [n] than older listeners. In two-alternative forced choice (2AFC) identification, Fuzhou Min listeners also identified the merged form as L more frequently than N, with more L responses elicited in the context of low vowels and in the absence of nasal codas. Overall, although Fuzhou Min speakers produced some acoustically distinct [l] and [n] tokens in the context of a sound merger, these productions did not map onto prescriptive L and N. In addition, younger listeners were more sensitive to the acoustic distinction than older listeners, suggesting an emerging acoustic contrast possibly arising due to contact with Mandarin.

Subjects

Speech Perception, Humans, Male, Female, Speech Perception/physiology, Phonetics, Acoustics, Speech Acoustics, Cues

10.

Parâmetros e tipos de avaliação da disartria na esclerose lateral amiotrófica / Parameters and types of dysarthria assessment in amyotrophic lateral sclerosis

Rodrigues, Luzimara Gláucia Oliveira; Lima, Ivonaldo Leidson Barbosa; Dourado Júnior, Mário Emílio Teixeira; Gonçalves, Maria de Jesus.

Audiol., Commun. res ; 28: e2791, 2023. tab, graf

Article in Portuguese | LILACS | ID: biblio-1520263

ABSTRACT

Purpose: to identify studies regarding the parameters and types of assessment used to evaluate dysarthria in amyotrophic lateral sclerosis (ALS). Research strategy: an integrative literature review was conducted on the LILACS, SciELO, PubMed, Web of Science, CINAHL, Scopus, and Cochrane databases using the descriptors "Assessment AND Dysarthria AND Amyotrophic Lateral Sclerosis" in both Portuguese and English. Selection criteria: the inclusion criteria were articles that addressed studies on dysarthria assessment in ALS, written in English, Spanish, or Portuguese, available in full, and published from 2015 to 2022. Results: out of the total of 38 studies, only 3 used a single type of dysarthria assessment. Most studies employed more than one type of assessment, ranging from 2 to 4 types. Three assessment types were predominantly used to assess the degree of speech intelligibility: auditory-perceptual assessment (31 studies), acoustic assessment (18 studies), and movement assessment (27 studies). Conclusion: dysarthria assessment in ALS is conducted through various procedures and with multiple analysis parameters, notably through auditory-perceptual and movement assessments.

Subjects

Humans, Male, Female, Auditory Perception, Speech Acoustics, Speech Intelligibility, Speech Production Measurement, Early Diagnosis, Dysarthria, Amyotrophic Lateral Sclerosis/diagnosis

11.

Reducing the GAP between science and clinic: lessons from academia and professional practice - part A: perceptual-auditory judgment of vocal quality, acoustic vocal signal analysis and voice self-assessment. / Reduzindo o GAP entre a ciência e a clínica: lições da academia e da prática profissional parte A: julgamento perceptivo-auditivo da qualidade vocal, análise acústica do sinal vocal e autoavaliação em voz.

Behlau, Mara; Almeida, Anna Alice; Amorim, Geová; Balata, Patrícia; Bastos, Sávio; Cassol, Mauricéia; Constantini, Ana Carolina; Eckley, Claudia; Englert, Marina; Gama, Ana Cristina Cortes; Gielow, Ingrid; Guimarães, Bruno; Lima, Livia Ribeiro; Lopes, Leonardo; Madazio, Glaucya; Moreti, Felipe; Mouffron, Vanessa; Nemr, Katia; Oliveira, Priscila; Padovani, Marina; Ribeiro, Vanessa Veis; Silverio, Kelly; Vaiano, Thays; Yamasaki, Rosiane.

Codas ; 34(5): e20210240, 2022.

Article in Portuguese, English | MEDLINE | ID: mdl-35920467

ABSTRACT

During the XXVIII Brazilian Congress of SBFa, 24 specialists met and, starting from a guiding position on scientific research as a tool for connecting laboratory and clinic, discussed five fronts of knowledge of the voice specialty: 1. Perceptual-auditory judgment of vocal quality; 2. Acoustic analysis of the vocal signal; 3. Voice self-assessment; 4. Traditional techniques of therapy; 5. Modern techniques of electrostimulation and photobiomodulation (PBMT) in voice. Part "a" of this publication consolidates the analyses of the first three aspects. The trend in the perceptual-auditory judgment of vocal quality is the use of standard protocols. The acoustic evaluation of the vocal signal is accessible and can be done descriptively or by extraction of parameters, with multiparametric measures preferred. Finally, the individual's own assessment closes this triad of voice documentation, which will be the basis for the conclusion of the evaluation, a reference for monitoring progress, and a means of evaluating treatment results.

Subjects

Judgment, Self-Assessment, Acoustics, Humans, Professional Practice, Speech Acoustics, Voice Quality/physiology

12.

The Effect of Microphone Frequency Response on Spectral and Cepstral Measures of Voice: An Examination of Low-Cost Electret Headset Microphones.

Awan, Shaheen N; Shaikh, Mohsin A; Desjardins, Maude; Feinstein, Hagar; Abbott, Katherine Verdolini.

Am J Speech Lang Pathol ; 31(2): 959-973, 2022 Mar 10.

Article in English | MEDLINE | ID: mdl-35050724

ABSTRACT

PURPOSE: The purpose of this study was to establish the frequency response of a selection of low-cost headset microphones that could be given to subjects for remote voice recordings and to examine the effect of microphone type and frequency response on key acoustic measures related to voice quality obtained from speech and vowel samples. METHOD: The frequency responses of three low-cost headset microphones were evaluated using pink noise generated via a head-and-torso model. Each of the headset microphones was then used to record a series of speech and vowel samples prerecorded from 24 speakers who represented a diversity of sex, age, fundamental frequency (fo), and voice quality types. Recordings were later analyzed for the following measures: smoothed cepstral peak prominence (CPP; dB), low versus high spectral ratio (L/H ratio; dB), CPP fo (Hz), and cepstral spectral index of dysphonia (CSID). RESULTS: The frequency response of the microphones under test was observed to have nonsignificant effects on measures of the CPP and CPP fo, significant effects on the CSID in speech contexts, and strong and significant effects on the measure of spectral tilt (L/H ratio). However, the correlations between the various headset microphones and a reference precision microphone were excellent (rs > .90). CONCLUSIONS: The headset microphones under test all showed the capability to track a wide range of diversity in the voice signal. Though the use of higher quality microphones that have demonstrated specifications is recommended for typical research and clinical purposes, low-cost electret microphones may be used to provide valid measures of voice, specifically when the same microphone and signal chain is used for the evaluation of pre- versus posttreatment change or intergroup comparisons.

Subjects

Dysphonia, Voice, Dysphonia/diagnosis, Humans, Speech Acoustics, Speech Production Measurement, Voice/physiology, Voice Quality

13.

Automated Assessment of Glottal Dysfunction Through Unified Acoustic Voice Analysis.

McLoughlin, Ian Vince; Perrotin, Olivier; Sharifzadeh, Hamid; Allen, Jacqui; Song, Yan.

J Voice ; 36(6): 743-754, 2022 Nov.

Article in English | MEDLINE | ID: mdl-32980231

ABSTRACT

This paper uses the recent glottal flow model for iterative adaptive inverse filtering to analyze recordings from dysfunctional speakers, namely those with larynx-related impairment such as laryngectomy. The analytical model allows extraction of the voice source spectrum, described by a compact set of parameters. This single model is used to visualize and better understand speech production characteristics across impaired and nonimpaired voices. The analysis reveals some discriminative aspects of the source model which map to a physiological class description of those impairments. Furthermore, being based on analysis of source parameters only, it is complementary to any existing techniques of vocal-tract or phonetic analysis. The results indicate the potential for future automated speech reconstruction systems that adapt to the method of reconstruction required, as well as being useful for mainstream speech systems, such as ASR, in which front-end analysis can direct back-end models to suit characteristics of impaired speech.

Subjects

Voice Quality, Voice, Humans, Speech Acoustics, Glottis/surgery, Glottis/physiology, Voice/physiology, Acoustics

14.

Multiparameter Voice Assessment in Dysphonics: Correlation Between Objective and Perceptual Parameters.

Narasimhan, S V; Rashmi, Rajesh.

J Voice ; 36(3): 335-343, 2022 May.

Article in English | MEDLINE | ID: mdl-32651100

ABSTRACT

BACKGROUND: Perceptual assessment and objective measures of voice provide a quantifiable tool for determining the degree of glottal closure, thus helping to distinguish dysphonic voices from normal voices. The correlation between the perceptual and objective parameters of voice in dysphonic speakers can enable the voice pathologist to be more effective in differentiating normal voices from dysphonic voices. However, only a few studies have investigated the correlation between these measures. OBJECTIVE: To document the differences in the perceptual and objective parameters of voice in participants with dysphonia and normal controls and to investigate the correlation between the perceptual and objective parameters of voice among participants with dysphonia. STUDY DESIGN: This investigation deployed standard group comparison and a retrospective study. METHODS: Two groups of participants were included in the study. Participants in group 1 were diagnosed as having a voice disorder secondary to organic pathologies and group 2 participants had a clinically normal voice. Phonation samples of all the participants were collected and perceptual analysis was carried out using the GRBAS rating scale. As part of the objective measures, acoustic and cepstral measures were extracted from the phonation samples. RESULTS: The analysis of the results revealed significant differences in perceptual ratings between the normal (control) and dysphonic groups. The mean values of all the objective measures of voice presented significant differences between participants of both groups. The perceptual ratings of grade, breathiness, and roughness showed better correlations with the cepstral measures than with the time-based acoustic measures. CONCLUSIONS: Further research on the correlation between perceptual and objective measures of voice in various degrees of dysphonia will improve reliability in discriminating hoarse, harsh and breathy voices from modal voices and in quantifying them.

Subjects

Dysphonia , Dysphonia/diagnosis , Hoarseness , Humans , Reproducibility of Results , Retrospective Studies , Speech Acoustics , Speech Production Measurement/methods , Voice Quality

15.

The cepstral spectral index of dysphonia, the acoustic voice quality index and the acoustic breathiness index as novel multiparametric indices for acoustic assessment of voice quality.

Barsties V Latoszek, Ben; Mathmann, Philipp; Neumann, Katrin.

Curr Opin Otolaryngol Head Neck Surg ; 29(6): 451-457, 2021 Dec 01.

Article in English | MEDLINE | ID: mdl-34334615

ABSTRACT

PURPOSE OF REVIEW: The objective assessment of voice quality using acoustic measures is an important pillar of voice diagnostics. This article reviews three recent acoustic measures and their clinical use in phoniatrics and laryngology. RECENT FINDINGS: Two acoustic parameters, the cepstral spectral index of dysphonia (CSID) and the acoustic voice quality index (AVQI), have gained importance as validated multiparametric indices in the objective assessment of hoarseness because they include both continuous speech and sustained vowels. The acoustic breathiness index (ABI), another multiparametric index, assesses breathiness admixture during phonation and identifies it robustly, unaffected by other characteristics of dysphonia such as roughness. SUMMARY: Acoustic measurements are useful diagnostic tools when used correctly with an appropriate recording system, consideration of environment and use of software programs. CSID, AVQI and ABI objectively improve the detection of voice quality abnormalities. In addition to their proven validity, their application is simple and their usability for clinicians is high.

Subjects

Dysphonia , Acoustics , Dysphonia/diagnosis , Hoarseness , Humans , Reproducibility of Results , Severity of Illness Index , Speech Acoustics , Speech Production Measurement , Voice Quality

16.

[An assessment of acoustic properties of a large-capacity open-plan office room according to a 3-level rating scale - a case study]. / Ocena w skali trzystopniowej wlasciwosci akustycznych biurowego pomieszczenia open space o duzej kubaturze opis przypadku.

Mikulski, Witold.

Med Pr ; 72(4): 375-390, 2021 Aug 31.

Article in Polish | MEDLINE | ID: mdl-34328464

ABSTRACT

BACKGROUND: In open-plan offices, one of the main sources of annoyance at work is noise from employees' conversations. To limit it, permissible values of the parameters characterizing the acoustic properties of such rooms are defined on a 2-level scale. MATERIAL AND METHODS: The article introduces a 3-level scale (bad, fair, good) for assessing the acoustic properties of a room, based on EN ISO 3382-3:2012 and PN-B-02151-4:2015 (criterion 1). Additionally and alternatively, a new 3-level assessment criterion (criterion 2), concerning acoustic separation between groups of workstations, was determined. Meeting that criterion requires taking the acoustic treatment of the room into account. A multivariant study (7 variants) of acoustic treatment was performed using computational simulation methods. RESULTS: The requirements of PN-B-02151-4:2015 were met after applying a sound-absorbing suspended ceiling and acoustic screens at workplaces. To meet the requirements of EN ISO 3382-3:2012, it was necessary to additionally use sound-absorbing materials on the walls and acoustic screens separating the naves of the room. To meet the requirements of criterion 2, it was necessary to additionally use acoustic screens separating groups of workstations and acoustic screens in passages. CONCLUSIONS: Appropriate acoustic properties can be obtained in open-space offices. The requirements of PN-B-02151-4:2015 can be met with much less acoustic treatment than those of EN ISO 3382-3:2012. A 3-level scale for assessing the acoustic properties of a room makes it possible to differentiate rooms by their acoustic properties. The new assessment method, which takes into account the grouping of workplaces in a room, makes the assessment more reliable by excluding areas where people are not present. Med Pr. 2021;72(4):375-90.

Subjects

Noise, Occupational , Acoustics , Humans , Sound , Speech Acoustics , Workplace

17.

Tracking the Costs of Clear and Loud Speech: Interactions Between Speech Motor Control and Concurrent Visuomotor Tracking.

Whitfield, Jason A; Holdosh, Serena R; Kriegel, Zoe; Sullivan, Lauren E; Fullenkamp, Adam M.

J Speech Lang Hear Res ; 64(6S): 2182-2195, 2021 Jun 18.

Article in English | MEDLINE | ID: mdl-33719529

ABSTRACT

Purpose: Prior work has demonstrated that competing tasks impact habitual speech production. The purpose of this investigation was to quantify the extent to which clear and loud speech are affected by concurrent performance of an attention-demanding task. Method: Speech kinematics and acoustics were collected while participants spoke using habitual, loud, and clear speech styles. The styles were performed in isolation and while performing a secondary tracking task. Results: Compared to the habitual style, speakers exhibited expected increases in lip aperture range of motion and speech intensity for the clear and loud styles. During concurrent visuomotor tracking, there was a decrease in lip aperture range of motion and speech intensity for the habitual style. Tracking performance during habitual speech did not differ from single-task tracking. For loud and clear speech, speakers retained the gains in speech intensity and range of motion, respectively, while concurrently tracking. A reduction in tracking performance was observed during concurrent loud and clear speech, compared to tracking alone. Conclusions: These data suggest that loud and clear speech may help to mitigate motor interference associated with concurrent performance of an attention-demanding task. Additionally, reductions in tracking accuracy observed during concurrent loud and clear speech may suggest that these higher effort speaking styles require greater attentional resources than habitual speech.

Subjects

Speech Acoustics , Speech , Acoustics , Dysarthria , Humans , Speech Production Measurement

18.

Age and Sex Comparison of Aerodynamic Phonation Measurements Using Noninvasive Assessment.

Lamb, Jim R; Scholp, Austin J; Jiang, Jack J.

J Speech Lang Hear Res ; 64(3): 776-791, 2021 Mar 17.

Article in English | MEDLINE | ID: mdl-33606949

ABSTRACT

Purpose: The goal of this study was to present vocal aerodynamic measurements from pediatric and adult participant pools. There are a number of anatomical changes involving the larynx and vocal folds that occur as children age and become adults. Data were collected using two methods of noninvasive aerodynamic assessment: mechanical interruption and labial interruption. Method: A total of 154 participants aged 4-24 years took part in this study. Ten trials were performed for both methods of airway interruption. To perform mechanical interruption, participants phonated /α/ for 10 s trials while a balloon valve interrupted phonation 5 times. For labial interruption, participants said /pα/ 5 times at comfortable and quiet volumes. Aerodynamic measures included subglottal pressure, phonation threshold pressure, mean airflow, laryngeal resistance, and others. Results: One hundred one participants (51 females) successfully completed testing with both methods. Participant age had a statistically significant effect on eight of the 20 measurements. Sex alone had a significant effect on vocal efficiency for the labial quiet method. Conclusions: The data discussed here can be used to view age and sex trends in vocal aerodynamic measurements. When using either mechanical or labial interruption, participant age needs to be taken into account to properly interpret several aerodynamic parameters. A participant's sex is not as important when using these methods.

Subjects

Larynx , Speech Acoustics , Adolescent , Adult , Child , Child, Preschool , Female , Humans , Phonation , Pressure , Vocal Cords , Young Adult

19.

Assessment of dysphonia: cepstral analysis versus conventional acoustic analysis.

Hassan, Elham Moamen; Abdel Hady, Aisha Fawzy; Shohdi, Sahar Saad; Eldessouky, Hossam Mohammed; Din, Mohammed Hussein Badrel.

Logoped Phoniatr Vocol ; 46(3): 99-109, 2021 Oct.

Article in English | MEDLINE | ID: mdl-32436465

ABSTRACT

OBJECTIVE: In this study, we aimed to determine the extent to which smoothed cepstral peak prominence (CPPS) can replace or complement the conventional acoustic measures of jitter, shimmer, and harmonic-to-noise ratio in the assessment of various types of dysphonia. METHODOLOGY: A total of 60 males and 80 females were divided into two groups, a dysphonic group and a control group (30 males and 40 females in each group). Voice samples consisting of sustained vowel /a/ phonation and continuous speech were recorded and assessed using auditory perceptual analysis, acoustic analysis, and cepstral analysis. RESULTS: Jitter was found to have the best predictive ability during sustained phonation, whereas CPPS was found to have the best predictive ability during continuous speech. CONCLUSION: Cepstral analysis is as reliable as conventional acoustic analysis in diagnosing dysphonia and detecting its severity. However, CPPS cannot replace conventional acoustic measures.

Subjects

Dysphonia , Acoustics , Dysphonia/diagnosis , Female , Humans , Male , Phonation , Speech Acoustics , Speech Production Measurement , Voice Quality

20.

Articulatory Evidence for the Syllable-final Nasal Merging in Taiwan Mandarin.

Chiu, Chenhao; Lu, Yu-An.

Lang Speech ; 64(4): 771-789, 2021 Dec.

Article in English | MEDLINE | ID: mdl-33300459

ABSTRACT

Syllable-final nasals /n/ and /ŋ/ in Taiwan Mandarin have been reported to be undergoing merging. Perceptual studies have reported that the alleged merging is context-sensitive and the merging directions are vowel-dependent. These findings have been mostly attributed to dialectal and social factors. The current study uses ultrasonography to capture postures of the entire tongue during the production of syllable-final nasals. The results, though confirming previous findings that the merging directions of syllable-final nasals are vowel-dependent, are best accounted for by the biomechanics of the tongue, as supported by computational 3D model simulations. Furthermore, for some speakers, although nasals were merged in terms of tongue posture, the degrees of nasalization of the preceding vowel were contrastive, suggesting that the merging process may be incomplete.

Subjects

Language , Phonetics , Humans , Speech Acoustics , Taiwan , Tongue/diagnostic imaging
