Results 1 - 20 of 8,779
1.
Alzheimers Res Ther ; 16(1): 176, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39090738

ABSTRACT

BACKGROUND: Digital speech assessment has potential relevance in the earliest, preclinical stages of Alzheimer's disease (AD). We evaluated the feasibility, test-retest reliability, and association with AD-related amyloid-beta (Aβ) pathology of speech acoustics measured over multiple assessments in a remote setting. METHODS: Fifty cognitively unimpaired adults (age 68 ± 6.2 years, 58% female, 46% Aβ-positive) completed remote, tablet-based speech assessments (i.e., picture description, journal-prompt storytelling, verbal fluency tasks) for five days. The testing paradigm was repeated after 2-3 weeks. Acoustic speech features were automatically extracted from the voice recordings, and mean scores were calculated over the 5-day period. We assessed feasibility by adherence rates and usability ratings on the System Usability Scale (SUS) questionnaire. Test-retest reliability was examined with intraclass correlation coefficients (ICCs). We investigated the associations between acoustic features and Aβ pathology using linear regression models adjusted for age, sex, and education. RESULTS: The speech assessment was feasible, as indicated by 91.6% adherence and usability scores of 86.0 ± 9.9. High reliability (ICC ≥ 0.75) was found across averaged speech samples. Aβ-positive individuals displayed a higher pause-to-word ratio in picture description (B = -0.05, p = 0.040) and journal-prompt storytelling (B = -0.07, p = 0.032) than Aβ-negative individuals, although this effect lost significance after correction for multiple testing. CONCLUSION: Our findings support the feasibility and reliability of multi-day remote assessment of speech acoustics in cognitively unimpaired individuals with and without Aβ pathology, which lays the foundation for the use of speech biomarkers in the context of early AD.


Subjects
Feasibility Studies; Speech Acoustics; Humans; Female; Male; Aged; Reproducibility of Results; Middle Aged; Alzheimer Disease/diagnosis; Amyloid beta-Peptides; Speech/physiology
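
Illustrative note on record 1: the abstract reports test-retest reliability as an intraclass correlation coefficient but does not say which ICC form was used. The sketch below assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1). The function name `icc_2_1`, the data layout, and the toy scores are hypothetical and are not taken from the study.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` holds one row per participant and one column per session
    (a hypothetical layout chosen for this sketch).
    """
    n = len(scores)      # number of participants
    k = len(scores[0])   # number of sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into participant, session, and error parts.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Made-up test/retest values for four speakers (not study data).
toy = [[10, 11], [20, 19], [30, 31], [40, 41]]
icc = icc_2_1(toy)
print(round(icc, 3), icc >= 0.75)  # compare against the 0.75 threshold cited in the abstract
```

Other ICC forms (for example consistency rather than absolute agreement) use a different denominator and can give different values on the same data.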
2.
J Acoust Soc Am ; 156(1): 655-671, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39051719

ABSTRACT

The Kam language has experienced historical tonal splits, resulting in the development of a complex tonal system. However, there is still limited knowledge regarding the acoustic characteristics associated with aspiration-based tone splitting. This study aims to investigate the acoustic cues related to the tonal registers and laryngeal configurations in Donglei Kam, a dialect of Southern Kam. Sixteen native speakers of Donglei Kam participated, producing lexical tones. Statistical analyses were conducted to examine the acoustic distinctions between tonal registers, using measurements of voice onset time, spectral tilt, noise, and energy. The results indicated that Donglei Kam retained a two-way contrast of aspiration, albeit with a trend toward gradual loss. Additionally, a breathy voice was detected in the Ciyin tonal register, characterized by elevated spectral tilt values and spectral noise throughout the vowels. Moreover, machine learning classifiers effectively identified tonal registers using voice-quality data, suggesting that the phonation contrast between breathy and modal voice could contribute to the tonal split alongside pitch contrast. In summary, these findings enhance our understanding of the acoustic implementation of breathiness in Kam and offer valuable insights into the role of laryngeal contrast in tonal splits.


Subjects
Cues; Phonation; Speech Acoustics; Voice Quality; Humans; Male; Female; Adult; Young Adult; Language; Larynx/physiology; Acoustics; Phonetics
3.
JASA Express Lett ; 4(7), 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39017042

ABSTRACT

Using visual spectrographic examination of vowel nasalization to diagnose the syllabic affiliation of phonologically ambisyllabic nasal consonants (e.g., gamma), Durvasula and Huang [(2017). Lang. Sci. 62, 17-36] argued that anticipatory vowel nasalization in these words patterns with word-medial codas. Using nasometry, the current study finds that anticipatory nasalization before monomorphemic and multimorphemic (scammer) ambisyllabic nasals differs from that before word-medial coda (gamble) and word-final nasals (scam), but not from that before other intervocalic nasals. Additionally, vowel nasalization is sensitive to the manner of the preceding phoneme. These findings demonstrate that quantifying anticipatory nasalization using nasometry yields results that differ from those based on visual spectrographic criteria.


Subjects
Phonetics; Humans; Male; Female; Adult; Speech Production Measurement/methods; Language; Nose/physiology; Nose/anatomy & histology; Young Adult; Speech Acoustics; Cues
4.
J Acoust Soc Am ; 156(1): 278-283, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38980102

ABSTRACT

How we produce and perceive voice is constrained by laryngeal physiology and biomechanics. Such constraints may present themselves as principal dimensions in the voice outcome space that are shared among speakers. This study attempts to identify such principal dimensions in the voice outcome space and the underlying laryngeal control mechanisms in a three-dimensional computational model of voice production. A large-scale voice simulation was performed with parametric variations in vocal fold geometry and stiffness, glottal gap, vocal tract shape, and subglottal pressure. Principal component analysis was applied to data combining both the physiological control parameters and voice outcome measures. The results showed three dominant dimensions accounting for at least 50% of the total variance. The first two dimensions describe respiratory-laryngeal coordination in controlling the energy balance between low- and high-frequency harmonics in the produced voice, and the third dimension describes control of the fundamental frequency. The dominance of these three dimensions suggests that voice changes along these principal dimensions are likely to be more consistently produced and perceived by most speakers than other voice changes, and thus are more likely to have emerged during evolution and be used to convey important personal information, such as emotion and larynx size.


Subjects
Larynx; Phonation; Principal Component Analysis; Humans; Biomechanical Phenomena; Larynx/physiology; Larynx/anatomy & histology; Voice/physiology; Vocal Cords/physiology; Vocal Cords/anatomy & histology; Computer Simulation; Voice Quality; Speech Acoustics; Pressure; Models, Biological; Models, Anatomic
5.
PLoS One ; 19(7): e0306272, 2024.
Article in English | MEDLINE | ID: mdl-39028710

ABSTRACT

Abnormal speech prosody has been widely reported in individuals with autism. Many studies on children and adults with autism spectrum disorder speaking a non-tonal language showed deficits in using prosodic cues to mark focus. However, focus marking by autistic children speaking a tonal language is rarely examined. Cantonese-speaking children may face additional difficulties because tonal languages require them to use prosodic cues to achieve multiple functions simultaneously such as lexical contrasting and focus marking. This study bridges this research gap by acoustically evaluating the use of Cantonese speech prosody to mark information structure by Cantonese-speaking children with and without autism spectrum disorder. We designed speech production tasks to elicit natural broad and narrow focus production among these children in sentences with different tone combinations. Acoustic correlates of prosodic focus marking like f0, duration and intensity of each syllable were analyzed to examine the effect of participant group, focus condition and lexical tones. Our results showed differences in focus marking patterns between Cantonese-speaking children with and without autism spectrum disorder. The autistic children not only showed insufficient on-focus expansion in terms of f0 range and duration when marking focus, but also produced less distinctive tone shapes in general. There was no evidence that the prosodic complexity (i.e. sentences with single tones or combinations of tones) significantly affected focus marking in these autistic children and their typically-developing (TD) peers.


Subjects
Autism Spectrum Disorder; Language; Humans; Autism Spectrum Disorder/physiopathology; Autism Spectrum Disorder/psychology; Male; Female; Child; Speech Acoustics; Child, Preschool; Speech/physiology
6.
J Acoust Soc Am ; 156(1): 284-298, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38984810

ABSTRACT

This study investigated the effect of different types of phonetic training on potential changes in the production and perception of English vowels by Arabic learners of English. Forty-six Arabic learners of English were randomly assigned to one of three high variability vowel training programs: Perception training (High Variability Phonetic Training), Production training, and a Hybrid Training program (production and perception training). Pre- and post-tests (vowel identification, category discrimination, speech recognition in noise, and vowel production) showed that all training types led to improvements in perception and production. There was some evidence that improvements were linked to training type: learners in the Perception Training condition improved in vowel identification but not vowel production, while those in the Production Training condition showed only small improvements in performance on perceptual tasks, but greater improvement in production. However, the effects of training modality were complicated by proficiency, with high proficiency learners benefitting more from different types of training regardless of training mode than lower proficiency learners.


Subjects
Multilingualism; Phonetics; Speech Perception; Humans; Female; Male; Young Adult; Adult; Speech Acoustics; Learning; Speech Production Measurement; Recognition, Psychology; Perceptual Masking; Noise; Language; Adolescent
7.
J Acoust Soc Am ; 156(1): 489-502, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39013039

ABSTRACT

Anticipatory coarticulation is a highly informative cue to upcoming linguistic information: listeners can identify that the word is ben and not bed by hearing the vowel alone. The present study compares the relative performances of human listeners and a self-supervised pre-trained speech model (wav2vec 2.0) in the use of nasal coarticulation to classify vowels. Stimuli consisted of nasalized (from CVN words) and non-nasalized (from CVCs) American English vowels produced by 60 humans and generated in 36 TTS voices. In aggregate, wav2vec 2.0 performance is similar to human listener performance. Broken down by vowel type, both wav2vec 2.0 and listeners perform better for non-nasalized vowels produced naturally by humans. However, for TTS voices, wav2vec 2.0 shows higher correct classification performance for nasalized vowels than for non-nasalized vowels. Speaker-level patterns reveal that listeners' use of coarticulation is highly variable across talkers. wav2vec 2.0 also shows cross-talker variability in performance. Analyses also reveal differences in the use of multiple acoustic cues in nasalized vowel classifications across listeners and wav2vec 2.0. Findings have implications for understanding how coarticulatory variation is used in speech perception. Results can also provide insight into how neural systems learn to attend to the unique acoustic features of coarticulation.


Subjects
Phonetics; Speech Acoustics; Speech Perception; Humans; Female; Speech Perception/physiology; Male; Adult; Young Adult; Cues; Voice Quality
8.
J Acoust Soc Am ; 155(6): 3877-3888, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38888391

ABSTRACT

The quality of speech input influences the efficiency of L1 and L2 acquisition. This study examined modifications in infant-directed speech (IDS) and foreigner-directed speech (FDS) in Standard Mandarin, a tonal language, and explored how IDS and FDS features were manifested in disyllabic words and a longer discourse. The study aimed to determine which characteristics of IDS and FDS were enhanced in comparison with adult-directed speech (ADS), and how IDS and FDS differed when measured in a common set of acoustic parameters. For words, it was found that tone-bearing vowel duration, mean and range of fundamental frequency (F0), and the lexical tone contours were enhanced in IDS and FDS relative to ADS, except for the dipping Tone 3, which exhibited an unexpected lowering in FDS but no modification in IDS when compared with ADS. For the discourse, different aspects of temporal and F0 enhancements were emphasized in IDS and FDS: the mean F0 was higher in IDS whereas the total discourse duration was greater in FDS. These findings add to the growing literature on L1 and L2 speech input characteristics and their role in language acquisition.


Subjects
Speech Acoustics; Humans; Female; Male; Infant; Adult; Phonetics; Speech Production Measurement/methods; Young Adult; Multilingualism; Voice Quality; Acoustics; Language; Time Factors; Speech Perception
9.
PLoS One ; 19(6): e0304399, 2024.
Article in English | MEDLINE | ID: mdl-38865318

ABSTRACT

This study aims to investigate the effect of detoxification on acoustic features of Mandarin speech. Speech recordings were collected from 66 male abstinent heroin users with different durations of drug detoxification, specifically early abstinent users with a detoxification duration of less than 2 years, sustained abstinent users with 2 years of detoxification, and long-term abstinent users with a detoxification duration of more than 2 years. The results of the acoustic analyses showed that early abstinent users exhibited lower loudness, relative energies of F1, F2, and F3, higher H1-A3, and fewer loudness peaks per second, as well as a longer average duration of unvoiced segments, compared to the sustained and long-term abstinent users. The findings suggest that detoxification may lead to a rehabilitation process in the speech production of abstinent heroin users (e.g., less vocal hoarseness). This study not only provides valuable insights into the effect of detoxification on speech production but also provides a theoretical basis for the speech rehabilitation and detoxification treatment of heroin users.


Subjects
Heroin Dependence; Speech Acoustics; Humans; Male; Heroin Dependence/physiopathology; Adult; Speech/physiology; Young Adult; Language; Inactivation, Metabolic
10.
J Acoust Soc Am ; 155(6): 3915-3929, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38904539

ABSTRACT

Speech recognition by both humans and machines frequently fails in non-optimal yet common situations. For example, word recognition error rates for second-language (L2) speech can be high, especially under conditions involving background noise. At the same time, both human and machine speech recognition sometimes shows remarkable robustness against signal- and noise-related degradation. Which acoustic features of speech explain this substantial variation in intelligibility? Current approaches align speech to text to extract a small set of pre-defined spectro-temporal properties from specific sounds in particular words. However, variation in these properties leaves much cross-talker variation in intelligibility unexplained. We examine an alternative approach utilizing a perceptual similarity space acquired using self-supervised learning. This approach encodes distinctions between speech samples without requiring pre-defined acoustic features or speech-to-text alignment. We show that L2 English speech samples are less tightly clustered in the space than L1 samples reflecting variability in English proficiency among L2 talkers. Critically, distances in this similarity space are perceptually meaningful: L1 English listeners have lower recognition accuracy for L2 speakers whose speech is more distant in the space from L1 speech. These results indicate that perceptual similarity may form the basis for an entirely new speech and language analysis approach.


Subjects
Speech Acoustics; Speech Intelligibility; Speech Perception; Humans; Male; Female; Adult; Young Adult; Multilingualism; Recognition, Psychology; Noise
11.
Braz J Med Biol Res ; 57: e13528, 2024.
Article in English | MEDLINE | ID: mdl-38896645

ABSTRACT

Unilateral vocal cord paralysis is frequently observed in patients who undergo thyroid surgery. This study explored the correlation between acoustic voice analysis (an objective measure) and the Voice Handicap Index (VHI, a self-assessment tool). One hundred and forty patients who had thyroid surgery with or without postoperative unilateral vocal cord paralysis (PVCP and NPVCP) were included. The patients were evaluated with the VHI and Dysphonia Severity Index (DSI) tools. VHI scores were significantly higher in PVCP patients than in NPVCP patients. Jitter (%) and shimmer (%) were significantly increased, whereas DSI was significantly decreased, in PVCP patients. Receiver operating characteristic (ROC) curve analysis revealed that VHI scores were associated with the diagnosis of PVCP, with the VHI total score yielding an area under the curve (AUC) of 0.81. Among acoustic parameters, DSI was strongly associated with PVCP (AUC=0.82, 95%CI=0.75 to 0.89). Moreover, we found a correlation between VHI scores and voice acoustic parameters. Among them, DSI had a moderate correlation with functional and VHI scores, as suggested by R values of 0.41 and 0.49, respectively. VHI scores and acoustic parameters were associated with the diagnosis of PVCP.


Subjects
Severity of Illness Index; Thyroidectomy; Vocal Cord Paralysis; Voice Quality; Humans; Vocal Cord Paralysis/etiology; Vocal Cord Paralysis/physiopathology; Vocal Cord Paralysis/diagnosis; Male; Female; Middle Aged; Adult; Thyroidectomy/adverse effects; Postoperative Complications/diagnosis; Speech Acoustics; Aged; ROC Curve; Disability Evaluation; Dysphonia/etiology; Dysphonia/diagnosis; Dysphonia/physiopathology
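
Illustrative note on record 11: the abstract summarizes diagnostic performance as an area under the ROC curve (AUC). The AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way. The function name `roc_auc` and the toy scores are hypothetical. It assumes scores are oriented so that higher means more likely positive, so a measure that decreases in patients, as DSI does here, would be negated first.

```python
def roc_auc(positives, negatives):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores (not study data): higher means more suggestive of paralysis.
with_paralysis = [0.9, 0.8, 0.4]
without_paralysis = [0.7, 0.3, 0.2]
print(roc_auc(with_paralysis, without_paralysis))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.81 and 0.82 should be read.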
12.
Adv Exp Med Biol ; 1455: 257-274, 2024.
Article in English | MEDLINE | ID: mdl-38918356

ABSTRACT

Speech can be defined as the human ability to communicate through a sequence of vocal sounds. Consequently, speech requires an emitter (the speaker) capable of generating the acoustic signal and a receiver (the listener) able to successfully decode the sounds produced by the emitter (i.e., the acoustic signal). Time plays a central role at both ends of this interaction. On the one hand, speech production requires precise and rapid coordination, typically within the order of milliseconds, of the upper vocal tract articulators (i.e., tongue, jaw, lips, and velum), their composite movements, and the activation of the vocal folds. On the other hand, the generated acoustic signal unfolds in time, carrying information at different timescales. This information must be parsed and integrated by the receiver for the correct transmission of meaning. This chapter describes the temporal patterns that characterize the speech signal and reviews research that explores the neural mechanisms underlying the generation of these patterns and the role they play in speech comprehension.


Subjects
Speech; Humans; Speech/physiology; Speech Perception/physiology; Speech Acoustics; Periodicity
13.
J Acoust Soc Am ; 155(6): 3957-3967, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38921646

ABSTRACT

High-frequency speech information is susceptible to inaccurate perception in even mild to moderate forms of hearing loss. Some hearing aids employ frequency-lowering methods such as nonlinear frequency compression (NFC) to help hearing-impaired individuals access high-frequency speech information in more accessible lower-frequency regions. As such techniques cause significant spectral distortion, tests such as the S-Sh Confusion Test help optimize NFC settings to provide high-frequency audibility with the least distortion. Such tests have traditionally been based on speech contrasts pertinent to English. Here, the effects of NFC processing on fricative perception in English and Mandarin listeners are compared. Small but significant differences in fricative discrimination were observed between the groups. The study demonstrates a possible need for language-specific clinical fitting procedures for NFC.


Subjects
Hearing Aids; Speech Perception; Humans; Female; Male; Adult; Young Adult; Language; Acoustic Stimulation; Speech Acoustics; Auditory Threshold
14.
Sci Rep ; 14(1): 12787, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38834775

ABSTRACT

Cochlear implant users experience difficulties controlling their vocalizations compared to normal hearing peers. However, less is known about their voice quality. The primary aim of the present study was to determine if cochlear implant users' voice quality would be categorized as dysphonic by the Acoustic Voice Quality Index (AVQI) and smoothed cepstral peak prominence (CPPS). A secondary aim was to determine if vocal quality is further impacted when using bilateral implants compared to using only one implant. The final aim was to determine how residual hearing impacts voice quality. Twenty-seven cochlear implant users participated in the present study and were recorded while sustaining a vowel and while reading a standardized passage. These recordings were analyzed to calculate the AVQI and CPPS. The results indicate that CI users' voice quality was detrimentally affected by using their CI, rising to the level of a dysphonic voice. Specifically, when using their CI, mean AVQI scores were 4.0 and mean CPPS values were 11.4 dB, which indicates dysphonia. There were no significant differences in voice quality when comparing participants with bilateral implants to those with one implant. Finally, for participants with residual hearing, as hearing thresholds worsened, the likelihood of a dysphonic voice decreased.


Subjects
Cochlear Implants; Voice Quality; Humans; Male; Female; Middle Aged; Aged; Adult; Dysphonia/physiopathology; Speech Acoustics; Cochlear Implantation
15.
J Speech Lang Hear Res ; 67(7): 1997-2020, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38861454

ABSTRACT

PURPOSE: Although different factors and voice measures have been associated with phonotraumatic vocal hyperfunction (PVH), it is unclear what percentage of individuals with PVH exhibit such differences during their daily lives. This study used a machine learning approach to quantify the consistency with which PVH manifests according to ambulatory voice measures. Analyses included acoustic parameters of phonation as well as temporal aspects of phonation and rest, with the goal of determining optimally consistent signatures of PVH. METHOD: Ambulatory neck-surface acceleration signals were recorded over 1 week from 116 female participants diagnosed with PVH and age-, sex-, and occupation-matched vocally healthy controls. The consistency of the manifestation of PVH was defined as the percentage of participants in each group that exhibited an atypical signature based on a target voice measure. Evaluation of each machine learning model used nested 10-fold cross-validation to improve the generalizability of findings. In Experiment 1, we trained separate logistic regression models based on the distributional characteristics of 14 voice measures and durations of voicing and resting segments. In Experiments 2 and 3, features of voicing and resting duration augmented the existing distributional characteristics to examine whether more consistent signatures would result. RESULTS: Experiment 1 showed that the difference in the magnitude of the first two harmonics (H1-H2) exhibited the most consistent signature (69.4% of participants with PVH and 20.4% of controls had an atypical H1-H2 signature), followed by spectral tilt over eight harmonics (73.6% of participants with PVH and 32.1% of controls had an atypical spectral tilt signature) and estimated sound pressure level (SPL; 66.9% of participants with PVH and 27.6% of controls had an atypical SPL signature). Additionally, 77.6% of participants with PVH had atypical resting duration, with 68.9% exhibiting atypical voicing duration. Experiments 2 and 3 showed that augmenting the best-performing voice measures with univariate features of voicing or resting durations yielded only incremental improvement in the classifier's performance. CONCLUSIONS: Females with PVH were more likely to use more abrupt vocal fold closure (lower H1-H2), phonate louder (higher SPL), and take shorter vocal rests. They were also less likely to use higher fundamental frequency during their daily activities. The difference in the voicing duration signature between participants with PVH and controls had a large effect size, providing strong empirical evidence regarding the role of voice use in the development of PVH.


Subjects
Machine Learning; Phonation; Humans; Female; Adult; Middle Aged; Phonation/physiology; Voice Disorders/physiopathology; Voice Disorders/diagnosis; Young Adult; Voice Quality/physiology; Vocal Cords/physiopathology; Speech Acoustics; Voice/physiology; Aged; Case-Control Studies
16.
J Acoust Soc Am ; 155(6): 3983-3994, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38934563

ABSTRACT

Advancing age is associated with decreased sensitivity to temporal cues in word segments, particularly when target words follow non-informative carrier sentences or are spectrally degraded (e.g., vocoded to simulate cochlear-implant stimulation). This study investigated whether age, carrier sentences, and spectral degradation interacted to cause undue difficulty in processing speech temporal cues. Younger and older adults with normal hearing performed phonemic categorization tasks on two continua: a Buy/Pie contrast with voice onset time changes for the word-initial stop and a Dish/Ditch contrast with silent interval changes preceding the word-final fricative. Target words were presented in isolation or after non-informative carrier sentences, and were unprocessed or degraded via sinewave vocoding (2, 4, and 8 channels). Older listeners exhibited reduced sensitivity to both temporal cues compared to younger listeners. For the Buy/Pie contrast, age, carrier sentence, and spectral degradation interacted such that the largest age effects were seen for unprocessed words in the carrier sentence condition. This pattern differed from the Dish/Ditch contrast, where reducing spectral resolution exaggerated age effects, but introducing carrier sentences largely left the patterns unchanged. These results suggest that certain temporal cues are particularly susceptible to aging when placed in sentences, likely contributing to the difficulties of older cochlear-implant users in everyday environments.


Subjects
Acoustic Stimulation; Aging; Cues; Speech Perception; Humans; Speech Perception/physiology; Aged; Young Adult; Adult; Age Factors; Aging/psychology; Aging/physiology; Middle Aged; Time Factors; Female; Male; Speech Acoustics; Phonetics; Audiometry, Speech; Aged, 80 and over; Adolescent; Speech Intelligibility
17.
Am J Speech Lang Pathol ; 33(4): 1930-1951, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38838243

ABSTRACT

PURPOSE: This study investigated the effects of the SPEAK OUT! & LOUD Crowd therapy program on speaking rate, percent pause time, intelligibility, naturalness, and communicative participation in individuals with Parkinson's disease (PD). METHOD: Six adults with PD completed 12 individual SPEAK OUT! sessions across four consecutive weeks followed by group-based LOUD Crowd sessions for five consecutive weeks. Most therapy sessions were conducted via telehealth, with two participants completing the SPEAK OUT! portion in person. Speech samples were recorded at six time points: three baseline time points prior to SPEAK OUT!, two post-SPEAK OUT! time points, and one post-LOUD Crowd time point. Acoustic measures of speaking rate and percent pause time and listener ratings of speech intelligibility and naturalness were obtained for each time point. Participant self-ratings of communicative participation were also collected at pre- and posttreatment time points. RESULTS: Results showed significant improvement in communicative participation scores at a group level following completion of the SPEAK OUT! & LOUD Crowd treatment program. Two participants showed a significant decrease in speaking rate and increase in percent pause time following treatment. Changes in intelligibility and naturalness were not statistically significant. CONCLUSIONS: These findings provide preliminary support for the effectiveness of the SPEAK OUT! & LOUD Crowd treatment program in improving communicative participation for people with mild-to-moderate hypokinetic dysarthria secondary to PD. This study is also the first to demonstrate positive effects of this treatment program for people receiving the therapy via telehealth.


Subjects
Parkinson Disease; Speech Intelligibility; Speech Production Measurement; Speech Therapy; Humans; Parkinson Disease/complications; Parkinson Disease/therapy; Male; Female; Aged; Middle Aged; Speech Therapy/methods; Dysarthria/etiology; Dysarthria/therapy; Dysarthria/rehabilitation; Treatment Outcome; Speech Acoustics; Time Factors; Voice Quality; Telemedicine
18.
J Acoust Soc Am ; 155(6): 3848-3860, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38884524

ABSTRACT

Accurately classifying accents and assessing accentedness in non-native speakers are challenging tasks, primarily because of the complexity and diversity of accent and dialect variations. In this study, embeddings from advanced pretrained language identification (LID) and speaker identification (SID) models are leveraged to improve the accuracy of accent classification and non-native accentedness assessment. Findings demonstrate that employing pretrained LID and SID models effectively encodes accent/dialect information in speech. Furthermore, the LID and SID encoded accent information complements an end-to-end (E2E) accent identification (AID) model trained from scratch. By incorporating all three embeddings, the proposed multi-embedding AID system achieves superior accuracy in AID. Next, leveraging automatic speech recognition (ASR) and AID models is investigated to explore accentedness estimation. The ASR model is an E2E connectionist temporal classification model trained exclusively with American English (en-US) utterances. The ASR error rate and en-US output of the AID model are leveraged as objective accentedness scores. Evaluation results demonstrate a strong correlation between scores estimated by the two models. Additionally, a robust correlation between objective accentedness scores and subjective scores based on human perception is demonstrated, providing evidence for the reliability and validity of using AID-based and ASR-based systems for accentedness assessment in non-native speech. Such advanced systems would benefit accent assessment in language learning as well as speech and speaker assessment for intelligibility, quality, and speaker diarization and speech recognition advancements.


Subjects
Speech Perception; Speech Recognition Software; Humans; Speech Perception/physiology; Speech Acoustics; Phonetics; Language; Speech Production Measurement/methods; Female; Male
19.
JASA Express Lett ; 4(6), 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38884558

ABSTRACT

Age-related changes in auditory processing may reduce physiological coding of acoustic cues, contributing to older adults' difficulty perceiving speech in background noise. This study investigated whether older adults differed from young adults in patterns of acoustic cue weighting for categorizing vowels in quiet and in noise. All participants relied primarily on spectral quality to categorize /ɛ/ and /æ/ sounds under both listening conditions. However, relative to young adults, older adults exhibited greater reliance on duration and less reliance on spectral quality. These results suggest that aging alters patterns of perceptual cue weights that may influence speech recognition abilities.


Subjects
Cues; Perceptual Masking; Speech Perception; Humans; Speech Perception/physiology; Aged; Young Adult; Female; Male; Adult; Perceptual Masking/physiology; Noise/adverse effects; Aging/physiology; Aging/psychology; Speech Acoustics; Middle Aged; Phonetics; Age Factors; Adolescent
20.
JASA Express Lett ; 4(6), 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38869383

ABSTRACT

This study investigated the acoustic cue weighting of the Korean stop contrast in the perception and production of speakers who moved from a nonstandard dialect region to the standard dialect region, Seoul. Through comparing these mobile speakers with data from nonmobile speakers in Seoul and their home region, it was found that the speakers shifted their cue weighting in perception and production to some degree, but also retained some subphonemic features of their home dialect in production. The implications of these results for the role of dialect prestige and awareness in second dialect acquisition are discussed.


Subjects
Speech Perception; Humans; Male; Female; Republic of Korea; Phonetics; Language; Adult; Speech Acoustics; Cues; Young Adult