Búsqueda | Portal Regional de la BVS

1.

Digital remote assessment of speech acoustics in cognitively unimpaired adults: feasibility, reliability and associations with amyloid pathology.

van den Berg, Rosanne L; de Boer, Casper; Zwan, Marissa D; Jutten, Roos J; van Liere, Mariska; van de Glind, Marie-Christine A B J; Dubbelman, Mark A; Schlüter, Lisa Marie; van Harten, Argonde C; Teunissen, Charlotte E; van de Giessen, Elsmarieke; Barkhof, Frederik; Collij, Lyduine E; Robin, Jessica; Simpson, William; Harrison, John E; van der Flier, Wiesje M; Sikkes, Sietske A M.

Alzheimers Res Ther ; 16(1): 176, 2024 Aug 01.

Artículo en Inglés | MEDLINE | ID: mdl-39090738

RESUMEN

BACKGROUND: Digital speech assessment has potential relevance in the earliest, preclinical stages of Alzheimer's disease (AD). We evaluated the feasibility, test-retest reliability, and association with AD-related amyloid-beta (Aß) pathology of speech acoustics measured over multiple assessments in a remote setting. METHODS: Fifty cognitively unimpaired adults (Age 68 ± 6.2 years, 58% female, 46% Aß-positive) completed remote, tablet-based speech assessments (i.e., picture description, journal-prompt storytelling, verbal fluency tasks) for five days. The testing paradigm was repeated after 2-3 weeks. Acoustic speech features were automatically extracted from the voice recordings, and mean scores were calculated over the 5-day period. We assessed feasibility by adherence rates and usability ratings on the System Usability Scale (SUS) questionnaire. Test-retest reliability was examined with intraclass correlation coefficients (ICCs). We investigated the associations between acoustic features and Aß-pathology, using linear regression models, adjusted for age, sex and education. RESULTS: The speech assessment was feasible, indicated by 91.6% adherence and usability scores of 86.0 ± 9.9. High reliability (ICC ≥ 0.75) was found across averaged speech samples. Aß-positive individuals displayed a higher pause-to-word ratio in picture description (B = -0.05, p = 0.040) and journal-prompt storytelling (B = -0.07, p = 0.032) than Aß-negative individuals, although this effect lost significance after correction for multiple testing. CONCLUSION: Our findings support the feasibility and reliability of multi-day remote assessment of speech acoustics in cognitively unimpaired individuals with and without Aß-pathology, which lays the foundation for the use of speech biomarkers in the context of early AD.

Asunto(s)

Estudios de Factibilidad , Acústica del Lenguaje , Humanos , Femenino , Masculino , Anciano , Reproducibilidad de los Resultados , Persona de Mediana Edad , Enfermedad de Alzheimer/diagnóstico , Péptidos beta-Amiloides , Habla/fisiología

2.

Cepstral Analysis of Voice in Patients With Temporomandibular Disorders.

Jamshidpour, Parizad; Moradi, Negin; Raiesian, Shahrokh; Shaterzadeh Yazdi, Mohammad Jafar; Soltani, Majid; Seyedtabib, Maryam; Masoudrad, Mahdis; Nourbakhsh, Mandana.

Ann Otol Rhinol Laryngol ; : 34894241264938, 2024 Jul 25.

Artículo en Inglés | MEDLINE | ID: mdl-39054799

RESUMEN

OBJECTIVES: This study aimed to assess the voice quality of patients with temporomandibular disorders (TMDs) compared with healthy subjects using cepstral analysis and investigate the relationship between the TMD severity and the values of cepstral analysis. METHODS: Subjects who met the inclusion criteria completed a general health questionnaire and the Fonseca Anamnestic Index. Patients who had TMDs with FAI were subjected to an examination based on the Diagnostic Criteria for Temporomandibular Disorders. The final sample included 65 subjects, 31 TMDs patients (with a mean age ± standard deviation of 36.64 ± 13.67 years), and 34 healthy individuals in the control group (with a mean age ± standard deviation of 30.35 ± 7.78 years). Cepstral Peak Prominence (CPP) and Smoothened Cepstral Peak Prominence (CPPS) of a sustained vowel and connected speech were computed using Praat software. RESULTS: TMD patients indicated lower cepstral values and lower voice quality compared to the control group. Significant differences were found between TMD and control groups for all cepstral parameters (P < .001) and cepstral measurements showed a moderate to strong negative correlation with TMD severity (P < .001, rho = -0.57 to -0.88). CONCLUSION: The outcomes of the present study indicate that cepstral analysis can accurately distinguish the reduced voice quality of TMD patients from normal voice.

3.

Deep Learning for Neuromuscular Control of Vocal Source for Voice Production.

Palaparthi, Anil; Alluri, Rishi K; Titze, Ingo R.

Appl Sci (Basel) ; 14(2)2024 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-39071945

RESUMEN

A computational neuromuscular control system that generates lung pressure and three intrinsic laryngeal muscle activations (cricothyroid, thyroarytenoid, and lateral cricoarytenoid) to control the vocal source was developed. In the current study, LeTalker, a biophysical computational model of the vocal system was used as the physical plant. In the LeTalker, a three-mass vocal fold model was used to simulate self-sustained vocal fold oscillation. A constant/Ç/vowel was used for the vocal tract shape. The trachea was modeled after MRI measurements. The neuromuscular control system generates control parameters to achieve four acoustic targets (fundamental frequency, sound pressure level, normalized spectral centroid, and signal-to-noise ratio) and four somatosensory targets (vocal fold length, and longitudinal fiber stress in the three vocal fold layers). The deep-learning-based control system comprises one acoustic feedforward controller and two feedback (acoustic and somatosensory) controllers. Fifty thousand steady speech signals were generated using the LeTalker for training the control system. The results demonstrated that the control system was able to generate the lung pressure and the three muscle activations such that the four acoustic and four somatosensory targets were reached with high accuracy. After training, the motor command corrections from the feedback controllers were minimal compared to the feedforward controller except for thyroarytenoid muscle activation.

4.

Relação entre meditação, regulação emocional e expressão vocal: estudo de intervenção / Relación entre meditación, regulación emocional y expresión vocal: estudio de intervención / The relationship between meditation, emotional regulation and vocal expression: interventional study

Lopez, Mariana; Ghirardi, Ana Carolina A. M; Vicente, Frantiesca dos Anjos; Menezes, Carolina Baptista.

Distúrbios Comun. (Online) ; 36(1): e65819, 17/06/2024.

Artículo en Inglés, Portugués | LILACS | ID: biblio-1563122

RESUMEN

Introdução: A voz é um indicador de estados emocionais, influenciada por fatores como o tônus vagal, a respiração e a variabilidade da frequência cardíaca. O estudo explora esses fatores e a relação com a regulação emocional e a prática meditativa como técnica de autorregulação. Objetivo: Investigar a diferença nas características vocais e na variação da frequência cardíaca em meditadores experientes (EM) e novatos (NM) antes e depois de uma prática meditativa e em não praticantes de meditação grupo controle (CG), antes e depois de um teste controle. Métodos: Estudo quase-fatorial 3 x 2. Três grupos foram avaliados (meditadores experientes EM; meditadores novatos NM; e grupo controle CG, não praticantes de meditação) em dois momentos da manipulação experimental antes e depois de uma sessão meditativa para praticantes de meditação, e antes e depois de uma tarefa de busca de palavras para o grupo controle. A frequência fundamental, jitter, shimmer, relação harmônico-ruído e o primeiro (F1), o segundo (F2) e terceiro (F3) formantes da vogal [a]; a variação da frequência cardíaca (SDNN, RMSSD, LF/HF, SD1 and SD2); estado de ansiedade e autopercepção vocal, foram investigados, antes e após a intervenção. Resultados: O grupo EM alcançou ótimo relaxamento do trato vocal. Os grupos NM e CG apresentaram mudanças em F1. Prática meditativa, de longa duração, está associado com grande diferença em F3, SDNN e SD2 na variação da frequência cardíaca. Conclusão: Os resultados sugerem que prática meditativa influencia a expressão vocal e reação emocional, e que a experiência em prática meditativa favorece esta relação. (AU)

Introduction: The voice is an indicator of emotional states, influenced by factors such as vagal tone, breathing and heart rate variability. This study explores these factors and their relationship with emotional regulation and meditative practice as a self-regulation technique. Purpose: To investigate the difference in vocal characteristics and heart rate variability in experienced (EM) and novice (NM) meditators before and after a meditation practice and in non-meditators - control group (CG), before and after a control test. Methods: 3 x 2 quasi-factorial study. Three groups were evaluated (experienced meditators EM; novice meditators NM; and control group CG, non-meditators) at two points in the experimental manipulation - before and after a meditation session for meditators, and before and after a word search task for the control group. The fundamental frequency, jitter, shimmer, harmonic-to-noise ratio and the first (F1), second (F2) and third (F3) formants of the vowel [a]; heart rate variation (SDNN, RMSSD, LF/HF, SD1 and SD2); anxiety state and vocal self-perception, were investigated, before and after the intervention. Results: The EM group achieved optimal vocal tract relaxation. The NM and CG groups showed changes in F1. Long-term meditative practice was associated with a large difference in F3, SDNN and SD2 in heart rate variation. Conclusion: The results suggest that meditation practice influences vocal expression and emotional reaction, and that experience in meditation practice favors this relationship. (AU)

Introducción: La voz es un indicador de los estados emocionales, influida por factores como el tono vagal, la respiración y la variabilidad de la frecuencia cardiaca. Este estudio explora estos factores y su relación con la regulación emocional y la práctica de la meditación. Objetivo: Investigar la diferencia en las características vocales y variabilidad de la frecuencia cardiaca en meditadores experimentados (EM) y novatos (NM) antes y después de una práctica de meditación y en no meditadores - grupo control (GC), antes y después de una prueba control. Métodos: Estudio cuasi-factorial 3 x 2. Se evaluaron tres grupos (meditadores experimentados EM; meditadores novatos NM; y grupo control CG, no meditadores) en dos momentos - antes y después de una sesión de meditación para los meditadores, y antes y después de una tarea de búsqueda de palabras para el grupo control. Se investigaron la frecuencia fundamental, jitter, shimmer, relación armónico-ruido y los formantes primero (F1), segundo (F2) y tercero (F3) de la vocal [a]; la variación de la frecuencia cardiaca (SDNN, RMSSD, LF/HF, SD1 y SD2); el estado de ansiedad y autopercepción vocal, antes y después de la intervención. Resultados: El grupo EM consiguió una relajación óptima del tracto vocal. Los grupos NM y CG mostraron cambios en F1. La práctica de meditación a largo plazo se asocia con una gran diferencia en F3, SDNN y SD2 en la variación de la frecuencia cardiaca. Conclusión: Los resultados sugieren que la práctica de meditación influye en la expresión vocal y reacción emocional. (AU)

Asunto(s)

Humanos , Masculino , Femenino , Adulto , Voz , Meditación , Regulación Emocional , Estudios Controlados Antes y Después , Reconocimiento de Voz/fisiología

5.

Correcting the record: Phonetic potential of primate vocal tracts and the legacy of Philip Lieberman (1934-2022).

Ekström, Axel G.

Am J Primatol ; 86(8): e23637, 2024 Aug.

Artículo en Inglés | MEDLINE | ID: mdl-38741274

RESUMEN

The phonetic potential of nonhuman primate vocal tracts has been the subject of considerable contention in recent literature. Here, the work of Philip Lieberman (1934-2022) is considered at length, and two research papers-both purported challenges to Lieberman's theoretical work-and a review of Lieberman's scientific legacy are critically examined. I argue that various aspects of Lieberman's research have been consistently misinterpreted in the literature. A paper by Fitch et al. overestimates the would-be "speech-ready" capacities of a rhesus macaque, and the data presented nonetheless supports Lieberman's principal position-that nonhuman primates cannot articulate the full extent of human speech sounds. The suggestion that no vocal anatomical evolution was necessary for the evolution of human speech (as spoken by all normally developing humans) is not supported by phonetic or anatomical data. The second challenge, by Boë et al., attributes vowel-like qualities of baboon calls to articulatory capacities based on audio data; I argue that such "protovocalic" properties likely result from disparate articulatory maneuvers compared to human speakers. A review of Lieberman's scientific legacy by Boë et al. ascribes a view of speech evolution (which the authors term "laryngeal descent theory") to Lieberman, which contradicts his writings. The present article documents a pattern of incorrect interpretations of Lieberman's theoretical work in recent literature. Finally, the apparent trend of vowel-like formant dispersions in great ape vocalization literature is discussed with regard to Lieberman's theoretical work. The review concludes that the "Lieberman account" of primate vocal tract phonetic capacities remains supported by research: the ready articulation of fully human speech reflects species-unique anatomy.

Asunto(s)

Fonética , Primates , Vocalización Animal , Animales , Primates/fisiología , Primates/anatomía & histología , Humanos , Historia del Siglo XX , Habla/fisiología , Evolución Biológica

6.

Effects of speech rate modifications on phonatory acoustic outcomes in Parkinson's disease.

Knowles, Thea; Adams, Scott G; Jog, Mandar.

Front Hum Neurosci ; 18: 1331816, 2024.

Artículo en Inglés | MEDLINE | ID: mdl-38450224

RESUMEN

Speech rate reduction is a global speech therapy approach for speech deficits in Parkinson's disease (PD) that has the potential to result in changes across multiple speech subsystems. While the overall goal of rate reduction is usually improvements in speech intelligibility, not all people with PD benefit from this approach. Speech rate is often targeted as a means of improving articulatory precision, though less is known about rate-induced changes in other speech subsystems that could help or hinder communication. The purpose of this study was to quantify phonatory changes associated with speech rate modification across a broad range of speech rates from very slow to very fast in talkers with and without PD. Four speaker groups participated: younger and older healthy controls, and people with PD with and without deep brain stimulation of the subthalamic nucleus (STN-DBS). Talkers read aloud standardized sentences at 7 speech rates elicited using magnitude production: habitual, three slower rates, and three faster rates. Acoustic measures of speech intensity, cepstral peak prominence, and fundamental frequency were measured as a function of speech rate and group. Overall, slower rates of speech were associated with differential effects on phonation across the four groups. While all talkers spoke at a lower pitch in slow speech, younger talkers showed increases in speech intensity and cepstral peak prominence, while talkers with PD and STN-DBS showed the reverse pattern. Talkers with PD without STN-DBS and older healthy controls behaved in between these two extremes. At faster rates, all groups uniformly demonstrated increases in cepstral peak prominence. While speech rate reductions are intended to promote positive changes in articulation to compensate for speech deficits in dysarthria, the present results highlight that undesirable changes may be invoked across other subsystems, such as at the laryngeal level. In particular, talkers with STN-DBS, who often demonstrate speech deterioration following DBS surgery, demonstrated more phonatory detriments at slowed speech rates. Findings have implications for speech rate candidacy considerations and speech motor control processes in PD.

7.

Perceptual-Auditory and Acoustic Analysis of Breathiness in Cis and Transgender Men and Women.

Canal, Marina F; Santos, Aline O; Sanchez, Julia F; Wolf, Aline E; Silverio, Kelly C A; Brasolotto, Alcione G.

J Voice ; 2024 Mar 30.

Artículo en Inglés | MEDLINE | ID: mdl-38556379

RESUMEN

OBJECTIVE: To verify breathiness in the cisgender and transgender men and women's voices, compare values of acoustic and perceptual indicators of breathiness and fundamental frequency (f0) between groups, and compare them between the voices attributed as female and male. STUDY DESIGN: Cross sectional retrospective study. METHODS: The study was approved by the Research Ethics Committee (4,937,140). Sustained vowel /a/ and continuous speech recordings of 21 cisgender men (CISM), 31 transgender men (TM), 32 cisgender women (CISW), and 31 transgender women (TW) were analyzed. Three judges conducted a perceptive-auditory analysis regarding the degree breathiness, using a visual analog scale, and attributed gender (female or male). The ABI (Acoustic Breathiness Index) was extracted using the PRAAT software (6.1.16). The f0, Harmonic-Noise Ratio (HNR), Voice Turbulence Index (VTI), and Soft Phonation Index (SPI) were analyzed using the Multi-Dimensional Voice Program (KayPentax). RESULTS: The ABI value for CISM was lower than for TM and CISW. CISW had a higher f0 than; TM had a higher f0 than CISM; and TW had a higher f0 than CISM. The groups did not differ for HNR and VTI. Regarding the SPI, CISM had higher values than CISW. Regarding the auditory perception, TM presented more intense breathiness than CISM in the vowel. Regarding gender attribution by voice, the voices CISM and CISW were 100% identified as male and female. On the other hand, in the vowel analysis, 45.2% of the TM voices were perceived as female, and 59.4% of TW voices as male. CONCLUSION: Breathiness occurs differently between groups and the voices perceived as male and female. Even when TM is submitted to the use of testosterone and undergoes vocal changes, the transglottal airflow remains, which is a female characteristic of phonation.

8.

Unveiling the sound of the cognitive status: Machine Learning-based speech analysis in the Alzheimer's disease spectrum.

García-Gutiérrez, Fernando; Alegret, Montserrat; Marquié, Marta; Muñoz, Nathalia; Ortega, Gemma; Cano, Amanda; De Rojas, Itziar; García-González, Pablo; Olivé, Clàudia; Puerta, Raquel; García-Sanchez, Ainhoa; Capdevila-Bayo, María; Montrreal, Laura; Pytel, Vanesa; Rosende-Roca, Maitee; Zaldua, Carla; Gabirondo, Peru; Tárraga, Lluís; Ruiz, Agustín; Boada, Mercè; Valero, Sergi.

Alzheimers Res Ther ; 16(1): 26, 2024 02 02.

Artículo en Inglés | MEDLINE | ID: mdl-38308366

RESUMEN

BACKGROUND: Advancement in screening tools accessible to the general population for the early detection of Alzheimer's disease (AD) and prediction of its progression is essential for achieving timely therapeutic interventions and conducting decentralized clinical trials. This study delves into the application of Machine Learning (ML) techniques by leveraging paralinguistic features extracted directly from a brief spontaneous speech (SS) protocol. We aimed to explore the capability of ML techniques to discriminate between different degrees of cognitive impairment based on SS. Furthermore, for the first time, this study investigates the relationship between paralinguistic features from SS and cognitive function within the AD spectrum. METHODS: Physical-acoustic features were extracted from voice recordings of patients evaluated in a memory unit who underwent a SS protocol. We implemented several ML models evaluated via cross-validation to identify individuals without cognitive impairment (subjective cognitive decline, SCD), with mild cognitive impairment (MCI), and with dementia due to AD (ADD). In addition, we established models capable of predicting cognitive domain performance based on a comprehensive neuropsychological battery from Fundació Ace (NBACE) using SS-derived information. RESULTS: The results of this study showed that, based on a paralinguistic analysis of sound, it is possible to identify individuals with ADD (F1 = 0.92) and MCI (F1 = 0.84). Furthermore, our models, based on physical acoustic information, exhibited correlations greater than 0.5 for predicting the cognitive domains of attention, memory, executive functions, language, and visuospatial ability. CONCLUSIONS: In this study, we show the potential of a brief and cost-effective SS protocol in distinguishing between different degrees of cognitive impairment and forecasting performance in cognitive domains commonly affected within the AD spectrum. Our results demonstrate a high correspondence with protocols traditionally used to assess cognitive function. Overall, it opens up novel prospects for developing screening tools and remote disease monitoring.

Asunto(s)

Enfermedad de Alzheimer , Disfunción Cognitiva , Humanos , Enfermedad de Alzheimer/diagnóstico , Enfermedad de Alzheimer/psicología , Habla , Pruebas Neuropsicológicas , Disfunción Cognitiva/diagnóstico , Disfunción Cognitiva/psicología , Cognición , Aprendizaje Automático , Progresión de la Enfermedad

9.

Auditory-Perceptual Assessment and Acoustic Analysis of Gender Expression in the Voice.

Martinho, Diego Henrique da Cruz; Constantini, Ana Carolina.

J Voice ; 2024 Feb 08.

Artículo en Inglés | MEDLINE | ID: mdl-38336566

RESUMEN

OBJECTIVE: Determine if acoustic measurements exist that are predictive of Auditory-Perceptual Assessment (APA) of gender expression in the voice of transgender, nonbinary, and cisgender Brazilian speakers by transgender, nonbinary, and cisgender judges, as well as speech-language pathologists in the area of voice studies. METHODS: Cross-sectional study. Clips of speech (automatic speech and expressive reading of poetry) and sustained vowel emission of people of different genders were recorded and underwent APA for gender expression in the voice using a visual analog scale across 100 points, ranging from very masculine to very feminine. Sixteen acoustic measurements were extracted (noise, perturbation, spectral, and cepstral measurements). A descriptive and inferential analysis was performed using interclass coefficients of correlation and stepwise multiple linear regression, considering P < 0.05 for statistical significance. RESULTS: Forty-seven people of different genders had their voices recorded. The perceived gender of these voices was judged by 236 people (65 speech-language pathologists, 101 cisgender people, and 70 transgender and nonbinary people). The perceptions and measurements that were predictive of gender perception in the voice differed according to the task (vowel or speech) and the group of judges. The predictive acoustic measurements that were common in all groups were: speech-median F0, harmonic-to-noise ratio (HNR), F0 standard deviation (F0sd), average width between F0 peaks, and spectral emphasis (Emph); vowels-median F0, HNR, F0sd, and average width between F0 peaks. Divergent measurements between groups were: speech-coefficient of variation of intensity, speech rate (Sr), minimum and maximum F0, jitter, and shimmer; vowels-coefficient of variation of intensity, Emph, Sr, and minimum and maximum F0. CONCLUSION: There are acoustic measures that may predict APA; however, each group of judges considers different measures to evaluate gender, revealing an important influence of context on the evaluator in gender assessment through the voice.

10.

Sketches of chimpanzee (Pan troglodytes) hoo's: vowels by any other name?

Ekström, Axel G; Edlund, Jens.

Primates ; 65(2): 81-88, 2024 Mar.

Artículo en Inglés | MEDLINE | ID: mdl-38110671

RESUMEN

In human speech, the close back rounded vowel /u/ (the vowel in "boot") is articulated with the tongue arched toward the dorsal boundary of the hard palate, with the pharyngeal cavity open. Acoustic and perceptual properties of chimpanzee (Pan troglodytes) hoo's are similar to those of the human vowel /u/. However, the vocal tract morphology of chimpanzees likely limits their phonetic capabilities, so that it is unlikely, or even impossible, that their articulation is comparable to that of a human. To determine how qualities of the vowel /u/ may be achieved given the chimpanzee vocal tract, we calculated transfer functions of the vocal tract area for tube models of vocal tract configurations in which vocal tract length, length and area of a laryngeal air sac simulacrum, length of lip protrusion, and area of lip opening were systematically varied. The method described is principally acoustic; we make no claim as to the actual shape of the chimpanzee vocal tract during call production. Nonetheless, we demonstrate that it may be possible to achieve the acoustic and perceptual qualities of back vowels without a reconfigured human vocal tract. The results, while tentative, suggest that the production of hoo's by chimpanzees, while achieving comparable vowel-like qualities to the human /u/, may involve articulatory gestures that are beyond the range of the human articulators. The purpose of this study was to (1) stimulate further simulation research on great ape articulation, and (2) show that apparently vowel-like phenomena in nature are not necessarily indicative of evolutionary continuity per se.

Asunto(s)

Pan troglodytes , Acústica del Lenguaje , Animales , Humanos , Habla , Fonética , Lengua

11.

Did the speech of patients with Myasthenia Gravis decline over 4 years? / Há declínio na fala de pacientes com Miastenia Gravis ao longo de 4 anos?

Ayres, Annelise; Padovani, Marina Martins Pereira; Olchik, Maira Rozenfeld; Kieling, Maiara Laís Mallmann; Santos, Vanessa Brzoskowski dos; Rothe-Neves, Rui; Behlau, Mara.

CoDAS ; 36(2): e20230055, 2024. tab

Artículo en Inglés | LILACS-Express | LILACS | ID: biblio-1520737

RESUMEN

ABSTRACT Purpose To compare the speech and voice patterns of myasthenia gravis (MG) patients over four years, and correlate the results with clinical aspects of the disease. Methods Data was collected for 4 years. The clinical assessment tools included the Quantitative Myasthenia Gravis (QMG) score, the Myasthenia Gravis Foundation of America (MGFA) clinical classification, and the Myasthenia Gravis Quality of Life 15-item Scale (MG-QoL). To assess speech, the recorded speaking tasks were analyzed acoustically and given auditory-perceptual ratings. Sex (equal distribution) and age (p=0.949) were used as matching criteria in the final sample, which consisted of 10 individuals in the MG group (MGG) and 10 individuals in the control group (CG). Results After 4 years, the MG participants presented stable health status, increased mild and moderate dysarthria (from 40% to 90% of the subjects), and a significant deterioration in the respiration, phonation, and articulation subsystems. The acoustic analysis showed a decline in articulatory patterns (speech rate p=0.047, articulation rate p=0.007, mean syllable duration p=0.007) and vocal quality (increased jitter p=0.022). In the follow-up comparison, there was a significant difference between the phonation variables (shimmer and harmonic-to-noise ratio) of the MGG and CG. Conclusion The MG patients presented a decline in speech over four years and an increase in mild and moderate dysarthria. Despite presenting stable health status, their respiratory, phonatory, and articulatory subsystems worsened. There was no correlation between speech patterns and clinical characteristics of the disease (severity and motor scale).

RESUMO Objetivo Comparar o padrão de fala e voz de indivíduos com Miastenia Gravis (MG) em um intervalo de quatro anos e correlacionar com aspectos clínicos da doença. Método A coleta de dados foi realizada ao longo de 4 anos. A avaliação clínica foi composta pelo Quantitative Myasthenia Gravis Score (QMGS), pela Myasthenia Gravis Foundation of America Classification (MGFA) e pela escala de qualidade de vida para Miastenia Gravis (MG-QOL). A avaliação da fala foi composta por gravação de tarefas, análise perceptivo-auditiva e análise acústica. A amostra final foi composta por 10 indivíduos em MG e 10 indivíduos no grupo controle (GC), pareados por sexo (distribuição igualitária) e idade (p=0,949). Resultados Após 4 anos, os indivíduos com MG apresentaram estabilidade clínica, aumento do diagnóstico de disartria leve e moderada (de 40% para 90% dos sujeitos) e diminuição significativa no desempenho dos subsistemas da fala: respiração, fonação e articulação. Na análise acústica, houve declínio do padrão articulatório (taxa de fala p=0,047, taxa de articulação p=0,007, duração média das sílabas p=0,007) e qualidade vocal (jitter aumentado p=0,022). Houve diferença significativa nas variáveis fonatórias (shimmer e harmonic-to-noise ratio) entre os grupos MG e GC na comparação do seguimento. Conclusão Indivíduos com MG apresentaram declínio no padrão de fala em um intervalo de quatro anos, com aumento no número de disártricos (leve e moderado). Mesmo com a estabilidade da doença, houve piora dos subsistemas respiratório, fonatório e articulatório. Não houve correlação entre o padrão de fala e as características clínicas da doença (gravidade e escala motora).

12.

Evidência de validade baseada nos processos de resposta de um protocolo de análise espectrográfica da voz / Evidence of response process validity of a spectrographic voice analysis protocol

Silva, Allan Carlos França da; Diniz, Emmanuel Gustavo Rodrigues; Paiva, Maxsuel Avelino Alves de; Silva, Itacely Marinho da; Silva, Saulo Iordan do Nascimento; Lima Filho, Luiz Medeiros Araujo; Lopes, Leonardo Wanderley.

Audiol., Commun. res ; 29: e2826, 2024. tab, graf

Artículo en Portugués | LILACS | ID: biblio-1550051

RESUMEN

RESUMO Objetivo desenvolver a etapa de validade baseada nos processos de resposta do Protocolo de Análise Espectrográfica da Voz (PAEV). Métodos foram recrutados dez fonoaudiólogos e dez alunos de graduação em Fonoaudiologia, que aplicaram o PAEV em dez espectrogramas, realizaram o julgamento dos itens do PAEV e participaram de uma entrevista cognitiva. A partir das respostas, o PAEV foi reanalisado para reformulação ou para exclusão de itens. Utilizou-se o teste Qui-Quadrado e os valores de acurácia para análise das respostas dos questionários, assim como análise qualitativa dos dados da entrevista cognitiva. Resultados os participantes obtiveram acurácia maior que 70% na maioria dos itens do PAE. Apenas sete itens alcançaram acurácia menor ou igual a 70%. Houve diferença entre as respostas de presença versus ausência de dificuldade na identificação dos itens no espectrograma. A maioria dos participantes não teve dificuldade na identificação dos itens do PAEV. Na entrevista cognitiva, apenas seis itens não obtiveram correta identificação da intenção, conforme verificado na análise qualitativa. Além disso, os participantes sugeriram exclusão de cinco itens. Conclusão após a etapa de validação baseada nos processos de resposta, o PAEV foi reformulado. Sete itens foram excluídos e dois itens foram reformulados. Dessa forma, a versão final do PAEV após essa etapa foi reduzida de 25 para 18 itens, distribuídos nos cinco domínios.

ABSTRACT Purpose To develop the validity step based on the response processes of the Spectrographic Analysis Protocol (SAP). Methods 10 speech therapists and 10 undergraduate students of the Speech Therapy course were recruited, who applied the SAP in 10 spectrograms, performed the evaluation of the PAE items, and participated in a cognitive interview (CI). The SAP was reanalyzed to reformulate or exclude items based on the responses. The chi-square test and the accuracy values were used to analyze the answers to the questionnaires and qualitative analysis of the CI data. Results the participants achieved accuracy > 70% in most items of the SAP. Only seven items achieved accuracy ≤ 70%. There was a difference between presence vs. absence of difficulty in identifying items in the spectrogram. Most participants had no problem identifying the SAP items. In the CI, only six items did not correctly identify the intention, verified in the qualitative analysis. In addition, participants suggested excluding five items. Conclusion After the validation step based on the response processes, the SAP is reformulated. Seven items were deleted, and two items were reformulated. Thus, the final version of the SAP after this stage was reduced from 25 to 18 items, distributed in the five domains.

Asunto(s)

Humanos , Espectrografía del Sonido/métodos , Acústica del Lenguaje , Calidad de la Voz , Trastornos de la Voz/diagnóstico por imagen

13.

Speech Perception as a Function of the Number of Channels and Channel Interaction in Cochlear Implant Simulation.

Yuksel, Mustafa; Kaya, Sultan Nur.

Medeni Med J ; 38(4): 276-283, 2023 Dec 26.

Artículo en Inglés | MEDLINE | ID: mdl-38148725

RESUMEN

Objective: Speech perception relies on precise spectral and temporal cues. However, cochlear implant (CI) processing is confined to a limited frequency range, affecting the information transmitted to the auditory system. This study analyzes the influence of channel interaction and the number of channels on word recognition scores (WRS) within the CI simulation framework. Methods: Two distinct experiments were conducted. The first experiment (n=29, average age =23 years, 14 females) evaluated the number of channels using eight, twelve, sixteen, and 22 channel vocoded and nonvocoded word lists for WRS assessment. The second experiment (n=29, average age =25 years, 16 females) explored channel interaction across low, middle, and high-interaction conditions. Results: In the first experiment, participants scored 57.93%, 80.97%, 83.59%, 91.03%, and 95.45% under 8, 12, 16, and 22-channel vocoder and non-vocoder conditions, respectively. The number of vocoder channels significantly affected WRS, with significant differences observed in all conditions except between the 12-channel and 16-channels (p<0.01). In the second experiment, the participants scored 2.2%, 20.6%, and 50.6% under high, mid, and low interaction conditions, respectively. Statistically significant differences were observed across all channel interaction conditions (p<0.01). Conclusions: While the number of channels had a notable impact on WRS, it is essential to note that certain conditions (12 vs. 16) did not yield statistically significant differences. The observed differences in WRS were eclipsed by the pronounced effects of channel interaction. Notably, all conditions in the channel interaction experiment exhibited statistically significant differences. These findings underscore the paramount importance of prioritizing channel interaction in signal processing and CI fitting.

14.

Deep Learning of Speech Data for Early Detection of Alzheimer's Disease in the Elderly.

Ahn, Kichan; Cho, Minwoo; Kim, Suk Wha; Lee, Kyu Eun; Song, Yoojin; Yoo, Seok; Jeon, So Yeon; Kim, Jeong Lan; Yoon, Dae Hyun; Kong, Hyoun-Joong.

Bioengineering (Basel) ; 10(9)2023 Sep 18.

Artículo en Inglés | MEDLINE | ID: mdl-37760195

RESUMEN

BACKGROUND: Alzheimer's disease (AD) is the most common form of dementia, which makes the lives of patients and their families difficult for various reasons. Therefore, early detection of AD is crucial to alleviating the symptoms through medication and treatment. OBJECTIVE: Given that AD strongly induces language disorders, this study aims to detect AD rapidly by analyzing the language characteristics. MATERIALS AND METHODS: The mini-mental state examination for dementia screening (MMSE-DS), which is most commonly used in South Korean public health centers, is used to obtain negative answers based on the questionnaire. Among the acquired voices, significant questionnaires and answers are selected and converted into mel-frequency cepstral coefficient (MFCC)-based spectrogram images. After accumulating the significant answers, validated data augmentation was achieved using the Densenet121 model. Five deep learning models, Inception v3, VGG19, Xception, Resnet50, and Densenet121, were used to train and confirm the results. RESULTS: Considering the amount of data, the results of the five-fold cross-validation are more significant than those of the hold-out method. Densenet121 exhibits a sensitivity of 0.9550, a specificity of 0.8333, and an accuracy of 0.9000 in a five-fold cross-validation to separate AD patients from the control group. CONCLUSIONS: The potential for remote health care can be increased by simplifying the AD screening process. Furthermore, by facilitating remote health care, the proposed method can enhance the accessibility of AD screening and increase the rate of early AD detection.

15.

Harnessing acoustic speech parameters to decipher amyloid status in individuals with mild cognitive impairment.

García-Gutiérrez, Fernando; Marquié, Marta; Muñoz, Nathalia; Alegret, Montserrat; Cano, Amanda; de Rojas, Itziar; García-González, Pablo; Olivé, Clàudia; Puerta, Raquel; Orellana, Adelina; Montrreal, Laura; Pytel, Vanesa; Ricciardi, Mario; Zaldua, Carla; Gabirondo, Peru; Hinzen, Wolfram; Lleonart, Núria; García-Sánchez, Ainhoa; Tárraga, Lluís; Ruiz, Agustín; Boada, Mercè; Valero, Sergi.

Front Neurosci ; 17: 1221401, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-37746151

RESUMEN

Alzheimer's disease (AD) is a neurodegenerative condition characterized by a gradual decline in cognitive functions. Currently, there are no effective treatments for AD, underscoring the importance of identifying individuals in the preclinical stages of mild cognitive impairment (MCI) to enable early interventions. Among the neuropathological events associated with the onset of the disease is the accumulation of amyloid protein in the brain, which correlates with decreased levels of Aß42 peptide in the cerebrospinal fluid (CSF). Consequently, the development of non-invasive, low-cost, and easy-to-administer proxies for detecting Aß42 positivity in CSF becomes particularly valuable. A promising approach to achieve this is spontaneous speech analysis, which combined with machine learning (ML) techniques, has proven highly useful in AD. In this study, we examined the relationship between amyloid status in CSF and acoustic features derived from the description of the Cookie Theft picture in MCI patients from a memory clinic. The cohort consisted of fifty-two patients with MCI (mean age 73 years, 65% female, and 57% positive amyloid status). Eighty-eight acoustic parameters were extracted from voice recordings using the extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS), and several ML models were used to classify the amyloid status. Furthermore, interpretability techniques were employed to examine the influence of input variables on the determination of amyloid-positive status. The best model, based on acoustic variables, achieved an accuracy of 75% with an area under the curve (AUC) of 0.79 in the prediction of amyloid status evaluated by bootstrapping and Leave-One-Out Cross Validation (LOOCV), outperforming conventional neuropsychological tests (AUC = 0.66). Our results showed that the automated analysis of voice recordings derived from spontaneous speech tests offers valuable insights into AD biomarkers during the preclinical stages. These findings introduce novel possibilities for the use of digital biomarkers to identify subjects at high risk of developing AD.

16.

Effect of clear speech on acoustic measures of dysprosody in Parkinson disease for different reading tasks.

Coy, Anna G; Whitfield, Jason A.

Int J Speech Lang Pathol ; : 1-14, 2023 Sep 05.

Artículo en Inglés | MEDLINE | ID: mdl-37668056

RESUMEN

PURPOSE: The purpose of the study was to determine the effect of clear speech instruction on acoustic measures of dysprosody between reading passages of differing linguistic content for speakers with and without Parkinson Disease (PD). METHOD: Ten speakers with PD and 10 controls served as participants and read five simple and three standard reading stimuli twice. First, speakers read habitually and then following clear speech instruction. Acoustic measures of fundamental frequency variation (semitone standard deviation, STSD), articulation rate, and between-complex pause durations were calculated. RESULT: Results indicated speakers with PD exhibited less fundamental frequency variation than controls across reading stimuli and instructions. All speakers exhibited lower STSD and longer between-complex pause durations for the standard compared to simple reading stimuli. For clear speech, all speakers reduced articulation rate and increased between-complex pause durations in both simple and standard reading stimuli. However, speakers with PD exhibited a significantly less robust reduction in articulation rate for clear speech than control speakers for all reading stimuli. CONCLUSION: Linguistic content of reading stimuli contributes to differences in fundamental frequency variation and pause duration for all speakers. All speakers reduced articulation rate for clear speech compared to habitual instruction, but speakers with PD did so to a lesser extent than controls. Linguistic content of reading stimuli to examine dysprosody in PD should be considered for clinical application.

17.

Articulatory speech measures can be related to the severity of multiple sclerosis.

Kieling, Maiara Laís Mallmann; Finkelsztejn, Alessandro; Konzen, Viviana Regina; Dos Santos, Vanessa Brzoskowski; Ayres, Annelise; Klein, Iasmin; Rothe-Neves, Rui; Olchik, Maira Rozenfeld.

Front Neurol ; 14: 1075736, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-37384284

RESUMEN

Background: Dysarthria is one of the most frequent communication disorders in patients with Multiple Sclerosis (MS), with an estimated prevalence of around 50%. However, it is unclear if there is a relationship between dysarthria and the severity or duration of the disease. Objective: Describe the speech pattern in MS, correlate with clinical data, and compare with controls. Methods: A group of MS patients (n = 73) matched to healthy controls (n = 37) by sex and age. Individuals with neurological and/or systemic conditions that could interfere with speech were excluded. MS group clinical data were obtained through the analysis of medical records. The speech assessment consisted of auditory-perceptual and speech acoustic analysis, from recording the following speech tasks: phonation and breathing (sustained vowel/a/); prosody (sentences with different intonation patterns) and articulation (diadochokinesis; spontaneous speech; diphthong/iu/repeatedly). Results: In MS, 72.6% of the individuals presented mild dysarthria, with alterations in speech subsystems: phonation, breathing, resonance, and articulation. In the acoustic analysis, individuals with MS were significantly worse than the control group (CG) in the variables: standard deviation of the fundamental frequency (p = 0.001) and maximum phonation time (p = 0.041). In diadochokinesis, individuals with MS had a lower number of syllables, duration, and phonation time, but larger pauses per seconds, and in spontaneous speech, a high number of pauses were evidenced as compared to CG. Correlations were found between phonation time in spontaneous speech and the Expanded Disability Status Scale (EDSS) (r = - 0.238, p = 0.043) and phonation ratio in spontaneous speech and EDSS (r = -0.265, p = 0.023), which indicates a correlation between the number of pauses during spontaneous speech and the severity of the disease. Conclusion: The speech profile in MS patients was mild dysarthria, with a decline in the phonatory, respiratory, resonant, and articulatory subsystems of speech, respectively, in order of prevalence. The increased number of pauses during speech and lower rates of phonation ratio can reflect the severity of MS.

18.

The efficacy of memory load on speech-based detection of Alzheimer's disease.

Bae, Minju; Seo, Myo-Gyeong; Ko, Hyunwoong; Ham, Hyunsun; Kim, Keun You; Lee, Jun-Young.

Front Aging Neurosci ; 15: 1186786, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-37333455

RESUMEN

Introduction: The study aims to test whether an increase in memory load could improve the efficacy in detection of Alzheimer's disease and prediction of the Mini-Mental State Examination (MMSE) score. Methods: Speech from 45 mild-to-moderate Alzheimer's disease patients and 44 healthy older adults were collected using three speech tasks with varying memory loads. We investigated and compared speech characteristics of Alzheimer's disease across speech tasks to examine the effect of memory load on speech characteristics. Finally, we built Alzheimer's disease classification models and MMSE prediction models to assess the diagnostic value of speech tasks. Results: The speech characteristics of Alzheimer's disease in pitch, loudness, and speech rate were observed and the high-memory-load task intensified such characteristics. The high-memory-load task outperformed in AD classification with an accuracy of 81.4% and MMSE prediction with a mean absolute error of 4.62. Discussion: The high-memory-load recall task is an effective method for speech-based Alzheimer's disease detection.

19.

Speech disorders in patients with Tongue squamous cell carcinoma: A longitudinal observational study based on a questionnaire and acoustic analysis.

Guo, Kaixin; Xiao, Yudong; Deng, Wei; Zhao, Guiyi; Zhang, Jie; Liang, Yujie; Yang, Le; Liao, Guiqing.

BMC Oral Health ; 23(1): 192, 2023 04 01.

Artículo en Inglés | MEDLINE | ID: mdl-37005608

RESUMEN

BACKGROUND: Speech disorders are common dysfunctions in patients with tongue squamous cell carcinoma (TSCC) that can diminish their quality of life. There are few studies with multidimensional and longitudinal assessments of speech function in TSCC patients. METHODS: This longitudinal observational study was conducted at the Hospital of Stomatology, Sun Yat-sen University, China, from January 2018 to March 2021. A cohort of 92 patients (53 males, age range: 24-77 years) diagnosed with TSCC participated in this study. Speech function was assessed from preoperatively to one year postoperatively using the Speech Handicap Index questionnaire and acoustic parameters. The risk factors for postoperative speech disorder were analyzed by a linear mixed-effects model. A t test or MannâWhitney U test was applied to analyze the differences in acoustic parameters under the influence of risk factors to determine the pathophysiological mechanisms of speech disorders in patients with TSCC. RESULTS: The incidence of preoperative speech disorders was 58.7%, which increased up to 91.4% after surgery. Higher T stage (Pï¼0.001) and larger range of tongue resection (P = 0.002) were risk factors for postoperative speech disorders. Among the acoustic parameters, F2/i/decreased remarkably with higher T stage (P = 0.021) and larger range of tongue resection (P = 0.009), indicating restricted tongue movement in the anterior-posterior direction. The acoustic parameters analysis during the follow-up period showed that F1 and F2 were not significantly different of the patients with subtotal or total glossectomy over time. CONCLUSIONS: Speech disorders in TSCC patients is common and persistent. Less residual tongue volume led to worse speech-related QoL, indicating that surgically restoring the length of the tongue and strengthening tongue extension postoperatively may be important.

Asunto(s)

Carcinoma de Células Escamosas , Neoplasias de la Lengua , Masculino , Humanos , Adulto Joven , Adulto , Persona de Mediana Edad , Anciano , Neoplasias de la Lengua/complicaciones , Neoplasias de la Lengua/cirugía , Neoplasias de la Lengua/patología , Carcinoma de Células Escamosas/complicaciones , Carcinoma de Células Escamosas/cirugía , Carcinoma de Células Escamosas/patología , Calidad de Vida , Lengua , Trastornos del Habla/etiología , Acústica

20.

Staffs' physiological responses to irrelevant background speech and mental workload in open-plan bank office workspaces.

Golmohammadi, Rostam; Motlagh, Masoud Shafiee; Aliabadi, Mohsen; Faradmal, Javad; Ranjbar, Akram.

Work ; 76(2): 623-636, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-36938764

RESUMEN

BACKGROUND: Acoustic comfort is one of the most critical challenges in the open-plan workspace. OBJECTIVE: This study was aimed to assess the effect of irrelevant background speech (IBS) and mental workload (MWL) on staffs' physiological parameters in open-plan bank office workspaces. METHODS: In this study, 109 male cashier staff of the banks were randomly selected. The 30-minute equivalent noise level (LAeq) of the participants was measured in three intervals at the beginning (section A), middle (section B), and end of working hours (section C). The heart rate (HR) and heart rate variability (HRV): low frequency (LF), high frequency (HF), and LF/HF of the staff were also recorded in sections A, B, and C. Moreover, staff was asked to rate the MWL using the NASA-Task load. RESULTS: The dominant frequency of the LAeq was 500âHz, and the LAeq in the frequency range of 250 to 2000 was higher than other frequencies. The LAeq (500âHz) was 55.82, 69.35, and 69.64âdB(A) in sections A, B, and C, respectively. The results show that the IBS affects staffs' physiological responses so that with increasing in IBS, the HF power decreases. Moreover, with higher MWL, increasing noise exposure, especially IBS, causes more increases in LF power and LF/HF ratio. CONCLUSION: It seems that the IBS can affect physiological responses and increase staff stress in open-plan bank office workspaces. Moreover, the mental workload can intensify these consequences in these working settings.

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA