ABSTRACT
INTRODUCTION: Screening for Alzheimer's disease neuropathologic change (ADNC) in individuals with atypical presentations is challenging but essential for clinical management. We trained automatic speech-based classifiers to distinguish frontotemporal dementia (FTD) patients with ADNC from those with frontotemporal lobar degeneration (FTLD). METHODS: We trained automatic classifiers with 99 speech features from 1-minute speech samples of 179 participants (ADNC = 36, FTLD = 60, healthy controls [HC] = 83). Patients' pathology was assigned based on autopsy or cerebrospinal fluid analytes. Structural network-based magnetic resonance imaging analyses identified anatomical correlates of distinct speech features. RESULTS: Our classifier showed 0.88 ± 0.03 area under the curve (AUC) for ADNC versus FTLD and 0.93 ± 0.04 AUC for patients versus HC. Noun frequency and pause rate correlated with gray matter volume loss in the limbic and salience networks, respectively. DISCUSSION: Brief naturalistic speech samples can be used for screening FTD patients for underlying ADNC in vivo. This work supports the future development of digital assessment tools for FTD. HIGHLIGHTS: We trained machine learning classifiers for frontotemporal dementia patients using natural speech. We grouped participants by neuropathological diagnosis (autopsy) or cerebrospinal fluid biomarkers. Classifiers distinguished underlying pathology (Alzheimer's disease vs. frontotemporal lobar degeneration) in patients well. We identified important features through an explainable artificial intelligence approach. This work lays the groundwork for a speech-based neuropathology screening tool.
Subjects
Alzheimer Disease, Frontotemporal Dementia, Magnetic Resonance Imaging, Speech, Humans, Female, Alzheimer Disease/pathology, Male, Aged, Frontotemporal Dementia/pathology, Speech/physiology, Middle Aged, Phenotype, Frontotemporal Lobar Degeneration/pathology, Machine Learning
ABSTRACT
BACKGROUND: Delirium is a critically underdiagnosed syndrome of altered mental status affecting more than 50% of older adults admitted to hospital. Few studies have incorporated speech and language disturbance in delirium detection. We sought to describe speech and language disturbances in delirium, and provide a proof of concept for detecting delirium using computational speech and language features. METHODS: Participants underwent delirium assessment and completed language tasks. Speech and language disturbances were rated using standardized clinical scales. Recordings and transcripts were processed using an automated pipeline to extract acoustic and textual features. We used binomial elastic net machine learning models to predict delirium status. RESULTS: We included 33 older adults admitted to hospital, of whom 10 met criteria for delirium. The group with delirium scored higher on total language disturbances and incoherence, and lower on category fluency. Both groups scored lower on category fluency than the normative population. Cognitive dysfunction as a continuous measure was correlated with higher total language disturbance, incoherence, loss of goal and lower category fluency. Including computational language features in the model predicting delirium status increased accuracy to 78%. LIMITATIONS: This was a proof-of-concept study with limited sample size, without a set-aside cross-validation sample. Subsequent studies are needed before establishing a generalizable model for detecting delirium. CONCLUSION: Language impairments were elevated among patients with delirium and may also be used to identify subthreshold cognitive disturbances. Computational speech and language features are promising as accurate, noninvasive and efficient biomarkers of delirium.
Subjects
Cognitive Dysfunction, Delirium, Humans, Aged, Speech, Language, Cognitive Dysfunction/diagnosis, Delirium/diagnosis
ABSTRACT
PURPOSE: Multiple methods have been suggested for quantifying syntactic complexity in speech. We compared eight automated syntactic complexity metrics to determine which best captured verified syntactic differences between old and young adults. METHOD: We used natural speech samples produced in a picture description task by younger (n = 76, ages 18-22 years) and older (n = 36, ages 53-89 years) healthy participants, manually transcribed and segmented into sentences. We manually verified that older participants produced fewer complex structures. We developed a metric of syntactic complexity using automatically extracted syntactic structures as features in a multidimensional metric. We compared our metric to seven other metrics: Yngve score, Frazier score, Frazier-Roark score, developmental level, syntactic frequency, mean dependency distance, and sentence length. We examined the success of each metric in identifying the age group using logistic regression models. We repeated the analysis with automatic transcription and segmentation using an automatic speech recognition (ASR) system. RESULTS: Our multidimensional metric was successful in predicting age group (area under the curve [AUC] = 0.87), and it performed better than the other metrics. High AUCs were also achieved by the Yngve score (0.84) and sentence length (0.84). However, in a fully automated pipeline with ASR, the performance of these two metrics dropped (to 0.73 and 0.46, respectively), while the performance of the multidimensional metric remained relatively high (0.81). CONCLUSIONS: Syntactic complexity in spontaneous speech can be quantified by directly assessing syntactic structures and considering them in a multivariable manner. It can be derived automatically, saving considerable time and effort compared to manually analyzing large-scale corpora, while maintaining high face validity and robustness. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.24964179.
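One of the comparison metrics named above, mean dependency distance, is simple enough to illustrate directly. The following is a minimal sketch, not the authors' implementation: it assumes each sentence has already been parsed and is given as a list of 1-based head indices (0 for the root), and the function name and example parse are hypothetical.

```python
from typing import List


def mean_dependency_distance(heads: List[int]) -> float:
    """Average linear distance between each token and its syntactic head.

    heads[i] is the 1-based position of the head of token i+1;
    0 marks the root, which has no head and is skipped.
    (Input convention is an assumption made for this sketch.)
    """
    distances = [
        abs(head - position)
        for position, head in enumerate(heads, start=1)
        if head != 0
    ]
    if not distances:
        return 0.0
    return sum(distances) / len(distances)


# Hypothetical parse of "The boy takes a cookie":
# The -> boy (2), boy -> takes (3), takes -> root (0),
# a -> cookie (5), cookie -> takes (3)
example_heads = [2, 3, 0, 5, 3]
mdd = mean_dependency_distance(example_heads)
```

Longer dependencies (for example, in embedded clauses) raise the value, which is why the measure is used as a proxy for syntactic complexity.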
Subjects
Speech Perception, Speech, Young Adult, Humans, Area Under Curve
ABSTRACT
OBJECTIVE: To evaluate automated digital speech measures, derived from spontaneous speech (picture descriptions), in assessing bulbar motor impairments in patients with ALS-FTD spectrum disorders (ALS-FTSD). METHODS: Automated vowel algorithms were employed to extract two vowel acoustic measures: vowel space area (VSA) and mean second formant slope (F2 slope). Vowel measures were compared between ALS with and without clinical bulbar symptoms (ALS + bulbar [n = 49; ALSFRS-r bulbar subscore: mean = 9.8, SD = 1.7] vs. ALS-nonbulbar [n = 23]), behavioral variant frontotemporal dementia (bvFTD, n = 25) without a motor syndrome, and healthy controls (HC, n = 32). Correlations with bulbar motor clinical scales, perceived listener effort, and MRI cortical thickness of the orobuccal primary motor cortex (oral PMC) were examined. We compared vowel measures to speaking rate, a conventional metric for assessing bulbar dysfunction. RESULTS: ALS + bulbar had significantly lower VSA and F2 slope than ALS-nonbulbar (|d| = 0.94 and |d| = 1.04, respectively), bvFTD (|d| = 0.89 and |d| = 1.47), and HC (|d| = 0.73 and |d| = 0.99). These reductions correlated with worse bulbar clinical scores (VSA: R = 0.33, p = 0.043; F2 slope: R = 0.38, p = 0.011), greater listener effort (VSA: R = -0.43, p = 0.041; F2 slope: p > 0.05), and cortical thinning in oral PMC (F2 slope: β = 0.0026, p = 0.017). Vowel measures demonstrated greater sensitivity and specificity for bulbar impairment than speaking rate, while showing independence from cognitive and respiratory impairments. CONCLUSION: Automatic vowel measures are easily derived from a brief spontaneous speech sample, are sensitive to the mild-to-moderate stage of bulbar disease in ALS-FTSD, and may offer better sensitivity to bulbar impairment than traditional assessments such as speaking rate.
Subjects
Amyotrophic Lateral Sclerosis, Dystonic Disorders, Frontotemporal Dementia, Humans, Frontotemporal Dementia/diagnosis, Frontotemporal Dementia/diagnostic imaging, Amyotrophic Lateral Sclerosis/complications, Amyotrophic Lateral Sclerosis/diagnosis, Speech, Magnetic Resonance Imaging
ABSTRACT
Introduction: Depression and its components significantly impact dementia prediction and severity, necessitating reliable objective measures for quantification. Methods: We investigated associations between emotion-based speech measures (valence, arousal, and dominance) during picture descriptions and depression dimensions derived from the Geriatric Depression Scale (GDS): dysphoria, withdrawal-apathy-vigor (WAV), anxiety, hopelessness, and subjective memory complaint. Results: Higher WAV was associated with more negative valence (estimate = -0.133, p = 0.030). While interactions of apolipoprotein E (APOE) ε4 status with depression dimensions on emotional valence did not reach significance, there was a trend for more negative valence with higher dysphoria in those with at least one APOE4 allele (estimate = -0.404, p = 0.0846). Associations were similar irrespective of dementia severity. Discussion: Our study underscores the potential utility of speech biomarkers in characterizing depression dimensions. In future research, using emotionally charged stimuli may enhance emotional measure elicitation. The role of APOE in the interaction of speech markers and depression dimensions warrants further exploration with larger sample sizes. Highlights: Participants reporting higher apathy used more negative words to describe a neutral picture. Those with higher dysphoria and at least one APOE4 allele also tended to use more negative words. Our results suggest the potential use of speech biomarkers in characterizing depression dimensions.
ABSTRACT
BACKGROUND AND OBJECTIVES: Clinical trials developing therapeutics for frontotemporal degeneration (FTD) focus on pathogenic variant carriers at preclinical stages. Objective, quantitative clinical assessment tools are needed to track stability and delayed disease onset. Natural speech can serve as an accessible, cost-effective assessment tool. We aimed to identify early changes in the natural speech of FTD pathogenic variant carriers before they become symptomatic. METHODS: In this cohort study, speech samples of picture descriptions were collected longitudinally from healthy participants in observational studies at the University of Pennsylvania and Columbia University between 2007 and 2020. Participants were asymptomatic but at risk for familial FTD. Status as "carrier" or "noncarrier" was based on screening for known pathogenic variants in the participant's family. Thirty previously validated digital speech measures derived from automatic speech processing pipelines were selected a priori based on previous studies in patients with FTD and compared between asymptomatic carriers and noncarriers cross-sectionally and longitudinally. RESULTS: A total of 105 participants, all asymptomatic, included 41 carriers: 12 men [30%], mean age 43 ± 13 years; education, 16 ± 2 years; MMSE 29 ± 1; and 64 noncarriers: 27 men [42%]; mean age, 48 ± 14 years; education, 15 ± 3 years; MMSE 29 ± 1. We identified 4 speech measures that differed between carriers and noncarriers at baseline: mean speech segment duration (mean difference -0.28 seconds, 95% CI -0.55 to -0.02, p = 0.04); word frequency (mean difference 0.07, 95% CI 0.008-0.14, p = 0.03); word ambiguity (mean difference 0.02, 95% CI 0.0008-0.05, p = 0.04); and interjection count per 100 words (mean difference 0.33, 95% CI 0.07-0.59, p = 0.01). 
Three speech measures deteriorated over time in carriers only: particle count per 100 words per month (β = -0.02, 95% CI -0.03 to -0.004, p = 0.009); total narrative production time in seconds per month (β = -0.24, 95% CI -0.37 to -0.12, p < 0.001); and total number of words per month (β = -0.48, 95% CI -0.78 to -0.19, p = 0.002), including in 3 carriers who later converted to symptomatic disease. DISCUSSION: Using automatic processing pipelines, we identified early changes in the natural speech of FTD pathogenic variant carriers in the presymptomatic stage. These findings highlight the potential utility of natural speech as a digital clinical outcome assessment tool in FTD, where objective and quantifiable measures for abnormal behavior and language are lacking.
Subjects
Frontotemporal Dementia, Adult, Humans, Male, Middle Aged, Atrophy, Cohort Studies, Educational Status, Frontotemporal Dementia/genetics, Speech, Female, Observational Studies as Topic
ABSTRACT
Hybridization probes have been used in the detection of specific nucleic acids for the last 50 years. Despite extensive efforts and their great significance, commonly used probes still face three challenges: (1) low selectivity in detecting single nucleotide variations (SNV) at low temperatures (e.g., room temperature or 37 °C); (2) low affinity in binding folded nucleic acids; and (3) the cost of fluorescent probes. Here we introduce a multicomponent hybridization probe, called the OWL2 sensor, which addresses all three issues. The OWL2 sensor uses two analyte-binding arms to tightly bind and unwind folded analytes, and two sequence-specific strands that bind both the analyte and a universal molecular beacon (UMB) probe to form a fluorescent 'OWL' structure. The OWL2 sensor was able to differentiate single base mismatches in folded analytes in the temperature range of 5-38 °C. The design is cost-efficient since the same UMB probe can be used for detecting any analyte sequence.
Subjects
Nanostructures, Nucleic Acids, Nucleic Acid Hybridization, Nucleotides, Oligonucleotide Probes/chemistry
ABSTRACT
Introduction: Category and letter fluency tasks are commonly used neuropsychological tasks to evaluate lexical retrieval. Methods: This study used validated automated methods, which allow for more expansive investigation, to analyze speech production of both category ("Animal") and letter ("F") fluency tasks produced by healthy participants (n = 36) on an online platform. Recordings were transcribed and analyzed through automated pipelines, which utilized natural language processing and automatic acoustic processing tools. Automated pipelines calculated overall performance scores, mean inter-word response time (RT), and word start time; errors were excluded from analysis. Each word was rated for age of acquisition (AoA), ambiguity, concreteness, frequency, familiarity, word length, word duration, and phonetic and semantic distance from its previous word. Results: Participants produced significantly more words on the category fluency task relative to the letter fluency task (p < 0.001), which is in line with previous studies. Wilcoxon tests also showed that the tasks differed on several mean speech measures of words: category fluency was associated with lower mean AoA (p < 0.001), lower frequency (p < 0.001), lower semantic ambiguity (p < 0.001), lower semantic distance (p < 0.001), lower mean inter-word RT (p = 0.03), higher concreteness (p < 0.001), and higher familiarity (p = 0.02), compared to letter fluency. ANOVAs showed significant interactions of fluency task with total score and lexical measures: lower category fluency scores were significantly related to lower AoA and higher prevalence, and this was not observed for letter fluency scores. Finally, word characteristics changed over time and significant interactions were noted between the tasks, including word familiarity (p = 0.019), semantic ambiguity (p = 0.002), semantic distance (p = 0.001), and word duration (p < 0.001).
Discussion: These findings showed that certain lexical measures such as AoA, word familiarity, and semantic ambiguity were important for understanding how these tasks differ. Additionally, we found that acoustic measures such as inter-word RT and word duration are also important to analyze when comparing the two tasks. By implementing these automated techniques, which are reproducible and scalable, to analyze fluency tasks, we were able to quickly detect these differences. In future clinical settings, we expect these methods to expand our knowledge of speech feature differences that impact not only total scores, but many other speech measures among clinical populations.
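To make the timing measure in this abstract concrete, here is a minimal sketch of a mean inter-word response time. It assumes, as a simplification not stated in the abstract, that inter-word RT is the silent gap between the offset of one word and the onset of the next, and that word boundaries are already available as (onset, offset) pairs in seconds; the function name and example values are hypothetical.

```python
from typing import List, Tuple


def mean_inter_word_rt(word_times: List[Tuple[float, float]]) -> float:
    """Mean gap (seconds) between consecutive words.

    word_times holds (onset, offset) pairs in chronological order.
    The gap definition (next onset minus previous offset) is an
    assumption made for this sketch.
    """
    gaps = [
        next_onset - prev_offset
        for (_, prev_offset), (next_onset, _) in zip(word_times, word_times[1:])
    ]
    if not gaps:
        return 0.0
    return sum(gaps) / len(gaps)


# Hypothetical alignments for three words in a category fluency response.
animal_words = [(0.5, 0.9), (1.4, 1.9), (3.0, 3.4)]
animal_rt = mean_inter_word_rt(animal_words)
```

Computing the same quantity separately over early and late portions of a response would give one simple view of how retrieval speed changes over time within a task.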
ABSTRACT
BACKGROUND AND HYPOTHESIS: Quantitative acoustic and textual measures derived from speech ("speech features") may provide valuable biomarkers for psychiatric disorders, particularly schizophrenia spectrum disorders (SSD). We sought to identify cross-diagnostic latent factors for speech disturbance with relevance for SSD and computational modeling. STUDY DESIGN: Clinical ratings for speech disturbance were generated across 14 items for a cross-diagnostic sample (N = 334), including SSD (n = 90). Speech features were quantified using an automated pipeline for brief recorded samples of free speech. Factor models for the clinical ratings were generated using exploratory factor analysis, then tested with confirmatory factor analysis in the cross-diagnostic and SSD groups. The relationships between factor scores and computational speech features were examined for 202 of the participants. STUDY RESULTS: We found a 3-factor model with a good fit in the cross-diagnostic group and an acceptable fit for the SSD subsample. The model identifies an impaired expressivity factor and 2 interrelated disorganized factors for inefficient and incoherent speech. Incoherent speech was specific to psychosis groups, while inefficient speech and impaired expressivity showed intermediate effects in people with nonpsychotic disorders. Each of the 3 factors had significant and distinct relationships with speech features, which differed for the cross-diagnostic vs SSD groups. CONCLUSIONS: We report a cross-diagnostic 3-factor model for speech disturbance which is supported by good statistical measures, intuitive, applicable to SSD, and relatable to linguistic theories. It provides a valuable framework for understanding speech disturbance and appropriate targets for modeling with quantitative speech features.
Subjects
Psychotic Disorders, Schizophrenia, Humans, Speech, Language, Schizophrenia/complications, Psychotic Disorders/complications, Factor Analysis
ABSTRACT
BACKGROUND: Autistic girls are underdiagnosed compared to autistic boys, even when they experience similar clinical impact. Research suggests that girls present with distinct symptom profiles across a variety of domains, such as language, which may contribute to their underdiagnosis. In this study, we examine sex differences in the temporal dynamics of natural conversations between naïve adult confederates and school-aged children with or without autism, with the goal of improving our understanding of conversational behavior in autistic girls and ultimately improving identification. METHODS: Forty-five school-aged children with autism (29 boys and 16 girls) and 47 non-autistic/neurotypical (NT) children (23 boys and 24 girls) engaged in a 5-min "get-to-know-you" conversation with a young adult confederate who was unaware of the children's diagnostic status. Groups were matched on IQ estimates. Recordings were time-aligned and orthographically transcribed by trained annotators. Several speech and pause measures were calculated. Groups were compared using analysis of covariance models, controlling for age. RESULTS: Autistic girls used significantly more words than autistic boys, and produced longer speech segments than all other groups. Autistic boys spoke more slowly than NT children, whereas autistic girls did not differ from NT children in total word counts or speaking rate. Autistic boys interrupted confederates' speech less often and produced longer between-turn pauses (i.e., responded more slowly when it was their turn) compared to other children. Within-turn pause duration did not differ by group. LIMITATIONS: Our sample included verbally fluent children and adolescents aged 6-15 years, so our study results may not replicate in samples of younger children, adults, and individuals who are not verbally fluent. The results of this relatively small study, while compelling, should be interpreted with caution and replicated in a larger sample.
CONCLUSION: This study investigated the temporal dynamics of everyday conversations and demonstrated that autistic girls and boys have distinct natural language profiles. Specifying differences in verbal communication lays the groundwork for the development of sensitive screening and diagnostic tools to more accurately identify autistic girls, and could inform future personalized interventions that improve short- and long-term social communication outcomes for all autistic children.
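The between-turn pause and interruption measures described above can be illustrated with time-aligned turns. This is a rough sketch under stated assumptions rather than the study's procedure: turns are (speaker, start, end) tuples in seconds, an interruption is counted when the child starts before the confederate's turn has ended, and the speaker labels and function name are hypothetical.

```python
from typing import List, Tuple

Turn = Tuple[str, float, float]  # (speaker, start_s, end_s)


def child_turn_transitions(turns: List[Turn], child: str = "child") -> Tuple[float, int]:
    """Return (mean between-turn pause in seconds, interruption count) for the child.

    Only transitions from another speaker to the child are considered.
    Overlap at the transition is counted as an interruption; otherwise
    the silent gap is counted as a between-turn pause. These operational
    definitions are assumptions made for this sketch.
    """
    gaps: List[float] = []
    interruptions = 0
    for (prev_speaker, _, prev_end), (speaker, start, _) in zip(turns, turns[1:]):
        if speaker != child or prev_speaker == child:
            continue
        if start < prev_end:
            interruptions += 1
        else:
            gaps.append(start - prev_end)
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return mean_gap, interruptions


# Hypothetical excerpt of a time-aligned conversation.
conversation = [
    ("confederate", 0.0, 2.0),
    ("child", 2.5, 4.0),
    ("confederate", 4.2, 6.0),
    ("child", 5.8, 7.0),   # starts before the confederate finishes
    ("confederate", 7.5, 9.0),
    ("child", 10.0, 11.0),
]
mean_pause, n_interruptions = child_turn_transitions(conversation)
```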
Subjects
Autistic Disorder, Adolescent, Humans, Child, Male, Female, Autistic Disorder/diagnosis, Sex Characteristics, Communication, Language, Speech
ABSTRACT
Background and objectives: Patients with ALS-FTD spectrum disorders (ALS-FTSD) have mixed motor and cognitive impairments and require valid and quantitative assessment tools to support diagnosis and tracking of bulbar motor disease. This study aimed to validate a novel automated digital speech tool that analyzes vowel acoustics from natural, connected speech as a marker for impaired articulation due to bulbar motor disease in ALS-FTSD. Methods: We used an automatic algorithm called Forced Alignment Vowel Extraction (FAVE) to detect spoken vowels and extract vowel acoustics from 1-minute audio-recorded picture descriptions. Using automated acoustic analysis scripts, we derived two articulatory-acoustic measures: vowel space area (VSA, in Bark²), which represents tongue range-of-motion (size), and average second formant slope of vowel trajectories (F2 slope), which represents tongue movement speed. We compared vowel measures between ALS with and without clinically evident bulbar motor disease (ALS+bulbar vs. ALS-nonbulbar), behavioral variant frontotemporal dementia (bvFTD) without a motor syndrome, and healthy controls (HC). We correlated impaired vowel measures with bulbar disease severity, estimated by clinical bulbar scores and perceived listener effort, and with MRI cortical thickness of the orobuccal part of the primary motor cortex innervating the tongue (oralPMC). We also tested correlations with respiratory capacity and cognitive impairment. Results: Participants were 45 ALS+bulbar (30 males, mean age=61±11), 22 ALS-nonbulbar (11 males, age=62±10), 22 bvFTD (13 males, age=63±7), and 34 HC (14 males, age=69±8). ALS+bulbar had smaller VSA and shallower average F2 slopes than ALS-nonbulbar (VSA: |d|=0.86, p=0.0088; F2 slope: |d|=0.98, p=0.0054), bvFTD (VSA: |d|=0.67, p=0.043; F2 slope: |d|=1.4, p<0.001), and HC (VSA: |d|=0.73, p=0.024; F2 slope: |d|=1.0, p<0.001).
Vowel measures declined with worsening bulbar clinical scores (VSA: R=0.33, p=0.033; F2 slope: R=0.25, p=0.048), and smaller VSA was associated with greater listener effort (R=-0.43, p=0.041). Shallower F2 slopes were related to cortical thinning in oralPMC (R=0.50, p=0.03). Neither vowel measure was associated with respiratory or cognitive test scores. Conclusions: Vowel measures extracted with automatic processing from natural speech are sensitive to bulbar motor disease in ALS-FTD and are robust to cognitive impairment.
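As an illustration of how a vowel space area in Bark² can be obtained, the sketch below converts corner-vowel formants from Hz to Bark and takes the area of the resulting polygon with the shoelace formula. It is not the authors' pipeline: the Hz-to-Bark conversion shown is Traunmüller's (1990) approximation, chosen here as an assumption, and the corner-vowel formant values are hypothetical.

```python
from typing import List, Tuple


def hz_to_bark(frequency_hz: float) -> float:
    """Traunmüller (1990) approximation of the Bark scale (assumed here)."""
    return 26.81 * frequency_hz / (1960.0 + frequency_hz) - 0.53


def polygon_area(vertices: List[Tuple[float, float]]) -> float:
    """Shoelace formula; vertices must be ordered around the perimeter."""
    twice_area = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def vowel_space_area(corner_formants_hz: List[Tuple[float, float]]) -> float:
    """Area (Bark^2) spanned by corner vowels given as (F1, F2) pairs in Hz."""
    points = [(hz_to_bark(f2), hz_to_bark(f1)) for f1, f2 in corner_formants_hz]
    return polygon_area(points)


# Hypothetical mean (F1, F2) values in Hz for four corner vowels,
# listed in perimeter order: /i/, /ae/, /a/, /u/.
corner_vowels = [(300.0, 2300.0), (700.0, 1800.0), (750.0, 1100.0), (320.0, 900.0)]
vsa = vowel_space_area(corner_vowels)
```

A speaker with restricted tongue movement would produce corner vowels closer to the center of the formant plane, which shrinks this polygon.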
ABSTRACT
BACKGROUND AND OBJECTIVES: We compared digital speech and language features of patients with amnestic Alzheimer's disease (aAD) or logopenic variant primary progressive aphasia (lvPPA) in a biologically confirmed cohort and related these features to neuropsychiatric test scores and CSF analytes. METHODS: We included patients with aAD or lvPPA with cerebrospinal fluid (CSF; phosphorylated Tau [p-Tau]/Aβ ≥ 0.09 and total Tau/Aβ ≥ 0.34) or autopsy confirmation of AD pathology and age-matched healthy controls (HC) recruited at the Frontotemporal Degeneration Center of the University of Pennsylvania for a cross-sectional study. We extracted speech and language variables with automated lexical and acoustic pipelines from participants' oral picture descriptions. We compared the groups and correlated distinct features with clinical ratings and CSF p-Tau levels. RESULTS: We examined patients with aAD (n=44; 62±8 years; 24 females; Mini-Mental State Exam [MMSE]=21.1±4.8) or lvPPA (n=21; 64.1±8.2 years; 11 females; MMSE=23.0±4.2), and HC (n=28; 65.9±5.9 years; 15 females; MMSE=29±1). Patients with lvPPA produced fewer verbs (10.5±2.3; p=0.001) and adjectives (2.7±1.3; p=0.019), and more fillers (7.4±3.9; p=0.022), with lower lexical diversity (0.84±0.1; p=0.05) and higher pause rate (54.2±19.2; p=0.015) than aAD (verbs: 12.5±2; adjectives: 3.8±2; fillers: 4.9±4.5; lexical diversity: 0.87±0.1; pause rate: 45.3±12.8). Both groups showed some shared language impairments compared with HC. Word frequency (MMSE: β=-1.6, p=0.009; Boston Naming Test [BNT]: β=-4.36, p<0.001), adverbs (MMSE: β=-1.9, p=0.003; BNT: β=-2.41, p=0.041), pause rate (MMSE: β=-1.21, p=0.041; BNT: β=-2.09, p=0.041), and word length (MMSE: β=1.75, p=0.001; BNT: β=2.94, p=0.003) were significantly correlated with both MMSE and BNT, but other measures were not correlated with MMSE and/or BNT.
Prepositions (r=-0.36, p=0.019), nouns (r=-0.31, p=0.047), speech segment duration (r=-0.33, p=0.032), word frequency (r=0.33, p=0.036), and pause rate (r=0.34, p=0.026) were correlated with patients' CSF p-Tau levels. DISCUSSION: Our measures captured language and speech differences between the two phenotypes that traditional language-based clinical assessments failed to identify. This work demonstrates the potential of natural speech in reflecting underlying variants with AD pathology.
ABSTRACT
INTRODUCTION: An estimated 50% of patients with Lewy body dementias (LBD), including Parkinson's disease dementia (PDD) and Dementia with Lewy bodies (DLB), have co-occurring Alzheimer's disease (AD) that is associated with worse prognosis. This study tests an automated analysis of natural speech as an inexpensive, non-invasive screening tool for AD co-pathology in biologically-confirmed cohorts of LBD patients with AD co-pathology (SYN + AD) and without (SYN-AD). METHODS: We analyzed lexical-semantic and acoustic features of picture descriptions using automated methods in 22 SYN + AD and 38 SYN-AD patients stratified using AD CSF biomarkers or autopsy diagnosis. Speech markers of AD co-pathology were identified using best subset regression, and their diagnostic discrimination was tested using receiver operating characteristic. ANCOVAs compared measures between groups covarying for demographic differences and cognitive disease severity. We tested relations with CSF tau levels, and compared speech measures between PDD and DLB clinical disorders in the same cohort. RESULTS: Age of acquisition of nouns (p = 0.034, |d| = 0.77) and lexical density (p = 0.0064, |d| = 0.72) were reduced in SYN + AD, and together showed excellent discrimination for SYN + AD vs. SYN-AD (95% sensitivity, 66% specificity; AUC = 0.82). Lower lexical density was related to higher CSF t-Tau levels (R = -0.41, p = 0.0021). Clinically-diagnosed PDD vs. DLB did not differ on any speech features. CONCLUSION: AD co-pathology may result in a deviant natural speech profile in LBD characterized by specific lexical-semantic impairments, not detectable by clinical disorder diagnosis. Our study demonstrates the potential of automated digital speech analytics as a screening tool for underlying AD co-pathology in LBD.
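The discrimination statistics reported above (sensitivity, specificity, AUC) can be computed from any per-patient score. The sketch below is illustrative only and assumes a hypothetical classifier score where higher values indicate the positive group (here, SYN + AD); it computes AUC as the probability that a randomly chosen positive case outscores a randomly chosen negative case, with ties counted as one half. The scores and labels are invented for the example.

```python
from typing import List, Tuple


def sensitivity_specificity(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called positive.

    Assumes both classes (label 1 and label 0) are present.
    """
    pairs = list(zip(scores, labels))
    true_pos = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    false_neg = sum(1 for s, y in pairs if y == 1 and s < threshold)
    true_neg = sum(1 for s, y in pairs if y == 0 and s < threshold)
    false_pos = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores; label 1 = co-pathology present, 0 = absent.
example_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
example_labels = [1, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(example_scores, example_labels, threshold=0.5)
auc = roc_auc(example_scores, example_labels)
```

Lowering the threshold trades specificity for sensitivity, which is the usual choice for a screening tool.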
Subjects
Alzheimer Disease, Dementia, Lewy Body Disease, Parkinson Disease, Alzheimer Disease/pathology, Amyloid beta-Peptides, Biomarkers, Dementia/complications, Humans, Lewy Body Disease/complications, Parkinson Disease/psychology, Speech, alpha-Synuclein, tau Proteins
ABSTRACT
Prosody of patients with neurodegenerative disease is often impaired. We investigated changes to two prosodic cues in patients: the pitch contour and the duration of prepausal words. We analyzed recordings of picture descriptions produced by patients with neurodegenerative conditions that included either cognitive (n=223), motor (n=68), or mixed cognitive and motor impairments (n=109), and by healthy controls (n=28; HC). A speech activity detector identified pauses. Words were aligned to the acoustic signal; pitch values were normalized in scale and duration. Analyses of pitch showed that the ending (90th-100th percentile) of prepausal words had a lower pitch in the mixed and motor groups than the cognitive group and HC. The pitch contour from the midpoint of words to the end showed a steep rising slope for HC, but patients showed a gentle rising or flat slope. This suggests that HC signaled the continuation of their description after the pause with rising contour; patients either failed to keep describing the picture due to cognitive impairment or could not raise pitch due to motor impairments. Prepausal words showed longer duration relative to non-prepausal words with no significant differences between the groups. This suggests that prepausal lengthening is preserved in patients.
ABSTRACT
Graphical representations of speech generate powerful computational measures related to psychosis. Previous studies have mostly relied on structural relations between words as the basis of graph formation, i.e., connecting each word to the next in a sequence of words. Here, we introduced a method of graph formation grounded in semantic relationships by identifying elements that act upon each other (action relation) and the contents of those actions (predication relation). Speech from picture descriptions and open-ended narrative tasks was collected from a cross-diagnostic group of healthy volunteers and people with psychotic or non-psychotic disorders. Recordings were transcribed and underwent automated language processing, including semantic role labeling to identify action and predication relations. Structural and semantic graph features were computed using static and dynamic (moving-window) techniques. Compared to structural graphs, semantic graphs were more strongly correlated with dimensional psychosis symptoms. Dynamic features also outperformed static features, and samples from picture descriptions yielded larger effect sizes than narrative responses for psychosis diagnoses and symptom dimensions. Overall, semantic graphs captured unique and clinically meaningful information about psychosis and related symptom dimensions. These features, particularly when derived from semi-structured tasks using dynamic measurement, are meaningful additions to the repertoire of computational linguistic methods in psychiatry.
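The structural baseline described above (each word connected to the next) and the moving-window idea can be sketched in a few lines. This covers only the structural graph, since the semantic graphs depend on a semantic role labeler; the window and step sizes, function names, and example sentence are hypothetical choices for illustration.

```python
from typing import List, Set, Tuple


def structural_graph(words: List[str]) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Directed word graph: one node per unique word, one edge per
    unique (word, next word) pair in the sequence."""
    nodes = set(words)
    edges = {(a, b) for a, b in zip(words, words[1:])}
    return nodes, edges


def dynamic_graph_sizes(
    words: List[str], window: int, step: int
) -> List[Tuple[int, int]]:
    """(node count, edge count) for each moving window over the transcript.

    If the transcript is shorter than the window, a single graph over
    the whole transcript is returned.
    """
    sizes = []
    last_start = max(len(words) - window, 0)
    for start in range(0, last_start + 1, step):
        nodes, edges = structural_graph(words[start:start + window])
        sizes.append((len(nodes), len(edges)))
    return sizes


transcript = "the boy takes the cookie and the girl laughs".split()
static_nodes, static_edges = structural_graph(transcript)
window_sizes = dynamic_graph_sizes(transcript, window=5, step=2)
```

Static features summarize the single graph over the full transcript, whereas dynamic features summarize how the per-window values vary across the sample.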
ABSTRACT
Over the decades, fashions in Computational Linguistics have changed again and again, with major shifts in motivations, methods and applications. When digital computers first appeared, linguistic analysis adopted the new methods of information theory, which accorded well with the ideas that dominated psychology and philosophy. Then came formal language theory and the idea of AI as applied logic, in sync with the development of cognitive science. That was followed by a revival of 1950s-style empiricism (AI as applied statistics), which in turn was followed by the age of deep nets. There are signs that the climate is changing again, and we offer some thoughts about paths forward, especially for younger researchers who will soon be the leaders.
ABSTRACT
We implemented an automated analysis of lexical aspects of semi-structured speech produced by healthy elderly controls (n = 37) and three patient groups with frontotemporal degeneration (FTD): behavioral variant FTD (bvFTD, n = 74), semantic variant primary progressive aphasia (svPPA, n = 42), and nonfluent/agrammatic PPA (naPPA, n = 22). Based on previous findings, we hypothesized that the three patient groups and controls would differ in the counts of part-of-speech (POS) categories and several lexical measures. With a natural language processing program, we automatically tagged POS categories of all words produced during a picture description task. We further counted the number of wh-words, and we rated nouns for abstractness, ambiguity, frequency, familiarity, and age of acquisition. We also computed the cross-entropy estimation, where low cross-entropy indicates high predictability, and lexical diversity for each description. We validated a subset of the POS data that were automatically tagged with the Google Universal POS scheme using gold-standard POS data tagged by a linguist, and we found that the POS categories from our automated methods were more than 90% accurate. For svPPA patients, we found fewer unique nouns than in naPPA and more pronouns and wh-words than in the other groups. We also found high abstractness, ambiguity, frequency, and familiarity for nouns and the lowest cross-entropy estimation among all groups. These measures were associated with cortical thinning in the left temporal lobe. In naPPA patients, we found increased speech errors and partial words compared to controls, and these impairments were associated with cortical thinning in the left middle frontal gyrus. In bvFTD patients, adjective production was decreased compared to controls and was correlated with apathy scores. Their adjective production was associated with cortical thinning in the dorsolateral frontal and orbitofrontal gyri.
Our results demonstrate distinct language profiles in subgroups of FTD patients and validate our automated method of analyzing FTD patients' speech.
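The two description-level measures named in this abstract, lexical diversity and cross-entropy, can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the study: lexical diversity is computed as a plain type-token ratio, and cross-entropy is estimated under an add-one smoothed unigram model built from a reference word list. The function names `type_token_ratio` and `unigram_cross_entropy` are hypothetical, and the study's actual language model is not specified in the abstract.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Lexical diversity as unique word types divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def unigram_cross_entropy(tokens, reference_tokens):
    """Mean negative log2 probability of `tokens` under a unigram model.

    Assumption: the model is estimated from `reference_tokens` with
    add-one smoothing over the joint vocabulary. Lower values mean the
    description is more predictable given the reference.
    """
    if not tokens:
        return 0.0
    counts = Counter(reference_tokens)
    vocabulary = set(reference_tokens) | set(tokens)
    total = len(reference_tokens) + len(vocabulary)
    log_probs = (math.log2((counts[t] + 1) / total) for t in tokens)
    return -sum(log_probs) / len(tokens)


# Toy data, invented for illustration only.
reference = ["the", "boy", "the", "cookie"]
predictable = ["the", "the"]
unusual = ["stool", "sink"]
print(type_token_ratio(reference))
print(unigram_cross_entropy(predictable, reference))
print(unigram_cross_entropy(unusual, reference))
```

In this toy example the description made of frequent reference words receives a lower cross-entropy than the one made of unseen words, which matches the abstract's reading that low cross-entropy indicates high predictability.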
Subjects
Primary Progressive Aphasia , Frontotemporal Dementia , Aged , Primary Progressive Aphasia/diagnostic imaging , Atrophy , Humans , Language , Magnetic Resonance Imaging , Semantics , Speech
ABSTRACT
Purpose: This study examines the effect of age on language use with an automated analysis of digitized speech obtained from semistructured, narrative speech samples. Method: We examined the Cookie Theft picture descriptions produced by 37 older and 76 young healthy participants. Using modern natural language processing and automatic speech recognition tools, we automatically annotated part-of-speech categories of all tokens, calculated the number of tense-inflected verbs, mean length of clause, and vocabulary diversity, and we rated nouns and verbs for five lexical features: word frequency, familiarity, concreteness, age of acquisition, and semantic ambiguity. We also segmented the speech signals into speech and silence and calculated acoustic features, such as total speech time, mean speech and pause segment durations, and pitch values. Results: Older speakers produced significantly more fillers, pronouns, and verbs and fewer conjunctions, determiners, nouns, and prepositions than young participants. Older speakers' nouns and verbs were more familiar, more frequent (verbs only), and less ambiguous compared to those of young speakers. Older speakers produced shorter clauses with a lower vocabulary diversity than young participants. They also produced shorter speech segments and longer pauses with increased total speech time and total number of words. Lastly, we observed an interaction of age and sex in pitch ranges. Conclusions: Our results suggest that older speakers' lexical content is less diverse, and these speakers produce shorter clauses than young participants in monologic, narrative speech. Our findings show that lexical and acoustic characteristics of semistructured speech samples can be examined with automated methods.
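The duration-based acoustic features mentioned in this abstract (total speech time and mean speech and pause segment durations) can be sketched in Python as follows. The sketch assumes that a speech activity detector has already produced labeled segments as `(label, start, end)` tuples in seconds; that input format and the function name `summarize_segments` are hypothetical and not taken from the study.

```python
from statistics import mean


def summarize_segments(segments):
    """Summarize durations of labeled speech and pause segments.

    `segments` is an iterable of (label, start_sec, end_sec) tuples,
    with label "speech" or "pause" (an assumed input format).
    """
    speech = [end - start for label, start, end in segments if label == "speech"]
    pause = [end - start for label, start, end in segments if label == "pause"]
    return {
        "total_speech_time": sum(speech),
        "mean_speech_duration": mean(speech) if speech else 0.0,
        "mean_pause_duration": mean(pause) if pause else 0.0,
    }


# Toy segmentation, invented for illustration only.
example = [("speech", 0.0, 2.0), ("pause", 2.0, 3.0), ("speech", 3.0, 4.0)]
print(summarize_segments(example))
```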
Subjects
Speech , Vocabulary , Acoustics , Adult , Humans , Language , Semantics
ABSTRACT
The letter-guided naming fluency task is a measure of an individual's executive function and working memory. This study employed a novel, automated, quantifiable, and reproducible method to investigate how language characteristics of words produced during a fluency task are related to fluency performance, inter-word response time (RT), and over task duration using digitized F-letter-guided fluency recordings produced by 76 young healthy participants. Our automated algorithm counted the number of correct responses from the transcripts of the F-letter fluency data, and individual words were rated for concreteness, ambiguity, frequency, familiarity, and age of acquisition (AoA). Using a forced aligner, the transcripts were automatically aligned with the corresponding audio recordings. We measured inter-word RT, word duration, and word start time from the forced alignments. Articulation rate was also computed. Phonetic and semantic distances between two consecutive F-letter words were measured. We found that total F-letter score was significantly correlated with the mean values of word frequency, familiarity, AoA, word duration, phonetic similarity, and articulation rate; total score was also correlated with an individual's standard deviation of AoA, familiarity, and phonetic similarity. RT was negatively correlated with frequency and ambiguity of F-letter words and was positively correlated with AoA, number of phonemes, and phonetic and semantic distances. Lastly, the frequency, ambiguity, AoA, number of phonemes, and semantic distance of words produced significantly changed over time during the task. The method employed in this paper demonstrates the successful implementation of our automated language processing pipelines in a standardized neuropsychological task. This novel approach captures subtle and rich language characteristics during test performance that enhance informativeness and cannot be extracted manually without massive effort. 
This work will serve as the reference for letter-guided category fluency production similarly acquired in neurodegenerative patients.
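The timing measures derived from forced alignments in this abstract (word duration and inter-word RT) can be sketched in Python. The sketch assumes the aligner output has been reduced to `(word, start, end)` tuples in seconds, and that inter-word RT is the gap between the end of one word and the start of the next. Both the input format and that definition are assumptions, since the abstract does not give the exact definition, and the function name `inter_word_measures` is hypothetical.

```python
def inter_word_measures(alignment):
    """Return (word_durations, inter_word_rts) from a forced alignment.

    `alignment` is a time-ordered list of (word, start_sec, end_sec).
    Assumption: RT is the silent gap from the end of one word to the
    start of the next, so there is one fewer RT than there are words.
    """
    durations = [end - start for _, start, end in alignment]
    rts = [nxt[1] - cur[2] for cur, nxt in zip(alignment, alignment[1:])]
    return durations, rts


# Toy alignment of F-letter responses, invented for illustration only.
example = [("fish", 1.0, 1.5), ("fork", 3.0, 3.5), ("farm", 4.0, 4.75)]
durations, rts = inter_word_measures(example)
print(durations)
print(rts)
```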
ABSTRACT
BACKGROUND: Progressive supranuclear palsy syndrome (PSPS) and corticobasal syndrome (CBS) as well as non-fluent/agrammatic primary progressive aphasia (naPPA) are often associated with misfolded 4-repeat tau pathology, but the diversity of the associated speech features is poorly understood. OBJECTIVE: Investigate the full range of acoustic and lexical properties of speech to test the hypothesis that PSPS-CBS show a subset of speech impairments found in naPPA. METHODS: Acoustic and lexical measures, extracted from natural, digitized semi-structured speech samples using novel, automated methods, were compared in PSPS-CBS (n = 87), naPPA (n = 25), and healthy controls (HC, n = 41). We related these measures to grammatical performance and speech fluency, core features of naPPA, to neuropsychological measures of naming, executive, memory and visuoconstructional functioning, and to cerebrospinal fluid (CSF) phosphorylated tau (pTau) levels in patients with available biofluid analytes. RESULTS: Compared to HC, both naPPA and PSPS-CBS speech showed shorter speech segments, longer pauses, higher pause rates, reduced fundamental frequency (f0) pitch ranges, and slower speech rate. naPPA speech was distinct from PSPS-CBS with shorter speech segments, more frequent pauses, slower speech rate, reduced verb production, and higher partial word production. In both groups, acoustic duration measures generally correlated with speech fluency, measured as words per minute, and grammatical performance. Speech measures did not correlate with standard neuropsychological measures. CSF pTau levels correlated with f0 range in PSPS-CBS and naPPA. CONCLUSION: Lexical and acoustic speech features of PSPS-CBS overlap those of naPPA and are related to CSF pTau levels.