Results 1 - 20 of 153
1.
Neurol Sci ; 2024 Oct 25.
Article in English | MEDLINE | ID: mdl-39453558

ABSTRACT

Progressive supranuclear palsy (PSP) is a neurodegenerative disease with pathological hallmarks and different clinical presentations. Recently, the Movement Disorder Society (MDS) proposed a new classification in which specific combinations of the core clinical features identify different phenotypes, including PSP with Richardson's syndrome (PSP-RS) and PSP with predominant parkinsonism (PSP-P). Since speech disorders are very common in PSP, they were included in the MDS-PSP criteria as a supportive clinical feature in the form of hypokinetic, spastic dysarthria. However, little is known about how dysarthria presents across the different PSP variants. The aim of the present study was to evaluate differences in speech profile in a cohort of PSP-RS and PSP-P patients diagnosed according to the MDS-PSP criteria. Each patient underwent a neurological evaluation and perceptual and acoustic analysis of speech. Disease severity was rated using the Natural History and Neuroprotection in Parkinson plus syndromes-Parkinson plus scale (NNIPPS-PPS), including the global score and sub-scores. Twenty-five patients (mean disease duration [standard deviation] = 3.32 [1.79]) were classified as PSP-RS, while sixteen (mean disease duration [standard deviation] = 3.47 [2.00]) were classified as PSP-P. These subgroups had homogeneous demographic and clinical characteristics, including disease severity as quantified by the NNIPPS-PPS total score. Only the NNIPPS-PPS oculomotor function sub-score differed significantly, being more impaired in PSP-RS patients. No significant differences were found in any of the speech variables between the two groups. Speech evaluation is therefore not a distinguishing feature of PSP subtypes in mid-stage disease.
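
A minimal sketch of the kind of between-group comparison described above, using a nonparametric two-sample test; the group sizes match the cohort, but the feature values below are synthetic placeholders, not study data.

    # Compare one acoustic speech measure between PSP-RS and PSP-P groups.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    psp_rs = rng.normal(loc=4.0, scale=0.6, size=25)  # hypothetical speech rate, 25 PSP-RS patients
    psp_p = rng.normal(loc=4.1, scale=0.6, size=16)   # hypothetical speech rate, 16 PSP-P patients

    # Mann-Whitney U test: suited to small clinical samples with unknown distributions.
    u_stat, p_value = stats.mannwhitneyu(psp_rs, psp_p, alternative="two-sided")
    print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.3f}")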

2.
Acta Neurochir (Wien) ; 166(1): 369, 2024 Sep 16.
Article in English | MEDLINE | ID: mdl-39283500

ABSTRACT

BACKGROUND: Speech changes significantly impact the quality of life of Parkinson's disease (PD) patients. Deep Brain Stimulation (DBS) of the Subthalamic Nucleus (STN) is a standard treatment for advanced PD, but its effects on speech remain unclear. This study aimed to investigate the relationship between STN-DBS and speech changes in PD patients using comprehensive clinical assessments and tractography. METHODS: Forty-seven PD patients underwent STN-DBS, with preoperative and 3-month postoperative assessments. Speech analyses included acoustic measurements, auditory-perceptual evaluations, and fluency-intelligibility tests. In parallel, structures within the volume of tissue activated (VTA) were identified using MRI and DTI. Clinical and demographic data and the structures associated with the VTA (corticospinal tract, internal capsule, dentato-rubro-thalamic tract, medial forebrain bundle, medial lemniscus, substantia nigra, red nucleus) were compared with the speech analyses. RESULTS: The majority of patients (36.2-55.4% improved, 29.7-53.1% unchanged) exhibited either improved or unchanged speech quality following STN-DBS. Only a small percentage (8.5-14.9%) experienced deterioration. Older patients and those with worsened motor symptoms postoperatively were more likely to experience negative speech changes (p < 0.05). Interestingly, stimulation of the right substantia nigra correlated with improved speech quality (p < 0.05). No significant relationship was found between the other structures affected by the VTA and speech changes. CONCLUSIONS: This study suggests that STN-DBS does not predominantly negatively impact speech in PD patients, with potential benefits observed, especially in younger patients. These findings underscore the importance of individualized treatment approaches and highlight the need for further long-term studies to optimize therapeutic outcomes and better understand the effects of STN-DBS on speech.


Subjects
Deep Brain Stimulation, Diffusion Tensor Imaging, Parkinson Disease, Speech, Subthalamic Nucleus, Humans, Subthalamic Nucleus/diagnostic imaging, Subthalamic Nucleus/surgery, Deep Brain Stimulation/methods, Male, Female, Middle Aged, Parkinson Disease/therapy, Parkinson Disease/diagnostic imaging, Aged, Diffusion Tensor Imaging/methods, Prospective Studies, Speech/physiology, Speech Disorders/etiology, Treatment Outcome, Adult
3.
J Med Internet Res ; 26: e58572, 2024 Oct 31.
Article in English | MEDLINE | ID: mdl-39324329

ABSTRACT

BACKGROUND: While speech analysis holds promise for mental health assessment, research often focuses on single symptoms despite symptom co-occurrence and interaction. In addition, predictive models in mental health do not properly assess the limitations of speech-based systems, such as uncertainty or fairness, needed for safe clinical deployment. OBJECTIVE: We investigated the predictive potential of mobile-collected speech data for detecting and estimating depression, anxiety, fatigue, and insomnia in the general population, focusing on factors other than mere accuracy. METHODS: We included 865 healthy adults and recorded their answers regarding their perceived mental and sleep states: we asked how they felt and whether they had slept well lately. Clinically validated questionnaires measuring depression, anxiety, insomnia, and fatigue severity were also used. We developed a speech and machine learning pipeline involving voice activity detection, feature extraction, and model training. We modeled speech with deep learning models pretrained on a large, open, and free database, and selected the best one on the validation set. Based on the best speech modeling approach, we evaluated clinical threshold detection, individual score prediction, model uncertainty estimation, and performance fairness across demographics (age, sex, and education). We used a train-validation-test split for all evaluations: to develop our models, select the best ones, and assess generalizability on held-out data. RESULTS: The best model was Whisper M with max pooling and an oversampling method. Our methods achieved good detection performance for all symptoms: depression (Patient Health Questionnaire-9: area under the curve [AUC]=0.76, F1-score=0.49; Beck Depression Inventory: AUC=0.78, F1-score=0.65), anxiety (Generalized Anxiety Disorder 7-item scale: AUC=0.77, F1-score=0.50), insomnia (Athens Insomnia Scale: AUC=0.73, F1-score=0.62), and fatigue (Multidimensional Fatigue Inventory total score: AUC=0.68, F1-score=0.88). The system performed well when it needed to abstain from making predictions, as demonstrated by low abstention rates in depression detection with the Beck Depression Inventory and fatigue, with risk-coverage AUCs below 0.4. Individual symptom scores were accurately predicted (all correlations were significant, with Pearson coefficients between 0.31 and 0.49). Fairness analysis revealed that models were consistent for sex (average disparity ratio [DR] 0.86, SD 0.13), less so for education level (average DR 0.47, SD 0.30), and worst for age groups (average DR 0.33, SD 0.30). CONCLUSIONS: This study demonstrates the potential of speech-based systems for multifaceted mental health assessment in the general population, not only for detecting clinical thresholds but also for estimating symptom severity. Addressing fairness and incorporating uncertainty estimation with selective classification are key contributions that can enhance the clinical utility and responsible implementation of such systems.
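
A minimal sketch of the speech-embedding approach named in the results: a pretrained Whisper encoder, max-pooled over time, feeding a simple downstream classifier. The checkpoint name, pooling choice, and classifier here are illustrative assumptions, not the study's exact pipeline.

    import numpy as np
    import torch
    from transformers import WhisperFeatureExtractor, WhisperModel
    from sklearn.linear_model import LogisticRegression

    extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-medium")
    model = WhisperModel.from_pretrained("openai/whisper-medium").eval()

    def embed(waveform: np.ndarray, sr: int = 16000) -> np.ndarray:
        # Max-pool Whisper encoder states over time into one fixed-size vector.
        inputs = extractor(waveform, sampling_rate=sr, return_tensors="pt")
        with torch.no_grad():
            states = model.encoder(inputs.input_features).last_hidden_state  # (1, T, D)
        return states.max(dim=1).values.squeeze(0).numpy()

    # Hypothetical usage, with waveforms and binary labels y from a mobile study:
    # X = np.stack([embed(w) for w in waveforms])
    # clf = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X, y)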


Subjects
Anxiety, Depression, Fatigue, Sleep Initiation and Maintenance Disorders, Humans, Adult, Male, Female, Sleep Initiation and Maintenance Disorders/diagnosis, Sleep Initiation and Maintenance Disorders/psychology, Depression/diagnosis, Depression/psychology, Fatigue/diagnosis, Fatigue/psychology, Anxiety/diagnosis, Anxiety/psychology, Middle Aged, Algorithms, Speech, Surveys and Questionnaires, Young Adult
4.
Br J Neurosurg ; : 1-9, 2024 Oct 16.
Article in English | MEDLINE | ID: mdl-39412253

ABSTRACT

BACKGROUND: Patients with glioma often report language complaints with a devastating effect on daily life. Analysing spontaneous speech can help in understanding the underlying language problems. Spontaneous speech monitoring is also important during awake brain surgery: it can guide tumour resection and contributes to maintaining language function. We aimed to investigate the spontaneous speech of patients with glioma in the perioperative period and the added value of spontaneous speech analyses compared to standardised language testing. METHODS: We elicited and transcribed spontaneous speech of eight patients with glioma selected for awake brain surgery preoperatively, intraoperatively, and 2.0-3.5 months postoperatively. Linguistic errors were coded. Type-Token Ratio (TTR), Mean Length of Utterance in words (MLUw), minimal utterances, and errors were extracted from the transcriptions. Patients were categorised based on total error patterns: stable, decrease, or increase during surgery. Reliable Change Index scores were calculated for all spontaneous speech variables to objectify changes between time points. Performance on standardised language tests was compared to the spontaneous speech variables. RESULTS: Most errors occurred in lexico-syntax, followed by phonology/articulation, syntax, and semantics. The predominant errors were Repetitions, Self-corrections, and Incomplete sentences. Most patients remained stable over time in almost all spontaneous speech variables, except for Incomplete sentences, which deteriorated in most patients postoperatively compared to intraoperatively. Some spontaneous speech variables (total errors, MLUw, TTR) gave more information on language change than a standard language test. CONCLUSIONS: While the course of spontaneous speech over time remained relatively stable in most patients, Incomplete sentences seem to be a robust marker of language difficulties in patients with glioma. These errors can be prioritised in spontaneous speech analysis to save time, especially for determining intra- to postoperative deterioration. Importantly, spontaneous speech analyses can give more information on language change than standardised language testing and should therefore be used in addition to standardised language tests.
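
A minimal sketch of two of the transcript measures listed above, TTR and MLUw; the tokenization is deliberately simplified and ignores clinical transcription conventions such as error codes and utterance-boundary rules.

    def type_token_ratio(tokens: list[str]) -> float:
        # Unique word forms (types) divided by total words (tokens).
        return len(set(tokens)) / len(tokens) if tokens else 0.0

    def mlu_words(utterances: list[str]) -> float:
        # Mean Length of Utterance in words: average word count per utterance.
        counts = [len(u.split()) for u in utterances]
        return sum(counts) / len(counts) if counts else 0.0

    transcript = ["the scan was eh was yesterday", "I felt tired afterwards"]
    tokens = " ".join(transcript).lower().split()
    print(f"TTR = {type_token_ratio(tokens):.2f}, MLUw = {mlu_words(transcript):.1f}")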

5.
Sensors (Basel) ; 24(5)2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38475034

ABSTRACT

Parkinson's disease (PD) is a neurodegenerative disorder characterized by a range of motor and non-motor symptoms. One of the notable non-motor symptoms of PD is the presence of vocal disorders, attributed to underlying pathophysiological changes in the neural control of the laryngeal and vocal tract musculature. From this perspective, the integration of machine learning (ML) techniques in the analysis of speech signals has significantly contributed to the detection and diagnosis of PD. In particular, Mel-Frequency Cepstral Coefficients (MFCCs) and Gammatone Frequency Cepstral Coefficients (GTCCs) are feature extraction techniques commonly used in speech and audio signal processing that show great potential for vocal disorder identification. This study presents a novel approach to the early detection of PD through ML applied to speech analysis, leveraging both MFCCs and GTCCs. The recordings contained in the Mobile Device Voice Recordings at King's College London (MDVR-KCL) dataset were used. These recordings were collected from healthy individuals and PD patients while they read a passage and during a spontaneous conversation on the phone. The speech data from the spontaneous dialogue task were processed through speaker diarization, a technique that partitions an audio stream into homogeneous segments according to speaker identity. ML applied to MFCCs and GTCCs allowed us to classify PD patients with a test accuracy of 92.3%. This research further demonstrates the potential of mobile phones as a non-invasive, cost-effective tool for the early detection of PD, with the prospect of significantly improving patient prognosis and quality of life.
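
A minimal sketch of the MFCC half of the feature pipeline (GTCCs would additionally need a gammatone filterbank, which librosa does not provide); the dataset paths, summary statistics, and classifier are assumptions.

    import librosa
    import numpy as np
    from sklearn.svm import SVC

    def mfcc_stats(path: str) -> np.ndarray:
        # Per-recording summary: mean and standard deviation of 13 MFCCs over frames.
        y, sr = librosa.load(path, sr=16000)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape (13, n_frames)
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    # Hypothetical usage with MDVR-KCL recordings (paths and labels not shown):
    # X = np.stack([mfcc_stats(p) for p in wav_paths])
    # clf = SVC(kernel="rbf").fit(X, labels)  # labels: 1 = PD, 0 = healthy control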


Subjects
Parkinson Disease, Speech, Humans, Parkinson Disease/diagnosis, Quality of Life, Machine Learning, Laryngeal Muscles
6.
Alzheimers Dement ; 20(5): 3416-3428, 2024 05.
Article in English | MEDLINE | ID: mdl-38572850

ABSTRACT

INTRODUCTION: Screening for Alzheimer's disease neuropathologic change (ADNC) in individuals with atypical presentations is challenging but essential for clinical management. We trained automatic speech-based classifiers to distinguish frontotemporal dementia (FTD) patients with ADNC from those with frontotemporal lobar degeneration (FTLD). METHODS: We trained automatic classifiers with 99 speech features from 1-minute speech samples of 179 participants (ADNC = 36, FTLD = 60, healthy controls [HC] = 89). Patients' pathology was assigned based on autopsy or cerebrospinal fluid analytes. Structural network-based magnetic resonance imaging analyses identified anatomical correlates of distinct speech features. RESULTS: Our classifier showed 0.88 ± 0.03 area under the curve (AUC) for ADNC versus FTLD and 0.93 ± 0.04 AUC for patients versus HC. Noun frequency and pause rate correlated with gray matter volume loss in the limbic and salience networks, respectively. DISCUSSION: Brief naturalistic speech samples can be used to screen FTD patients for underlying ADNC in vivo. This work supports the future development of digital assessment tools for FTD. HIGHLIGHTS: We trained machine learning classifiers for frontotemporal dementia patients using natural speech. We grouped participants by neuropathological diagnosis (autopsy) or cerebrospinal fluid biomarkers. Classifiers distinguished the underlying pathology (Alzheimer's disease vs. frontotemporal lobar degeneration) well in patients. We identified important features through an explainable artificial intelligence approach. This work lays the groundwork for a speech-based neuropathology screening tool.
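
A minimal sketch of two interpretable features highlighted above, pause rate and noun frequency; the silence threshold and tagger are assumptions, and a real pipeline would use forced alignment and clinical transcripts.

    import librosa
    import nltk  # first run: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

    def pause_rate(path: str, top_db: float = 30.0) -> float:
        # Pauses per second, counting gaps between non-silent spans.
        y, sr = librosa.load(path, sr=16000)
        spans = librosa.effects.split(y, top_db=top_db)
        return max(len(spans) - 1, 0) / (len(y) / sr)

    def noun_frequency(transcript: str) -> float:
        # Proportion of tokens tagged as nouns (NN, NNS, NNP, NNPS).
        tags = nltk.pos_tag(nltk.word_tokenize(transcript))
        return sum(1 for _, t in tags if t.startswith("NN")) / max(len(tags), 1)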


Subjects
Alzheimer Disease, Frontotemporal Dementia, Magnetic Resonance Imaging, Speech, Humans, Female, Alzheimer Disease/pathology, Male, Aged, Frontotemporal Dementia/pathology, Speech/physiology, Middle Aged, Phenotype, Frontotemporal Lobar Degeneration/pathology, Machine Learning
7.
Cerebellum ; 22(4): 761-775, 2023 Aug.
Article in English | MEDLINE | ID: mdl-35761144

ABSTRACT

Multiple sclerosis (MS) is a progressive disease that often affects the cerebellum. It is characterised by demyelination, inflammation, and neurodegeneration within the central nervous system. Damage to the cerebellum in MS is associated with increased disability and decreased quality of life. Symptoms include gait and balance problems, motor speech disorder, upper limb dysfunction, and oculomotor difficulties. Monitoring symptoms is crucial for the effective management of MS. A combination of clinical, neuroimaging, and task-based measures is generally used to diagnose and monitor MS. This paper reviews existing and emerging tools used by clinicians and researchers to assess cerebellar impairment in people with MS (pwMS). It also describes recent advances in digital and home-based monitoring for this population.


Subjects
Cerebellar Diseases, Multiple Sclerosis, Humans, Multiple Sclerosis/complications, Multiple Sclerosis/diagnostic imaging, Quality of Life, Cerebellum/diagnostic imaging, Gait
8.
J Med Internet Res ; 25: e34474, 2023 01 25.
Article in English | MEDLINE | ID: mdl-36696160

ABSTRACT

BACKGROUND: Automatic diagnosis of depression based on speech could complement mental health treatment in the future. Previous studies have reported that acoustic properties can be used to identify depression. However, few studies have attempted a large-scale differential diagnosis of patients with depressive disorders using the acoustic characteristics of non-English speakers. OBJECTIVE: This study proposes a framework for automatic depression detection using large-scale acoustic characteristics based on the Korean language. METHODS: We recruited 153 patients who met the criteria for major depressive disorder and 165 healthy controls without current or past mental illness. Participants' voices were recorded on a smartphone while they read predefined text-based sentences. Three approaches to detecting depression from text-dependent read speech were evaluated and compared: conventional machine learning models based on acoustic features, a proposed model that trains and classifies log-Mel spectrograms with a deep convolutional neural network (CNN) with a relatively small number of parameters, and models that train and classify log-Mel spectrograms with well-known pretrained networks. RESULTS: The proposed CNN model automatically detected depression from the acoustic characteristics of the predefined sentence readings. The highest accuracy achieved with the proposed CNN on the speech data was 78.14%. Our results show that deep-learned acoustic characteristics lead to better performance than the conventional approach and pretrained models. CONCLUSIONS: Monitoring the mood of patients with major depressive disorder and checking the consistency of objective descriptions are important research topics. This study suggests that the analysis of speech data recorded while reading text-dependent sentences could help predict depression status automatically by capturing the characteristics of depression. Our method is smartphone based, easily accessible, and can contribute to the automatic identification of depressive states.
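
A minimal sketch of a low-parameter CNN over log-Mel spectrograms, in the spirit of the proposed model; the layer sizes, input shape, and training loop are assumptions, since the abstract does not specify the architecture.

    import torch
    import torch.nn as nn

    class SmallSpectrogramCNN(nn.Module):
        def __init__(self, n_classes: int = 2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.AdaptiveAvgPool2d(1),  # collapse time/frequency to one vector
            )
            self.classifier = nn.Linear(32, n_classes)

        def forward(self, x):  # x: (batch, 1, n_mels, n_frames)
            return self.classifier(self.features(x).flatten(1))

    logits = SmallSpectrogramCNN()(torch.randn(4, 1, 64, 256))  # dummy log-Mel batch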


Subjects
Depressive Disorder, Major, Speech, Humans, Depression/diagnosis, Depressive Disorder, Major/diagnosis, Smartphone, Neural Networks, Computer
9.
Aging Ment Health ; : 1-10, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37970813

ABSTRACT

OBJECTIVES: To examine the association of speech and facial features with depression, anxiety, and apathy in older adults with mild cognitive impairment (MCI). METHODS: Speech and facial expressions of 319 MCI patients were digitally recorded via audio and video recording software. Three of the most common neuropsychiatric symptoms (NPS) were evaluated with the Patient Health Questionnaire, the Generalized Anxiety Disorder scale, and the Apathy Evaluation Scale, respectively. Speech and facial features were extracted using open-source data analysis toolkits. Machine learning techniques were used to validate the diagnostic power of the extracted features. RESULTS: Different speech and facial features were associated with specific NPS: depression with spectral and temporal features; anxiety and apathy with frequency, energy, spectral, and temporal features. Additionally, depression was associated with facial action units (AUs) 10, 12, 15, 17, and 25; anxiety with AUs 10, 15, 17, 25, 26, and 45; and apathy with AUs 5, 26, and 45. Significant differences in speech and facial features were observed between males and females. Based on machine learning models, the highest accuracy for detecting depression, anxiety, and apathy reached 95.8%, 96.1%, and 83.3% for males, and 87.8%, 88.2%, and 88.6% for females, respectively. CONCLUSION: Depression, anxiety, and apathy were characterized by distinct speech and facial features. The machine learning models developed in this study demonstrated good classification performance in detecting depression, anxiety, and apathy. A combination of audio and video may provide objective methods for the precise classification of these symptoms.

10.
Psychiatr Danub ; 35(Suppl 2): 77-85, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37800207

ABSTRACT

BACKGROUND: Depression is a common mental illness, with around 280 million people suffering from depression worldwide. At present, the main way to quantify the severity of depression is through psychometric scales, which entail subjectivity on the part of both patient and clinician. In the last few years, deep (machine) learning has been emerging as a more objective approach to measuring depression severity. We investigated how neural networks might serve for the early diagnosis of depression. SUBJECTS AND METHODS: We searched Medline (PubMed) for articles published up to June 1, 2023. The search terms included Depression AND Diagnostics AND Artificial Intelligence. We did not search for depression studies of machine learning approaches other than neural networks, and we selected only papers addressing diagnosis or screening for depression. RESULTS: Fifty-four papers met our criteria, among which 14 used facial expression recordings, 14 EEG, 5 fMRI, and 5 audio speech recording analysis; 6 used a multimodal approach, 2 were text analysis studies, and 8 used other methods. CONCLUSIONS: Research methodologies include audio and video recordings of clinical interviews and task performance, including their subsequent conversion into text, as well as resting-state studies (EEG, MRI, fMRI). Convolutional neural networks (CNNs), including 3D-CNNs and 2D-CNNs, can obtain diagnostic data from videos of the facial area. For EEG signals, the CNN is the most commonly used deep learning model. fMRI approaches use graph convolutional networks and 3D-CNNs with voxel connectivity, whereas text analyses use CNNs and LSTMs (long short-term memory networks). Audio recordings are analyzed by a hybrid CNN and support vector machine model. Neural networks are also used to analyze biomaterials, gait, polysomnography, ECG, data from wrist wearable devices, and present-illness history records. Multimodal studies analyze the fusion of audio features with visual and textual features using LSTM and CNN architectures, temporal convolutional networks, or recurrent neural networks. The accuracy of different hybrid and multimodal models ranges from 78% to 99% relative to standard clinical diagnoses.


Subjects
Artificial Intelligence, Depression, Humans, Depression/diagnosis, Neural Networks, Computer, Machine Learning, Early Diagnosis
11.
Article in English | MEDLINE | ID: mdl-36284449

ABSTRACT

OBJECTIVES: This study aimed to develop a classification model to detect and distinguish apathy and depression based on text, audio, and video features, and to use the Shapley additive explanations (SHAP) toolkit to increase model interpretability. METHODS: Subjective scales and objective experiments were conducted on 319 mild cognitive impairment (MCI) patients to measure apathy and depression. The MCI patients were classified into four groups: depression only, apathy only, depressed-apathetic, and normal. Speech, facial, and text features were extracted using open-source data analysis toolkits. Multiclass classification and the SHAP toolkit were used to develop a classification model and explain the contribution of specific features. RESULTS: The macro-averaged F1 score and accuracy for the overall model were 0.91 and 0.90, respectively. The accuracy for the apathetic, depressed, depressed-apathetic, and normal groups was 0.98, 0.88, 0.93, and 0.82, respectively. The SHAP toolkit identified speech features (Mel-frequency cepstral coefficient (MFCC) 4, spectral slopes, F0, F1), facial features (action units (AUs) 14, 26, 28, 45), and a text feature (text 6, semantic) associated with apathy. Meanwhile, speech features (spectral slopes, shimmer, F0) and facial features (AUs 2, 6, 7, 10, 14, 26, 45) were associated with depression. Apart from the shared features mentioned above, new speech (MFCC 2, loudness) and facial (AU 9) features were observed in the depressed-apathetic group. CONCLUSIONS: Apathy and depression shared some verbal and facial features while also exhibiting distinct features. A combination of text, audio, and video could be used to improve the early detection and differential diagnosis of apathy and depression in MCI patients.
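
A minimal sketch of the interpretability step: SHAP attributions for a multiclass classifier over combined speech/facial/text features. The model type, feature set, and data below are placeholders; the study's actual features are listed above.

    import numpy as np
    import shap
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))     # stand-ins for e.g. MFCC 4, F0, AU 14, AU 26, a text feature
    y = rng.integers(0, 4, size=200)  # depression / apathy / depressed-apathetic / normal

    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    explainer = shap.TreeExplainer(clf)
    shap_values = explainer.shap_values(X)  # one feature-attribution matrix per class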


Subjects
Apathy, Cognitive Dysfunction, Humans, Aged, Depression/diagnosis, Depression/psychology, Cognitive Dysfunction/diagnosis, Cognitive Dysfunction/psychology, Neuropsychological Tests
12.
BMC Psychiatry ; 22(1): 830, 2022 12 27.
Article in English | MEDLINE | ID: mdl-36575442

ABSTRACT

BACKGROUND: Automated speech analysis has gained increasing attention as an aid to diagnosing depression. Most previous studies, however, focused on comparing speech in patients with major depressive disorder to that in healthy volunteers. An alternative is to associate speech with depressive symptoms in a non-clinical sample, as this may help find early and sensitive markers in those at risk of depression. METHODS: We included n = 118 healthy young adults (mean age: 23.5 ± 3.7 years; 77% women) and asked them to talk about a positive and a negative event in their life. We then assessed the level of depressive symptoms with a self-report questionnaire, with scores ranging from 0 to 60. We transcribed the speech data and extracted acoustic as well as linguistic features. We then tested whether individuals below or above the cut-off for clinically relevant depressive symptoms differed in speech features, and predicted whether someone would fall below or above that cut-off as well as the individual scores on the depression questionnaire. Since depression is associated with cognitive slowing and attentional deficits, we finally correlated depression scores with performance in the Trail Making Test. RESULTS: In our sample, n = 93 individuals scored below and n = 25 above the cut-off for clinically relevant depressive symptoms. Most speech features did not differ significantly between the two groups, but individuals above the cut-off spoke more than those below it in both the positive and the negative story. In addition, higher depression scores in that group were associated with slower completion times on the Trail Making Test. We were able to predict with 93% accuracy who would be below or above the cut-off, and to predict individual depression scores with a low mean absolute error (3.90), with the best performance achieved by a support vector machine. CONCLUSIONS: Our results indicate that even in a sample without a clinical diagnosis of depression, changes in speech relate to higher depression scores. This should be investigated in more detail in the future. A longitudinal study could test whether the speech features found in our study represent early and sensitive markers of subsequent depression in individuals at risk.
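
A minimal sketch of the score-prediction step: a support vector machine regressor evaluated by cross-validated mean absolute error. The features and questionnaire scores below are synthetic placeholders matching the sample size, not the study's data.

    import numpy as np
    from sklearn.svm import SVR
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import mean_absolute_error

    rng = np.random.default_rng(0)
    X = rng.normal(size=(118, 20))                   # acoustic + linguistic features
    y = np.clip(rng.normal(10, 6, size=118), 0, 60)  # questionnaire scores on a 0-60 scale

    pred = cross_val_predict(SVR(), X, y, cv=5)
    print(f"MAE = {mean_absolute_error(y, pred):.2f}")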


Subjects
Depressive Disorder, Major, Young Adult, Humans, Female, Adult, Male, Depressive Disorder, Major/diagnosis, Depressive Disorder, Major/psychology, Depression/diagnosis, Longitudinal Studies, Speech, Surveys and Questionnaires
13.
Mov Disord ; 36(12): 2862-2873, 2021 12.
Article in English | MEDLINE | ID: mdl-34390508

ABSTRACT

BACKGROUND: Dysarthric symptoms in Parkinson's disease (PD) vary greatly across cohorts. Abundant research suggests that such heterogeneity could reflect subject-level and task-related cognitive factors. However, the interplay of these variables during motor speech remains underexplored, let alone by administering validated materials to carefully matched samples with varying cognitive profiles and combining automated tools with machine learning methods. OBJECTIVE: We aimed to identify which speech dimensions best identify patients with PD in cognitively heterogeneous, cognitively preserved, and cognitively impaired groups through tasks with low (reading) and high (retelling) processing demands. METHODS: We used support vector machines to analyze prosodic, articulatory, and phonemic identifiability features. Patient groups were compared with healthy control subjects and against each other in both tasks, using each measure separately and in combination. RESULTS: Relative to control subjects, patients in cognitively heterogeneous and cognitively preserved groups were best discriminated by combined dysarthric signs during reading (accuracy = 84% and 80.2%). Conversely, patients with cognitive impairment were maximally discriminated from control subjects when considering phonemic identifiability during retelling (accuracy = 86.9%). This same pattern maximally distinguished between cognitively spared and impaired patients (accuracy = 72.1%). Also, cognitive (executive) symptom severity was predicted by prosody in cognitively preserved patients and by phonemic identifiability in cognitively heterogeneous and impaired groups. No measure predicted overall motor dysfunction in any group. CONCLUSIONS: Predominant dysarthric symptoms appear to be best captured through undemanding tasks in cognitively heterogeneous and preserved cohorts and through cognitively loaded tasks in patients with cognitive impairment. Further applications of this framework could enhance dysarthria assessments in PD. © 2021 International Parkinson and Movement Disorder Society.


Subjects
Cognitive Dysfunction, Parkinson Disease, Cognition, Dysarthria/diagnosis, Dysarthria/etiology, Humans, Machine Learning, Speech
14.
J Med Internet Res ; 23(4): e27667, 2021 04 08.
Article in English | MEDLINE | ID: mdl-33830066

ABSTRACT

BACKGROUND: With the rapid growth of the older adult population worldwide, car accidents involving this population group have become an increasingly serious problem. Cognitive impairment, which is assessed using neuropsychological tests, has been reported as a risk factor for being involved in car accidents; however, it remains unclear whether this risk can be predicted using daily behavior data. OBJECTIVE: The objective of this study was to investigate whether speech data that can be collected in everyday life can be used to predict the risk of an older driver being involved in a car accident. METHODS: At baseline, we collected (1) speech data during interactions with a voice assistant and (2) cognitive assessment data from older adults, comprising neuropsychological tests (Mini-Mental State Examination, revised Wechsler immediate and delayed logical memory, Frontal Assessment Battery, Trail Making Test parts A and B, and Clock Drawing Test), the Geriatric Depression Scale, magnetic resonance imaging, and demographics (age, sex, education). Approximately one and a half years later, we followed up to collect information about their driving experiences (with respect to car accidents) using a questionnaire. We investigated the association between speech data and future accident risk using statistical analysis and machine learning models. RESULTS: We found that older drivers (n=60) with accident or near-accident experiences had statistically discernible differences in speech features suggestive of cognitive impairment, such as reduced speech rate (P=.048) and increased response time (P=.040). Moreover, the model that used speech features could predict future accident or near-accident experiences with 81.7% accuracy, which was 6.7% higher than the model using cognitive assessment data, and accuracy reached up to 88.3% when the model used both types of data. CONCLUSIONS: Our study provides the first empirical results suggesting that analysis of speech data recorded during interactions with voice assistants could help predict future accident risk for older drivers by capturing subtle impairments in cognitive function.


Subjects
Automobile Driving, Speech, Accidents, Traffic, Aged, Humans, Neuropsychological Tests, Prospective Studies
15.
J Med Internet Res ; 23(10): e26305, 2021 10 19.
Article in English | MEDLINE | ID: mdl-34665148

ABSTRACT

BACKGROUND: Access to neurological care for Parkinson disease (PD) is a rare privilege for millions of people worldwide, especially in resource-limited countries. In 2013, there were just 1200 neurologists in India for a population of 1.3 billion people; in Africa, the average population per neurologist exceeds 3.3 million people. In contrast, 60,000 people receive a diagnosis of PD every year in the United States alone, and similar patterns of rising PD cases, fueled mostly by environmental pollution and an aging population, can be seen worldwide. The current projection of more than 12 million patients with PD worldwide by 2040 is only part of the picture, given that more than 20% of patients with PD remain undiagnosed. Timely diagnosis and frequent assessment are key to ensuring timely and appropriate medical intervention, thus improving the quality of life of patients with PD. OBJECTIVE: In this paper, we propose a web-based framework that can help anyone anywhere in the world record a short speech task and analyze the recorded data to screen for PD. METHODS: We collected data from 726 unique participants (PD: 262/726, 36.1% were women; non-PD: 464/726, 63.9% were women; average age 61 years) from all over the United States and beyond. A small portion of the data (approximately 54/726, 7.4%) was collected in a laboratory setting to compare the performance of models trained with noisy home-environment data against high-quality laboratory-environment data. The participants were instructed to utter a popular pangram containing all the letters in the English alphabet, "the quick brown fox jumps over the lazy dog." We extracted both standard acoustic features (mel-frequency cepstral coefficients and jitter and shimmer variants) and deep learning-based embedding features from the speech data. Using these features, we trained several machine learning algorithms. We also applied model interpretation techniques such as Shapley additive explanations to ascertain the importance of each feature in determining the model's output. RESULTS: We achieved an area under the curve of 0.753 for determining the presence of self-reported PD by modeling the standard acoustic features through XGBoost, a gradient-boosted decision tree model. Further analysis revealed that the widely used mel-frequency cepstral coefficient features and a subset of previously validated dysphonia features designed for detecting PD from a verbal phonation task (pronouncing "ahh") influenced the model's decision the most. CONCLUSIONS: Our model performed equally well on data collected in a controlled laboratory environment and in the wild, across different gender and age groups. Using this tool, we can collect data from almost anyone anywhere with an audio-enabled device and help participants screen for PD remotely, contributing to equity and access in neurological care.
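
A minimal sketch of the modeling step: gradient-boosted trees over standard acoustic features, scored by cross-validated AUC. Feature extraction (MFCC summaries, jitter/shimmer variants) is assumed already done, and the data below are synthetic placeholders.

    import numpy as np
    from xgboost import XGBClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(726, 40))    # stand-ins for MFCC + dysphonia features
    y = rng.integers(0, 2, size=726)  # 1 = self-reported PD, 0 = non-PD

    auc = cross_val_score(XGBClassifier(eval_metric="logloss"), X, y,
                          cv=5, scoring="roc_auc").mean()
    print(f"cross-validated AUC = {auc:.3f}")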


Subjects
Dysphonia, Parkinson Disease, Aged, Humans, Internet, Parkinson Disease/diagnosis, Parkinson Disease/epidemiology, Quality of Life, Speech
16.
Sensors (Basel) ; 21(2)2021 Jan 12.
Article in English | MEDLINE | ID: mdl-33445499

ABSTRACT

The factors affecting the penetration of certain diseases such as COVID-19 in society are still unknown. Internet of Things (IoT) technologies can play a crucial role in times of crisis, and they can provide a more holistic view of the reasons that govern the outbreak of a contagious disease. The understanding of COVID-19 will be enriched by the analysis of data related to the phenomenon, and this data can be collected using IoT sensors. In this paper, we present an integrated solution based on IoT technologies, named CIoTVID, that can serve as an opportunistic health data acquisition agent for combating the COVID-19 pandemic. The platform is composed of four layers: data acquisition, data aggregation, machine intelligence, and services. To demonstrate its validity, the solution has been tested with a use case based on creating a classifier of medical conditions using real voice data, performing successfully. The data aggregation layer is particularly relevant in this kind of solution, as data coming from medical devices has a very different nature to that coming from electronic sensors. Due to the platform's adaptability to heterogeneous data and data volumes, individuals, policymakers, and clinics could benefit from it in fighting the propagation of the pandemic.


Subjects
COVID-19, Internet of Things, Signal Processing, Computer-Assisted, Artificial Intelligence, COVID-19/complications, COVID-19/diagnosis, COVID-19/physiopathology, Humans, Oximetry, Pandemics, SARS-CoV-2, Sound Spectrography/methods, Voice/physiology
17.
Sensors (Basel) ; 21(10)2021 May 17.
Article in English | MEDLINE | ID: mdl-34067574

ABSTRACT

With the increased number of Software-Defined Networking (SDN) installations, the data centers of large service providers are becoming more and more agile in terms of network performance efficiency and flexibility. While SDN is an active and obvious trend in modern data center design, the implications and possibilities it carries for effective and efficient network management are not yet fully explored and utilized. With most modern Internet traffic consisting of multimedia services and media-rich content sharing, the quality of multimedia communications is at the center of attention of many companies and research groups. Since SDN-enabled switches have an inherent feature of monitoring flow statistics in terms of packets and bytes transmitted/lost, these devices can be utilized to monitor the essential statistics of multimedia communications, allowing the provider to act when the network fails to deliver the required service quality. The internal packet processing in the SDN switch enables the SDN controller to fetch the statistical information of a particular packet flow using the PacketIn and Multipart messages. This information, if preprocessed properly, can be used to estimate a higher-layer interpretation of the link quality and thus to relate the provided quality of service (QoS) to the quality of user experience (QoE). This article discusses an experimental setup that can be used to estimate the quality of speech communication based on the information provided by the SDN controller. To achieve higher accuracy, latency characteristics are added, based on injecting dummy packets into the packet stream and/or on RTCP packet analysis. The results of the experiment show that this approach calculates the statistics of each individual RTP stream, yielding a method for dynamic measurement of speech quality: when quality decreases, it is possible to respond quickly by changing routing at the network level for each individual call. To improve the quality of call measurements, a Convolutional Neural Network (CNN) was also implemented. This model is based on two standard approaches to measuring speech quality: PESQ and the E-model. However, unlike PESQ/POLQA, the CNN-based model can take delay into account, and unlike the E-model, the resulting accuracy is much higher.
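
Where QoS statistics are mapped to a QoE estimate, the E-model route ends in the standard ITU-T G.107 conversion from the rating factor R to a mean opinion score (MOS); a sketch of that final step is below, with the computation of R itself (from delay, loss, and codec impairments) omitted.

    def r_to_mos(r: float) -> float:
        # ITU-T G.107 mapping from the E-model rating factor R to estimated MOS.
        if r <= 0:
            return 1.0
        if r >= 100:
            return 4.5
        return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)

    print(r_to_mos(93.2))  # ~4.41, the default R for an unimpaired narrowband call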


Subjects
Computer Communication Networks, Speech, Algorithms, Machine Learning, Software
18.
Sensors (Basel) ; 21(19)2021 Sep 27.
Article in English | MEDLINE | ID: mdl-34640780

ABSTRACT

Within the field of Automatic Speech Recognition (ASR) systems, impaired speech poses a big challenge because standard approaches are ineffective in the presence of dysarthria. The first aim of our work is to confirm the effectiveness of a new speech analysis technique for speakers with dysarthria. This new approach exploits the fine-tuning of the size and shift parameters of the spectral analysis window used to compute the initial short-time Fourier transform, to improve the performance of a speaker-dependent ASR system. The second aim is to determine whether there exists a correlation between a speaker's voice features and the optimal window and shift parameters that minimise the error of an ASR system for that specific speaker. For our experiments, we used both impaired and unimpaired Italian speech. Specifically, we used 30 speakers with dysarthria from the IDEA database and 10 professional speakers from the CLIPS database. Both databases are freely available. The results confirm that, if a standard ASR system performs poorly for a speaker with dysarthria, it can be improved by using the new speech analysis. Otherwise, the new approach is ineffective in cases of unimpaired and mildly impaired speech. Furthermore, there exists a correlation between some of a speaker's voice features and their optimal parameters.
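
A minimal sketch of the core idea, recomputing the short-time Fourier transform under different analysis-window sizes and shifts; the candidate values and the test signal are illustrative, not the parameters tuned in the study.

    import numpy as np
    import librosa

    sr = 16000
    y = np.sin(2 * np.pi * 220 * np.arange(sr) / sr).astype(np.float32)  # 1 s stand-in tone

    # In a speaker-dependent setup, each (window, shift) pair would be scored by ASR error.
    for win_ms, shift_ms in [(25, 10), (32, 8), (40, 16)]:
        n_fft = int(sr * win_ms / 1000)
        hop = int(sr * shift_ms / 1000)
        S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))
        print(f"window={win_ms} ms, shift={shift_ms} ms -> spectrogram shape {S.shape}")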


Subjects
Dysarthria, Speech Perception, Humans, Speech, Speech Disorders, Speech Recognition Software
19.
Sensors (Basel) ; 21(4)2021 Feb 13.
Article in English | MEDLINE | ID: mdl-33668544

ABSTRACT

Surgeons' procedural skills and intraoperative decision making are key elements of clinical practice. However, the objective assessment of these skills remains a challenge to this day. Surgical workflow analysis (SWA) is emerging as a powerful tool to solve this issue in surgical educational environments in real time. Typically, SWA makes use of video signals to automatically identify the surgical phase. We hypothesize that the analysis of surgeons' speech using natural language processing (NLP) can provide deeper insight into the surgical decision-making process. As a preliminary step, this study proposes to use audio signals recorded in the educational operating room (OR) to classify the phases of a laparoscopic cholecystectomy (LC). To do this, we first created a database with transcriptions of audio recorded in surgical educational environments and their corresponding phases. Second, we compared the performance of four feature extraction techniques and four machine learning models to find the most appropriate model for phase recognition. The best resulting model was a support vector machine (SVM) coupled to a hidden Markov model (HMM), trained with features obtained with Word2Vec (82.95% average accuracy). The analysis of this model's confusion matrix shows that some phrases are misclassified due to similarity in the words used. The study of the model's temporal component suggests that further attention should be paid to accurately detecting surgeons' normal conversation. This study proves that speech-based classification of LC phases can be achieved effectively. This lays the foundation for the use of audio signals in SWA, to create a framework of LC to be used in surgical training, especially for the training and assessment of procedural and decision-making skills (e.g., to assess residents' procedural knowledge and their ability to react to adverse situations).
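
A minimal sketch of the winning feature/classifier pair: Word2Vec utterance vectors fed to an SVM (the HMM smoothing over the phase sequence is omitted). The transcripts and phase labels below are invented placeholders.

    import numpy as np
    from gensim.models import Word2Vec
    from sklearn.svm import SVC

    sentences = [["dissect", "the", "cystic", "duct"], ["apply", "the", "clips"]]
    w2v = Word2Vec(sentences, vector_size=50, min_count=1, epochs=50)

    def utterance_vector(tokens: list[str]) -> np.ndarray:
        # Average the word vectors of in-vocabulary tokens.
        vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
        return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

    X = np.stack([utterance_vector(s) for s in sentences])
    y = [1, 2]  # hypothetical surgical-phase labels
    clf = SVC().fit(X, y)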


Subjects
Cholecystectomy, Laparoscopic, Clinical Competence, General Surgery, Pattern Recognition, Automated, General Surgery/standards, Humans, Operating Rooms, Speech
20.
Int J Lang Commun Disord ; 55(2): 165-187, 2020 03.
Article in English | MEDLINE | ID: mdl-32077212

ABSTRACT

BACKGROUND: There is no consensus in the UK regarding the types of speech samples or parameters of speech that should be assessed at 3 years of age in children with cleft palate ± cleft lip (CP±L), despite cleft units routinely assessing speech at this age. Standardizing assessment practices would facilitate comparisons of outcomes across UK cleft units, earlier identification of speech impairments (which could support more timely treatment), and more reliable recording of the impacts of therapy and surgical interventions. AIMS: To explore assessment practices used to assess speech in 3-year-old children with CP±L, including speech parameters, methods of assessment, and the nature of the speech sample used. METHODS & PROCEDURES: A broad examination of the literature was undertaken through a scoping review conducted in accordance with Joanna Briggs Institute guidelines. Search terms were generated from a preliminary search and then used in the main search (Medline, CINAHL, Embase, AMED and PsycINFO). MAIN CONTRIBUTION: A combination of approaches (medical, linguistic, developmental, and functional) is required to assess CP±L speech at age 3. A developmental approach is recommended at this age, considering the complexity of speech profiles at age 3, in which typically developing speech processes may occur alongside cleft speech characteristics. A combined measure for both nasal emission and turbulence, and an overall measure of velopharyngeal function for speech, show potential for assessment at this age. Categorical ordinal scales are frequently used; the use of continuous scales has yet to be fully explored at age 3. Although single-word assessments, including a subset of words developed for cross-linguistic comparisons, are frequently used, more than one type of speech sample may be needed to assess speech validly at this age. The lack of consensus regarding speech samples highlights a need for further research into the types of speech samples 3-year-olds can complete; the impact of incomplete speech samples on outcome measures (particularly relevant at this age, when children may be less able to complete a full sample); the impact of different speech samples on the validity of assessments; and the reliability of listener judgements. CONCLUSIONS & IMPLICATIONS: Whilst a medical model and linguistic approaches are often central in assessments of cleft speech at age 3, this review highlights the importance of developmental and functional approaches to assessment. Cross-linguistic single-word assessments show potential and would facilitate comparison of UK speech outcomes with other countries. Further research should explore the impact of different speech samples and rating scales on assessment validity and listener reliability.


Subjects
Cleft Palate/diagnosis, Speech Disorders/diagnosis, Speech, Child, Preschool, Cleft Lip/complications, Cleft Lip/diagnosis, Cleft Palate/complications, Humans, Speech Acoustics, Speech Disorders/etiology