Results 1 - 20 of 8,881
1.
Artif Intell Med ; 156: 102953, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39222579

ABSTRACT

BACKGROUND: Chronic obstructive pulmonary disease (COPD) is a severe condition affecting millions worldwide, leading to numerous annual deaths. The absence of significant symptoms in its early stages contributes to high rates of underdiagnosis among affected people. Besides impaired pulmonary function, another harmful aspect of COPD is its systemic effects, e.g., heart failure or voice distortion. However, the systemic effects of COPD might provide valuable information for early detection. In other words, symptoms caused by systemic effects could help detect the condition in its early stages. OBJECTIVE: This study aims to explore whether voice features extracted from the vowel "a" utterance carry information predictive of COPD, by employing Machine Learning (ML) on a newly collected voice dataset. METHODS: Forty-eight participants were recruited from the pool of research clinic visitors at Blekinge Institute of Technology (BTH) in Sweden between January 2022 and May 2023. A dataset consisting of 1246 recordings from 48 participants was gathered. Voice recordings containing the vowel "a" utterance were collected with the VoiceDiagnostic application, following an information and consent meeting with each participant. The collected voice data were subjected to silence-segment removal and extraction of baseline acoustic features and Mel Frequency Cepstrum Coefficients (MFCC). Sociodemographic data were also collected from the participants. Three ML models were investigated for the binary classification of COPD and healthy controls: Random Forest (RF), Support Vector Machine (SVM), and CatBoost (CB). A nested k-fold cross-validation approach was employed, and the hyperparameters of each ML model were optimized using grid search. For performance assessment, accuracy, F1-score, precision, and recall were computed. Afterward, we further examined the best classifier using the Area Under the Curve (AUC), Average Precision (AP), and SHapley Additive exPlanations (SHAP) feature-importance measures. RESULTS: The classifiers RF, SVM, and CB achieved maximum accuracies of 77%, 69%, and 78% on the test set and 93%, 78%, and 97% on the validation set, respectively. The CB classifier outperformed RF and SVM; on further investigation it produced an AUC of 82% and an AP of 76%. In addition to age and gender, the mean values of baseline acoustic and MFCC features showed high importance for classification performance in both the test and validation sets, though in varied order. CONCLUSION: This study concludes that recordings of the vowel "a" utterance contain information that the CatBoost classifier can capture with high accuracy for the classification of COPD. Additionally, baseline acoustic and MFCC features, in conjunction with age and gender information, can be employed for classification and can support clinical decision-making in COPD diagnosis. CLINICAL TRIAL REGISTRATION NUMBER: NCT05897944.
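The pipeline described above (silence removal, mean MFCC features, nested k-fold cross-validation with grid search, and a CatBoost classifier) can be sketched with standard Python libraries. The sketch below is illustrative only: the feature set, hyperparameter grid, and synthetic data are assumptions, not the authors' configuration.

```python
# Sketch of the kind of pipeline described above: mean-MFCC features per recording,
# then nested k-fold cross-validation with a grid search over CatBoost hyperparameters.
# Feature set, grid values, and the synthetic data are illustrative assumptions only.
import numpy as np
import librosa
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from catboost import CatBoostClassifier

def mean_mfcc_features(wav_path, n_mfcc=13):
    """Load a recording, drop near-silent segments, and return mean MFCCs."""
    y, sr = librosa.load(wav_path, sr=None)
    intervals = librosa.effects.split(y, top_db=30)      # crude silence removal
    voiced = np.concatenate([y[s:e] for s, e in intervals]) if len(intervals) else y
    mfcc = librosa.feature.mfcc(y=voiced, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Synthetic stand-in for the real feature matrix (recordings x features);
# mean_mfcc_features() above shows how such features could be extracted per file.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 15))          # e.g. 13 MFCC means + age + gender
y = rng.integers(0, 2, size=200)        # 1 = COPD, 0 = healthy control

inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
grid = {"depth": [4, 6], "learning_rate": [0.03, 0.1], "iterations": [200]}
model = GridSearchCV(CatBoostClassifier(verbose=0), grid, cv=inner_cv, scoring="f1")
scores = cross_val_score(model, X, y, cv=outer_cv, scoring="accuracy")
print("nested-CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```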


Subjects
Machine Learning, Chronic Obstructive Pulmonary Disease, Chronic Obstructive Pulmonary Disease/classification, Chronic Obstructive Pulmonary Disease/physiopathology, Chronic Obstructive Pulmonary Disease/diagnosis, Humans, Male, Female, Aged, Middle Aged, Voice/physiology, Support Vector Machine
2.
Sci Rep ; 14(1): 21313, 2024 09 12.
Article in English | MEDLINE | ID: mdl-39266561

ABSTRACT

Extensive research with musicians has shown that instrumental musical training can have a profound impact on how acoustic features are processed in the brain. However, less is known about the influence of singing training on neural activity during voice perception, particularly in response to salient acoustic features, such as the vocal vibrato in operatic singing. To address this gap, the present study employed functional magnetic resonance imaging (fMRI) to measure brain responses in trained opera singers and musically untrained controls listening to recordings of opera singers performing in two distinct styles: a full operatic voice with vibrato, and a straight voice without vibrato. Results indicated that for opera singers, perception of operatic voice led to differential fMRI activations in bilateral auditory cortical regions and the default mode network. In contrast, musically untrained controls exhibited differences only in bilateral auditory cortex. These results suggest that operatic singing training triggers experience-dependent neural changes in the brain that activate self-referential networks, possibly through embodiment of acoustic features associated with one's own singing style.


Subjects
Magnetic Resonance Imaging, Singing, Humans, Singing/physiology, Male, Female, Adult, Young Adult, Auditory Perception/physiology, Music, Default Mode Network/physiology, Auditory Cortex/physiology, Auditory Cortex/diagnostic imaging, Voice/physiology, Brain Mapping, Brain/physiology, Brain/diagnostic imaging
3.
Cereb Cortex ; 34(9), 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39270675

ABSTRACT

The human auditory system includes discrete cortical patches and selective regions for processing voice information, including emotional prosody. Although behavioral evidence indicates that individuals with autism spectrum disorder (ASD) have difficulties in recognizing emotional prosody, it remains understudied whether and how localized voice patches (VPs) and other voice-sensitive regions are functionally altered in processing prosody. This fMRI study investigated neural responses to prosodic voices in 25 adult males with ASD and 33 controls, using voices of anger, sadness, and happiness with varying degrees of emotion. We used a functional region-of-interest analysis with an independent voice localizer to identify multiple VPs from the combined ASD and control data. We observed a general response reduction to prosodic voices in specific VPs: the left posterior temporal VP (TVP) and the right middle TVP. Reduced cortical responses in the right middle TVP were consistently correlated with the severity of autistic symptoms for all examined emotional prosodies. Moreover, representation similarity analysis revealed a reduced effect of emotional intensity on multivoxel activation patterns in the left anterior superior temporal cortex, only for sad prosody. These results indicate reduced response magnitudes to voice prosodies in specific TVPs and altered emotion intensity-dependent multivoxel activation patterns in adults with ASD, potentially underlying their socio-communicative difficulties.


Subjects
Autism Spectrum Disorder, Emotions, Magnetic Resonance Imaging, Temporal Lobe, Voice, Humans, Male, Autism Spectrum Disorder/physiopathology, Autism Spectrum Disorder/diagnostic imaging, Autism Spectrum Disorder/psychology, Temporal Lobe/physiopathology, Temporal Lobe/diagnostic imaging, Adult, Emotions/physiology, Young Adult, Speech Perception/physiology, Brain Mapping/methods, Acoustic Stimulation, Auditory Perception/physiology
4.
Nihon Ronen Igakkai Zasshi ; 61(3): 337-344, 2024.
Article in Japanese | MEDLINE | ID: mdl-39261104

ABSTRACT

AIM: An easy-to-use tool that can detect cognitive decline in mild cognitive impairment (MCI) is required. In this study, we aimed to construct a machine learning model that discriminates between MCI and cognitively normal (CN) individuals using spoken answers to questions and speech features. METHODS: Participants of ≥50 years of age were recruited from the Silver Human Resource Center. The Japanese Version of the Mini-Mental State Examination (MMSE-J) and Clinical Dementia Rating (CDR) were used to obtain clinical information. We developed a research application that presented neuropsychological tasks via automated voice guidance and collected the participants' spoken answers. The neuropsychological tasks included time orientation, sentence memory tasks (immediate and delayed recall), and digit span memory-updating tasks. Scores and speech features were obtained from spoken answers. Subsequently, a machine learning model was constructed to classify MCI and CN using various classifiers, combining the participants' age, gender, scores, and speech features. RESULTS: We obtained a model using Gaussian Naive Bayes, which classified typical MCI (CDR 0.5, MMSE ≤26) and typical CN (CDR 0 and MMSE ≥29) with an area under the curve (AUC) of 0.866 (accuracy 0.75, sensitivity 0.857, specificity 0.712). CONCLUSIONS: We built a machine learning model that can classify MCI and CN using spoken answers to neuropsychological questions. Easy-to-use MCI detection tools could be developed by incorporating this model into smartphone applications and telephone services.
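A minimal sketch of the classification step described above, assuming a Gaussian Naive Bayes model over age, gender, task scores, and speech features evaluated with ROC AUC; the data are random placeholders rather than the study's dataset.

```python
# Minimal sketch of the classification step: Gaussian Naive Bayes on age, gender,
# task scores, and speech features, evaluated with ROC AUC and accuracy.
# The data here are random placeholders, not the study's dataset.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, accuracy_score

rng = np.random.default_rng(42)
n = 120
X = np.column_stack([
    rng.normal(70, 8, n),            # age
    rng.integers(0, 2, n),           # gender (0/1)
    rng.normal(0, 1, (n, 6)),        # task scores + speech features (standardized)
])
y = rng.integers(0, 2, n)            # 1 = typical MCI, 0 = typical CN

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
clf = GaussianNB().fit(X_tr, y_tr)
prob = clf.predict_proba(X_te)[:, 1]
print("AUC:", roc_auc_score(y_te, prob), "accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```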


Subjects
Cognitive Dysfunction, Humans, Cognitive Dysfunction/diagnosis, Cognitive Dysfunction/classification, Aged, Male, Female, Middle Aged, Voice, Cognition, Neuropsychological Tests, Aged 80 and over, Machine Learning
5.
ACS Appl Mater Interfaces ; 16(38): 51274-51282, 2024 Sep 25.
Article in English | MEDLINE | ID: mdl-39285705

ABSTRACT

Artificial intelligence and human-computer interaction advances demand bioinspired sensing modalities capable of comprehending human affective states and speech. However, endowing skin-like interfaces with such intricate perception abilities remains challenging. Here, we have developed a flexible piezoresistive artificial ear (AE) sensor based on gold nanoparticles, which converts sound signals into electrical signals through changes in resistance. Tests of the sensor's performance across frequency and sound pressure level (SPL) show that the AE has a frequency response range of 20 Hz to 12 kHz and can sense sound signals from up to 5 m away at a frequency of 1 kHz and an SPL of 126 dB. Furthermore, through deep learning, the device achieves up to 96.9% and 95.0% accuracy in classification and recognition applications for seven emotional sounds and eight urban environmental noises, respectively. Hence, on the one hand, our device can monitor a patient's emotional state through their speech, such as sudden yelling and screaming, which can help healthcare workers understand the patient's condition in a timely manner. On the other hand, the device could also be used for real-time monitoring of noise levels in aircraft, ships, factories, and other high-decibel equipment and environments.
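The deep-learning classification stage could, for example, resemble a small 1-D convolutional network over fixed-length sensor waveforms. The sketch below is a hedged illustration: the architecture, input shapes, and data are assumptions and do not reproduce the authors' model.

```python
# Illustrative sketch (not the paper's model) of a small 1-D CNN that classifies
# fixed-length sensor waveforms into 7 emotion classes; shapes and layer sizes
# are assumptions chosen only to make the example self-contained.
import numpy as np
import tensorflow as tf

n_samples, n_timesteps, n_classes = 256, 2000, 7
rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, n_timesteps, 1)).astype("float32")  # resistance signal
y = rng.integers(0, n_classes, size=n_samples)

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, 9, strides=2, activation="relu", input_shape=(n_timesteps, 1)),
    tf.keras.layers.MaxPooling1D(4),
    tf.keras.layers.Conv1D(32, 9, strides=2, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
print(model.evaluate(X, y, verbose=0))   # [loss, accuracy] on the toy data
```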


Subjects
Deep Learning, Emotions, Gold, Humans, Emotions/physiology, Gold/chemistry, Metal Nanoparticles/chemistry, Voice
6.
JAMA Netw Open ; 7(9): e2435011, 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39316400

ABSTRACT

Importance: Insomnia symptoms affect an estimated 30% to 50% of the 4 million US breast cancer survivors. Previous studies have shown the effectiveness of cognitive behavioral therapy for insomnia (CBT-I), but high insomnia prevalence suggests continued opportunities for delivery via new modalities. Objective: To determine the efficacy of a CBT-I-informed, voice-activated, internet-delivered program for improving insomnia symptoms among breast cancer survivors. Design, Setting, and Participants: In this randomized clinical trial, breast cancer survivors with insomnia (Insomnia Severity Index [ISI] score >7) were recruited from advocacy and survivorship groups and an oncology clinic. Eligible patients were females aged 18 years or older who had completed curative treatment more than 3 months before enrollment and had not undergone other behavioral sleep treatments in the prior year. Individuals were assessed for eligibility and randomized between March 2022 and October 2023, with data collection completed by December 2023. Intervention: Participants were randomized 1:1 to a smart speaker with a voice-interactive CBT-I program or educational control for 6 weeks. Main Outcomes and Measures: Linear mixed models and Cohen d estimates were used to evaluate the primary outcome of changes in ISI scores and secondary outcomes of sleep quality, wake after sleep onset, sleep onset latency, total sleep time, and sleep efficiency. Results: Of 76 women enrolled (38 each in the intervention and control groups), 70 (92.1%) completed the study. Mean (SD) age was 61.2 (9.3) years; 49 (64.5%) were married or partnered, and participants were a mean (SD) of 9.6 (6.8) years from diagnosis. From baseline to follow-up, ISI scores changed by a mean (SD) of -8.4 (4.7) points in the intervention group compared with -2.6 (3.5) in the control group (P < .001) (Cohen d, 1.41; 95% CI, 0.87-1.94). Sleep diary data showed statistically significant improvements in the intervention group compared with the control group for sleep quality (0.56; 95% CI, 0.39-0.74), wake after sleep onset (9.54 minutes; 95% CI, 1.93-17.10 minutes), sleep onset latency (8.32 minutes; 95% CI, 1.91-14.70 minutes), and sleep efficiency (-0.04%; 95% CI, -0.07% to -0.01%) but not for total sleep time (0.01 hours; 95% CI, -0.27 to 0.29 hours). Conclusions and Relevance: This randomized clinical trial of an in-home, voice-activated CBT-I program among breast cancer survivors found that the intervention improved insomnia symptoms. Future studies may explore how this program can be taken to scale and integrated into ambulatory care. Trial Registration: ClinicalTrials.gov Identifier: NCT05233800.
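The reported analysis (a linear mixed model for ISI change plus a Cohen d effect size) can be approximated as follows; the simulated data, column names, and model formula are assumptions for illustration only.

```python
# Hedged sketch of the analysis style described: a linear mixed model for ISI
# change plus a Cohen's d effect size on change scores. Data, column names, and
# the model form are assumptions, not the trial's actual analysis code.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_per_arm = 38
rows = []
for group, delta in [("intervention", -8.4), ("control", -2.6)]:
    for pid in range(n_per_arm):
        baseline = rng.normal(16, 3)
        rows.append({"id": f"{group}{pid}", "group": group, "time": 0, "isi": baseline})
        rows.append({"id": f"{group}{pid}", "group": group, "time": 1,
                     "isi": baseline + delta + rng.normal(0, 4)})
df = pd.DataFrame(rows)

# Random intercept per participant; the time-by-group interaction estimates the
# between-group difference in pre-to-post change.
m = smf.mixedlm("isi ~ time * group", df, groups=df["id"]).fit()
print(m.summary())

# Cohen's d on the change scores, using the pooled standard deviation.
change = df.pivot_table(index=["id", "group"], columns="time", values="isi").reset_index()
change["delta"] = change[1] - change[0]
a = change.loc[change["group"] == "intervention", "delta"]
b = change.loc[change["group"] == "control", "delta"]
pooled = np.sqrt((a.var(ddof=1) * (len(a) - 1) + b.var(ddof=1) * (len(b) - 1)) / (len(a) + len(b) - 2))
print("Cohen's d:", (a.mean() - b.mean()) / pooled)
```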


Subjects
Breast Neoplasms, Cognitive Behavioral Therapy, Sleep Initiation and Maintenance Disorders, Humans, Female, Sleep Initiation and Maintenance Disorders/therapy, Cognitive Behavioral Therapy/methods, Middle Aged, Breast Neoplasms/complications, Aged, Cancer Survivors/psychology, Treatment Outcome, Adult, Voice
7.
Elife ; 13, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39302291

ABSTRACT

Emotional responsiveness in neonates, particularly their ability to discern vocal emotions, plays an evolutionarily adaptive role in human communication and adaptive behaviors. The developmental trajectory of emotional sensitivity in neonates is crucial for understanding the foundations of early social-emotional functioning. However, the precise onset of this sensitivity and its relationship with gestational age (GA) remain subjects of investigation. In a study involving 120 healthy neonates categorized into six groups based on their GA (ranging from 35 to 40 weeks), we explored their emotional responses to vocal stimuli. These stimuli encompassed disyllables with happy and neutral prosodies, alongside acoustically matched nonvocal control sounds. The assessments occurred during natural sleep states using the odd-ball paradigm and event-related potentials. The results reveal a distinct developmental change at 37 weeks GA, marking the point at which neonates exhibit heightened perceptual acuity for emotional vocal expressions. This newfound ability is substantiated by the presence of the mismatch response, akin to an initial form of adult mismatch negativity, elicited in response to positive emotional vocal prosody. Notably, the specificity of this perceptual shift is evident in that no such discrimination is observed for the acoustically matched control sounds. Neonates born before 37 weeks GA do not display this level of discrimination ability. This developmental change has important implications for our understanding of early social-emotional development, highlighting the role of gestational age in shaping early perceptual abilities. Moreover, while these findings introduce the potential for a valuable screening tool for conditions like autism, characterized by atypical social-emotional functions, it is important to note that the current data are not yet robust enough to fully support this application. This study makes a substantial contribution to the broader field of developmental neuroscience and holds promise for future research on early intervention in neurodevelopmental disorders.


Subjects
Emotions, Gestational Age, Humans, Newborn, Emotions/physiology, Female, Male, Evoked Potentials/physiology, Acoustic Stimulation, Voice/physiology, Auditory Perception/physiology
8.
Sci Rep ; 14(1): 20192, 2024 08 30.
Article in English | MEDLINE | ID: mdl-39215070

ABSTRACT

An automated speaker verification system uses the process of speech recognition to verify the identity of a user and block illicit access. Logical access attacks are efforts to obtain access to a system by tampering with its algorithms or data, or by circumventing security mechanisms. DeepFake attacks are a form of logical access threat that employs artificial intelligence to produce highly realistic audio clips of a human voice, which may be used to circumvent vocal authentication systems. This paper presents a framework for the detection of Logical Access and DeepFake audio spoofing by integrating audio file components and time-frequency representation spectrograms into a lower-dimensional space using sequential prediction models. A bidirectional LSTM trained on the bonafide class generates significant one-dimensional features for both classes. The feature set is then standardized to a fixed set using a novel Bags of Auditory Bites (BoAB) feature-standardizing algorithm. The Extreme Learning Machine maps the feature space to predictions that differentiate between genuine and spoofed speech. The framework is evaluated using the ASVspoof 2021 dataset, a comprehensive collection of audio recordings designed for evaluating the strength of speaker verification systems against spoofing attacks. It achieves favorable results on synthesized DeepFake attacks with an Equal Error Rate (EER) of 1.18% in the optimal setting. Logical Access attacks were more challenging to detect, with an EER of 12.22%. Compared to the state of the art on the ASVspoof 2021 dataset, the proposed method notably improves the EER for DeepFake attacks, with an improvement rate of 95.16%.
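A rough sketch of the two stages named above: a bidirectional LSTM that maps spectrogram frames to a one-dimensional feature sequence, and an Extreme Learning Machine (random hidden layer with a least-squares readout) as the classifier. Sizes and training details are assumed, and the BoAB standardization step is omitted; this is not the paper's implementation.

```python
# Rough sketch only: BiLSTM encoder producing a 1-D feature sequence per utterance,
# followed by a toy Extreme Learning Machine. Shapes, sizes, and training details
# are assumptions; the encoder is left untrained here for brevity.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
n_utt, n_frames, n_bins = 64, 100, 40
specs = rng.normal(size=(n_utt, n_frames, n_bins)).astype("float32")  # spectrograms
labels = rng.integers(0, 2, size=n_utt)          # 1 = bonafide, 0 = spoofed

# Stage 1: BiLSTM encoder mapping each frame to one value (a 1-D feature sequence).
encoder = tf.keras.Sequential([
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True),
                                  input_shape=(n_frames, n_bins)),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1)),
])
features = encoder.predict(specs, verbose=0).squeeze(-1)     # (n_utt, n_frames)

# Stage 2: toy Extreme Learning Machine: random hidden layer + closed-form readout.
hidden = 256
W = rng.normal(size=(features.shape[1], hidden))
H = np.tanh(features @ W)                                     # random hidden layer
beta, *_ = np.linalg.lstsq(H, np.eye(2)[labels], rcond=None)  # least-squares readout
pred = (H @ beta).argmax(axis=1)
print("training accuracy of the toy ELM:", (pred == labels).mean())
```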


Subjects
Algorithms, Humans, Speech Recognition Software, Computer Security, Voice, Speech, Artificial Intelligence
9.
Trends Hear ; 28: 23312165241275895, 2024.
Article in English | MEDLINE | ID: mdl-39212078

ABSTRACT

Auditory training can lead to notable enhancements in specific tasks, but whether these improvements generalize to untrained tasks like speech-in-noise (SIN) recognition remains uncertain. This study examined how training conditions affect generalization. Fifty-five young adults were divided into "Trained-in-Quiet" (n = 15), "Trained-in-Noise" (n = 20), and "Control" (n = 20) groups. Participants completed two sessions. The first session involved an assessment of SIN recognition and voice discrimination (VD) with word or sentence stimuli, employing combined fundamental frequency (F0) + formant frequencies voice cues. Subsequently, only the trained groups proceeded to an interleaved training phase, encompassing six VD blocks with sentence stimuli, utilizing either F0-only or formant-only cues. The second session replicated the interleaved training for the trained groups, followed by a second assessment conducted by all three groups, identical to the first session. Results showed significant improvements in the trained task regardless of training conditions. However, VD training with a single cue did not enhance VD with both cues beyond control group improvements, suggesting limited generalization. Notably, the Trained-in-Noise group exhibited the most significant SIN recognition improvements posttraining, implying generalization across tasks that share similar acoustic conditions. Overall, findings suggest training conditions impact generalization by influencing processing levels associated with the trained task. Training in noisy conditions may prompt higher auditory and/or cognitive processing than training in quiet, potentially extending skills to tasks involving challenging listening conditions, such as SIN recognition. These insights hold significant theoretical and clinical implications, potentially advancing the development of effective auditory training protocols.


Subjects
Acoustic Stimulation, Cues (Psychology), Psychological Generalization, Noise, Speech Perception, Humans, Male, Female, Young Adult, Speech Perception/physiology, Noise/adverse effects, Adult, Psychological Recognition, Perceptual Masking, Adolescent, Speech Acoustics, Voice Quality, Discrimination Learning/physiology, Voice/physiology
10.
Int J Med Inform ; 191: 105583, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39096595

ABSTRACT

BACKGROUND: Traditional classifiers for disease classification, such as K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM), often struggle with high-dimensional medical datasets. OBJECTIVE: This study presents a novel classifier based on Gower distance to overcome the limitations of traditional classifiers in Parkinson's disease (PD) detection. METHODS: We use the Gower distance metric to handle diverse feature sets in voice recordings; it acts as a dissimilarity measure for all feature types, making the model adept at identifying subtle patterns indicative of PD. Additionally, the Cuckoo Search algorithm is employed for feature selection, reducing dimensionality by focusing on key features and thereby lessening the computational load associated with high-dimensional datasets. RESULTS: The proposed Gower distance-based classifier achieved an accuracy of 98.3% with feature selection and 94.92% without it. It outperforms traditional classifiers and recent studies in PD detection from voice recordings. CONCLUSIONS: This accuracy demonstrates the approach's ability to classify instances correctly and its potential as a reliable diagnostic tool for medical practitioners. The findings indicate that the proposed approach holds promise for improving the diagnosis and monitoring of PD, both in medical institutions and in homes for the elderly.
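Gower dissimilarity averages per-feature distances, using range-scaled absolute differences for numeric features and simple mismatch for categorical ones. The sketch below illustrates the metric with a toy 1-nearest-neighbour decision; the data and decision rule are assumptions, not the authors' exact classifier.

```python
# Minimal sketch of a Gower dissimilarity for mixed numeric/categorical features,
# used here with a toy 1-nearest-neighbour decision. Data and the classification
# rule are illustrative assumptions only.
import numpy as np

def gower_distance(a, b, numeric_mask, ranges):
    """Average per-feature dissimilarity: range-scaled absolute difference for
    numeric features, simple mismatch (0/1) for categorical ones."""
    d = np.empty(len(a))
    num = numeric_mask
    d[num] = np.abs(a[num] - b[num]) / ranges[num]
    d[~num] = (a[~num] != b[~num]).astype(float)
    return d.mean()

rng = np.random.default_rng(0)
X_num = rng.normal(size=(50, 4))                 # e.g. jitter, shimmer, HNR, F0
X_cat = rng.integers(0, 2, size=(50, 2))         # e.g. gender, recording condition
X = np.hstack([X_num, X_cat]).astype(float)
y = rng.integers(0, 2, size=50)                  # 1 = PD, 0 = healthy

numeric_mask = np.array([True] * 4 + [False] * 2)
ranges = X.max(axis=0) - X.min(axis=0)
ranges[ranges == 0] = 1.0                        # guard against zero range

query = X[0]
dists = np.array([gower_distance(query, X[i], numeric_mask, ranges) for i in range(1, 50)])
print("1-NN prediction for the query:", y[1:][dists.argmin()], "true label:", y[0])
```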


Subjects
Algorithms, Parkinson Disease, Voice, Parkinson Disease/diagnosis, Parkinson Disease/classification, Humans, Male, Female, Aged, Support Vector Machine, Middle Aged
11.
Sci Rep ; 14(1): 19012, 2024 08 28.
Article in English | MEDLINE | ID: mdl-39198592

ABSTRACT

Glucose levels in the body have been hypothesized to affect voice characteristics. One of the primary proposed explanations for such voice changes is Hooke's law, whereby a variation in the tension, mass, or length of the vocal folds, mediated by the body's glucose levels, results in an alteration of their vibrational frequency. To explore this hypothesis, 505 participants were fitted with a continuous glucose monitor (CGM) and instructed to record their voice using a custom mobile application up to six times daily for 2 weeks. Glucose values from the CGM were paired with voice recordings to create a sampled dataset that closely resembled the glucose profile of the comprehensive CGM dataset. Glucose levels and fundamental frequency (F0) had a significant positive association within an individual, and a 1 mg/dL increase in CGM-recorded glucose corresponded to a 0.02 Hz increase in F0 (CI 0.01-0.03 Hz, P < 0.001). This effect was also observed when the participants were split into non-diabetic, prediabetic, and Type 2 diabetic classifications (P = 0.03, P = 0.01, and P = 0.01, respectively). Vocal F0 increased with blood glucose levels, but future predictive models of glucose levels based on voice may need to be personalized due to high intraclass correlation.
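The within-individual association could be estimated with a random-intercept mixed model, as sketched below; the simulation uses the reported 0.02 Hz per mg/dL slope as ground truth, while sample sizes and noise levels are assumptions.

```python
# Sketch of estimating a within-person glucose-to-F0 slope with a random-intercept
# mixed model. The simulated data use the reported 0.02 Hz per mg/dL as ground
# truth; everything else (noise, sample sizes) is an assumption for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
rows = []
for pid in range(60):                      # participants
    base_f0 = rng.normal(160, 30)          # person-specific baseline F0 (Hz)
    for _ in range(40):                    # repeated voice recordings
        glucose = rng.normal(120, 25)      # CGM glucose (mg/dL)
        f0 = base_f0 + 0.02 * glucose + rng.normal(0, 2)
        rows.append({"pid": pid, "glucose": glucose, "f0": f0})
df = pd.DataFrame(rows)

model = smf.mixedlm("f0 ~ glucose", df, groups=df["pid"]).fit()
print(model.params["glucose"])             # recovers a slope near 0.02 Hz per mg/dL
```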


Subjects
Blood Glucose, Type 2 Diabetes Mellitus, Voice, Humans, Type 2 Diabetes Mellitus/blood, Blood Glucose/analysis, Male, Female, Middle Aged, Voice/physiology, Adult, Aged, Blood Glucose Self-Monitoring/methods
12.
JMIR Aging ; 7: e55126, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39173144

ABSTRACT

BACKGROUND: With the aging global population and the rising burden of Alzheimer disease and related dementias (ADRDs), there is a growing focus on identifying mild cognitive impairment (MCI) to enable timely interventions that could potentially slow down the onset of clinical dementia. The production of speech by an individual is a cognitively complex task that engages various cognitive domains. The ease of audio data collection highlights the potential cost-effectiveness and noninvasive nature of using human speech as a tool for cognitive assessment. OBJECTIVE: This study aimed to construct a machine learning pipeline that incorporates speaker diarization, feature extraction, feature selection, and classification to identify a set of acoustic features derived from voice recordings that exhibit strong MCI detection capability. METHODS: The study included 100 MCI cases and 100 cognitively normal controls matched for age, sex, and education from the Framingham Heart Study. Participants' spoken responses on neuropsychological tests were recorded, and the recorded audio was processed to identify segments of each participant's voice from recordings that included voices of both testers and participants. A comprehensive set of 6385 acoustic features was then extracted from these voice segments using OpenSMILE and Praat software. Subsequently, a random forest model was constructed to classify cognitive status using the features that exhibited significant differences between the MCI and cognitively normal groups. The MCI detection performance of various audio lengths was further examined. RESULTS: An optimal subset of 29 features was identified that resulted in an area under the receiver operating characteristic curve of 0.87, with a 95% CI of 0.81-0.94. The most important acoustic feature for MCI classification was the number of filled pauses (importance score=0.09, P=3.10E-08). There was no substantial difference in the performance of the model trained on the acoustic features derived from different lengths of voice recordings. CONCLUSIONS: This study showcases the potential of monitoring changes to nonsemantic and acoustic features of speech as a way of early ADRD detection and motivates future opportunities for using human speech as a measure of brain health.
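A hedged sketch of the classification stage: select acoustic features that differ between groups, then fit a random forest and report a cross-validated AUC. Feature counts, thresholds, and data are placeholders, and unlike this simplified sketch, a real pipeline should nest the selection inside the cross-validation folds.

```python
# Illustrative sketch of the classification stage: univariate selection of acoustic
# features that differ between MCI and control groups, then a random forest with a
# cross-validated AUC. Data and thresholds are placeholders; note that selecting
# features on the full dataset (as done here for brevity) leaks information, so in
# practice the selection should be nested inside the CV folds.
import numpy as np
from scipy.stats import ttest_ind
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict, StratifiedKFold
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 300))            # stand-in for thousands of openSMILE features
y = rng.integers(0, 2, size=200)           # 1 = MCI, 0 = cognitively normal

# Keep features whose group difference is nominally significant.
_, p = ttest_ind(X[y == 1], X[y == 0], axis=0)
selected = X[:, p < 0.05]
print("features kept:", selected.shape[1])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0)
prob = cross_val_predict(clf, selected, y, cv=cv, method="predict_proba")[:, 1]
print("cross-validated AUC:", roc_auc_score(y, prob))
```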


Subjects
Cognitive Dysfunction, Humans, Cognitive Dysfunction/diagnosis, Cognitive Dysfunction/physiopathology, Female, Male, Aged, Voice/physiology, Machine Learning, Neuropsychological Tests, Middle Aged, Aged 80 and over, Case-Control Studies, Speech Acoustics
13.
J Acoust Soc Am ; 156(2): 1283-1308, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39172710

ABSTRACT

Sound for the human voice is produced by vocal fold flow-induced vibration and involves a complex coupling between flow dynamics, tissue motion, and acoustics. Over the past three decades, synthetic, self-oscillating vocal fold models have played an increasingly important role in the study of these complex physical interactions. In particular, two types of models have been established: "membranous" vocal fold models, such as a water-filled latex tube, and "elastic solid" models, such as ultrasoft silicone formed into a vocal fold-like shape and in some cases with multiple layers of differing stiffness to mimic the human vocal fold tissue structure. In this review, the designs, capabilities, and limitations of these two types of models are presented. Considerations unique to the implementation of elastic solid models, including fabrication processes and materials, are discussed. Applications in which these models have been used to study the underlying mechanical principles that govern phonation are surveyed, and experimental techniques and configurations are reviewed. Finally, recommendations for continued development of these models for even more lifelike response and clinical relevance are summarized.


Subjects
Phonation, Vibration, Vocal Cords, Vocal Cords/physiology, Vocal Cords/anatomy & histology, Humans, Anatomic Models, Biomechanical Phenomena, Voice/physiology, Elasticity, Biological Models
14.
PLoS One ; 19(8): e0306866, 2024.
Article in English | MEDLINE | ID: mdl-39146267

ABSTRACT

Low-dimensional materials have demonstrated strong potential for use in diverse flexible strain sensors for wearable electronic device applications. However, the limited contact area in the sensing layer, caused by the low specific surface area of typical nanomaterials, hinders the pursuit of high-performance strain-sensor applications. Herein, we report an efficient method for synthesizing TiO2-based nanocomposite materials with ultrahigh specific surface areas, directly from industrial raw materials, for use in strain sensors. A kinetic study of the self-seeded thermal hydrolysis sulfate process was conducted for the controllable synthesis of pure TiO2 and related TiO2/graphene composites. The hydrolysis readily modified the crystal form and morphology of the prepared TiO2 nanoparticles, and the prepared composite samples possessed a uniform nanoporous structure. Experiments demonstrated that the TiO2/graphene composite can be used in strain sensors with a maximum gauge factor of 252. In addition, the TiO2/graphene composite-based strain sensor showed high stability, operating continuously over 1,000 loading cycles and through aging tests lasting three months. The fabricated strain sensors also show potential for human voice recognition by characterizing letters, words, and musical tones.
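The gauge factor quoted above is conventionally defined as the relative resistance change per unit strain, GF = (ΔR/R0)/ε. The numbers in the short example below are made up purely to illustrate what a GF of 252 implies.

```python
# Conventional gauge factor definition: GF = (dR/R0) / strain. The values below
# are made-up illustrations of what a GF of 252 implies, not measured data.
def gauge_factor(r0, r, strain):
    """GF = (R - R0) / R0 / strain."""
    return (r - r0) / r0 / strain

r0 = 1000.0                     # unstrained resistance (ohms), assumed
strain = 0.01                   # 1 % applied strain
r = r0 * (1 + 252 * strain)     # resistance a GF-252 sensor would show at 1 % strain
print(gauge_factor(r0, r, strain))   # recovers ~252
```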


Subjects
Graphite, Nanocomposites, Titanium, Titanium/chemistry, Graphite/chemistry, Humans, Nanocomposites/chemistry, Voice, Wearable Electronic Devices
15.
J Med Internet Res ; 26: e57258, 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39110963

ABSTRACT

BACKGROUND: The integration of smart technologies, including wearables and voice-activated devices, is increasingly recognized for enhancing the independence and well-being of older adults. However, the long-term dynamics of their use and the coadaptation process with older adults remain poorly understood. This scoping review explores how interactions between older adults and smart technologies evolve over time to improve both user experience and technology utility. OBJECTIVE: This review synthesizes existing research on the coadaptation between older adults and smart technologies, focusing on longitudinal changes in use patterns, the effectiveness of technological adaptations, and the implications for future technology development and deployment to improve user experiences. METHODS: Following the Joanna Briggs Institute Reviewer's Manual and PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines, this scoping review examined peer-reviewed papers from databases including Ovid MEDLINE, Ovid Embase, PEDro, Ovid PsycINFO, and EBSCO CINAHL from the year 2000 to August 28, 2023, and included forward and backward searches. The search was updated on March 1, 2024. Empirical studies were included if they involved (1) individuals aged 55 years or older living independently and (2) focused on interactions and adaptations between older adults and wearables and voice-activated virtual assistants in interventions for a minimum period of 8 weeks. Data extraction was informed by the selection and optimization with compensation framework and the sex- and gender-based analysis plus theoretical framework and used a directed content analysis approach. RESULTS: The search yielded 16,143 papers. Following title and abstract screening and a full-text review, 5 papers met the inclusion criteria. Study populations were mostly female participants and aged 73-83 years from the United States and engaged with voice-activated virtual assistants accessed through smart speakers and wearables. Users frequently used simple commands related to music and weather, integrating devices into daily routines. However, communication barriers often led to frustration due to devices' inability to recognize cues or provide personalized responses. The findings suggest that while older adults can integrate smart technologies into their lives, a lack of customization and user-friendly interfaces hinder long-term adoption and satisfaction. The studies highlight the need for technology to be further developed so they can better meet this demographic's evolving needs and call for research addressing small sample sizes and limited diversity. CONCLUSIONS: Our findings highlight a critical need for continued research into the dynamic and reciprocal relationship between smart technologies and older adults over time. Future studies should focus on more diverse populations and extend monitoring periods to provide deeper insights into the coadaptation process. Insights gained from this review are vital for informing the development of more intuitive, user-centric smart technology solutions to better support the aging population in maintaining independence and enhancing their quality of life. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.2196/51129.


Subjects
Wearable Electronic Devices, Humans, Aged, Middle Aged, Female, Male, Aged 80 and over, Voice, Longitudinal Studies
16.
J Acoust Soc Am ; 156(2): 922-938, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39133041

ABSTRACT

Voices arguably occupy a superior role in auditory processing. Specifically, studies have reported that singing voices are processed faster and more accurately and possess greater salience in musical scenes compared to instrumental sounds. However, the underlying acoustic features of this superiority and the generality of these effects remain unclear. This study investigates the impact of frequency micro-modulations (FMM) and the influence of interfering sounds on sound recognition. Thirty young participants, half with musical training, engage in three sound recognition experiments featuring short vocal and instrumental sounds in a go/no-go task. Accuracy and reaction times are measured for sounds from recorded samples and excerpts of popular music. Each sound is presented in separate versions with and without FMM, in isolation or accompanied by a piano. Recognition varies across sound categories, but no general vocal superiority emerges and no effects of FMM. When presented together with interfering sounds, all sounds exhibit degradation in recognition. However, whereas /a/ sounds stand out by showing a distinct robustness to interference (i.e., less degradation of recognition), /u/ sounds lack this robustness. Acoustical analysis implies that recognition differences can be explained by spectral similarities. Together, these results challenge the notion of general vocal superiority in auditory perception.


Subjects
Acoustic Stimulation, Auditory Perception, Music, Psychological Recognition, Humans, Male, Female, Young Adult, Adult, Acoustic Stimulation/methods, Auditory Perception/physiology, Reaction Time, Singing, Voice/physiology, Adolescent, Sound Spectrography, Voice Quality
17.
J Exp Psychol Hum Percept Perform ; 50(9): 918-933, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39101929

ABSTRACT

Affective stimuli in our environment indicate reward or threat and thereby relate to approach and avoidance behavior. Previous findings suggest that affective stimuli may bias visual perception, but it remains unclear whether similar biases exist in the auditory domain. Therefore, we asked whether affective auditory voices (angry vs. neutral) influence sound distance perception. Two VR experiments (data collection 2021-2022) were conducted in which auditory stimuli were presented via loudspeakers located at positions unknown to the participants. In the first experiment (N = 44), participants actively placed a visually presented virtual agent or virtual loudspeaker in an empty room at the perceived sound source location. In the second experiment (N = 32), participants were standing in front of several virtual agents or virtual loudspeakers and had to indicate the sound source by directing their gaze toward the perceived sound location. Results in both preregistered experiments consistently showed that participants estimated the location of angry voice stimuli at greater distances than the location of neutral voice stimuli. We discuss that neither emotional nor motivational biases can account for these results. Instead, distance estimates seem to rely on listeners' representations regarding the relationship between vocal affect and acoustic characteristics. (PsycInfo Database Record (c) 2024 APA, all rights reserved).


Subjects
Affect, Humans, Adult, Female, Male, Young Adult, Affect/physiology, Distance Perception/physiology, Sound Localization/physiology, Voice/physiology, Virtual Reality, Anger/physiology, Auditory Perception/physiology
18.
Cognition ; 250: 105866, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38971020

ABSTRACT

Language experience confers a benefit to voice learning, a concept described in the literature as the language familiarity effect (LFE). What experiences are necessary for the LFE to be conferred is less clear. We contribute empirically and theoretically to this debate by examining within- and across-language voice learning with Cantonese-English bilingual voices in a talker-voice association paradigm. Listeners were trained in Cantonese or English and assessed on their abilities to generalize voice learning at test on Cantonese and English utterances. By testing listeners from four language backgrounds - English Monolingual, Cantonese-English Multilingual, Tone Multilingual, and Non-tone Multilingual groups - we assess whether the LFE and group-level differences in voice learning are due to varying abilities (1) in accessing the relative acoustic-phonetic features that distinguish a voice, (2) in learning at a given rate, or (3) in generalizing learning of talker-voice associations to novel same-language and different-language utterances. These four language background groups allow us to investigate the roles of language-specific familiarity, tone language experience, and generic multilingual experience in voice learning. Differences in performance across listener groups show evidence in support of the LFE and the role of two mechanisms for voice learning: the extraction and association of talker-specific, language-general information that is more robustly generalized across languages, and talker-specific, language-specific information that may be more readily accessible and learnable but, due to its language-specific nature, is less able to be extended to another language.


Subjects
Learning, Multilingualism, Speech Perception, Voice, Humans, Voice/physiology, Speech Perception/physiology, Female, Male, Learning/physiology, Adult, Young Adult, Language, Psychological Recognition/physiology, Phonetics
19.
Hum Brain Mapp ; 45(10): e26724, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39001584

ABSTRACT

Music is ubiquitous, both in its instrumental and vocal forms. While speech perception at birth has been at the core of an extensive corpus of research, the origins of the ability to discriminate instrumental or vocal melodies are still not well investigated. In previous studies comparing vocal and musical perception, the vocal stimuli were mainly related to speaking, including language, and not to the non-language singing voice. In the present study, to better compare a melodic instrumental line with the voice, we used singing as the comparison stimulus, to reduce the dissimilarities between the two stimuli as much as possible and to separate language perception from vocal musical perception. A total of 45 newborns were scanned, 10 full-term infants and 35 preterm infants at term-equivalent age (mean gestational age at test = 40.17 weeks, SD = 0.44), using functional magnetic resonance imaging while they listened to five melodies played by a musical instrument (flute) or sung by a female voice. To examine dynamic task-based effective connectivity, we employed a psychophysiological interaction of co-activation patterns (PPI-CAPs) analysis, using the auditory cortices as the seed region, to investigate moment-to-moment changes in task-driven modulation of cortical activity during the fMRI task. Our findings reveal condition-specific, dynamically occurring patterns of co-activation (PPI-CAPs). During the vocal condition, the auditory cortex co-activates with the sensorimotor and salience networks, while during the instrumental condition, it co-activates with the visual cortex and the superior frontal cortex. Our results show that the vocal stimulus elicits sensorimotor aspects of auditory perception and is processed as a more salient stimulus, while the instrumental condition activates higher-order cognitive and visuo-spatial networks. Common neural signatures for both auditory stimuli were found in the precuneus and posterior cingulate gyrus. Finally, this study adds knowledge on the dynamic brain connectivity underlying newborns' capability for early and specialized auditory processing, highlighting the relevance of dynamic approaches to studying brain function in newborn populations.


Subjects
Auditory Perception, Magnetic Resonance Imaging, Music, Humans, Female, Male, Auditory Perception/physiology, Newborn, Singing/physiology, Premature Infant/physiology, Brain Mapping, Acoustic Stimulation, Brain/physiology, Brain/diagnostic imaging, Voice/physiology
20.
J Acoust Soc Am ; 156(1): 278-283, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38980102

ABSTRACT

How we produce and perceive voice is constrained by laryngeal physiology and biomechanics. Such constraints may present themselves as principal dimensions in the voice outcome space that are shared among speakers. This study attempts to identify such principal dimensions in the voice outcome space and the underlying laryngeal control mechanisms in a three-dimensional computational model of voice production. A large-scale voice simulation was performed with parametric variations in vocal fold geometry and stiffness, glottal gap, vocal tract shape, and subglottal pressure. Principal component analysis was applied to data combining both the physiological control parameters and voice outcome measures. The results showed three dominant dimensions accounting for at least 50% of the total variance. The first two dimensions describe respiratory-laryngeal coordination in controlling the energy balance between low- and high-frequency harmonics in the produced voice, and the third dimension describes control of the fundamental frequency. The dominance of these three dimensions suggests that voice changes along these principal dimensions are likely to be more consistently produced and perceived by most speakers than other voice changes, and thus are more likely to have emerged during evolution and be used to convey important personal information, such as emotion and larynx size.
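The analysis idea (principal component analysis over a standardized matrix that combines control parameters and voice outcome measures, then inspecting the variance explained by the leading components) can be sketched as follows; the synthetic data and dimensionalities are assumptions.

```python
# Sketch of the analysis idea: PCA on a standardized matrix combining physiological
# control parameters and voice outcome measures, then checking how much variance
# the leading components explain. The data here are synthetic placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_sim = 5000
# Columns stand in for e.g. stiffness, glottal gap, subglottal pressure, F0, SPL,
# spectral slope; correlations are induced so a few components dominate.
latent = rng.normal(size=(n_sim, 3))
mixing = rng.normal(size=(3, 8))
data = latent @ mixing + 0.3 * rng.normal(size=(n_sim, 8))

pca = PCA().fit(StandardScaler().fit_transform(data))
explained = pca.explained_variance_ratio_
print("variance explained by the first three components:", explained[:3].sum())
```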


Subjects
Larynx, Phonation, Principal Component Analysis, Humans, Biomechanical Phenomena, Larynx/physiology, Larynx/anatomy & histology, Voice/physiology, Vocal Cords/physiology, Vocal Cords/anatomy & histology, Computer Simulation, Voice Quality, Speech Acoustics, Pressure, Biological Models, Anatomic Models