Results 1 - 20 of 1,164
1.
Neuroimage; 285: 120483, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38048921

ABSTRACT

The integration of information from different sensory modalities is a fundamental process that enhances perception and performance in real and virtual reality (VR) environments. Understanding these mechanisms, especially during learning tasks that exploit novel multisensory cue combinations, provides opportunities for the development of new rehabilitative interventions. This study aimed to investigate how functional brain changes support behavioural performance improvements during an audio-visual (AV) learning task. Twenty healthy participants underwent 30 min of daily VR training for four weeks. The task was an AV adaptation of a 'scanning training' paradigm that is commonly used in hemianopia rehabilitation. Functional magnetic resonance imaging (fMRI) and performance data were collected at baseline, after two and four weeks of training, and four weeks post-training. We show that behavioural performance, operationalised as mean reaction time (RT) reduction in VR, significantly improves. In separate tests in a controlled laboratory environment, we showed that the behavioural performance gains in the VR training environment transferred to a significant mean RT reduction for the trained AV voluntary task on a computer screen. Enhancements were observed in both the visual-only and AV conditions, with the latter demonstrating a faster response time supported by the presence of audio cues. The behavioural learning effect also transferred to two additional tasks that were tested: a visual search task and an involuntary visual task. Our fMRI results reveal an increase in functional activation (BOLD signal) in multisensory brain regions involved in early-stage AV processing: the thalamus, the caudal inferior parietal lobe and the cerebellum. These functional changes were only observed for the trained, multisensory task and not for unimodal visual stimulation. Functional activation changes in the thalamus were significantly correlated with behavioural performance improvements. This study demonstrates that incorporating spatial auditory cues into voluntary visual training in VR leads to augmented brain activation changes in multisensory integration, resulting in measurable performance gains across tasks. The findings highlight the potential of VR-based multisensory training as an effective method for enhancing cognitive function and as a potentially valuable tool in rehabilitative programmes.


Subjects
Magnetic Resonance Imaging , Virtual Reality , Humans , Learning , Brain/physiology , Visual Perception , Blindness , Auditory Perception
2.
J Neurophysiol; 131(6): 1311-1327, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38718414

ABSTRACT

Tinnitus is the perception of a continuous sound in the absence of an external source. Although the role of the auditory system is well investigated, there is a gap in our understanding of how multisensory signals are integrated to produce a single percept in tinnitus. Here, we train participants to learn a new sensory environment by associating a cue with a target signal that varies in perceptual threshold. In the test phase, we present only the cue to see whether the person perceives an illusion of the target signal. We performed two separate experiments to observe the behavioral and electrophysiological responses to the learning and test phases in 1) healthy young adults and 2) people with continuous subjective tinnitus and matched control subjects. We observed that, in both parts of the study, the percentage of false alarms was negatively correlated with the 75% detection threshold. Additionally, the perception of an illusion was accompanied by an increased evoked response potential in frontal regions of the brain. Furthermore, in patients with tinnitus, we observed no significant difference in behavioral or evoked responses in the auditory paradigm, whereas patients with tinnitus were more likely to report false alarms, along with increased evoked activity during the learning and test phases, in the visual paradigm. This emphasizes the importance of the integrity of sensory pathways in multisensory integration and how this process may be disrupted in people with tinnitus. The present study also provides preliminary evidence that tinnitus patients may build stronger perceptual models, a hypothesis that future studies with larger populations will need to confirm. NEW & NOTEWORTHY: Tinnitus is the continuous phantom perception of a ringing in the ears. Recently, it has been suggested that tinnitus may be a maladaptive inference by the brain in response to auditory anomalies, whether they are detected or undetected by an audiogram. The present study presents empirical evidence for this hypothesis by inducing an illusion in a sensory domain that is damaged (auditory) and one that is intact (visual). It also presents novel information about how people with tinnitus process multisensory stimuli in the audio-visual domain.


Subjects
Auditory Perception , Bayes Theorem , Illusions , Tinnitus , Humans , Tinnitus/physiopathology , Pilot Projects , Male , Female , Adult , Auditory Perception/physiology , Illusions/physiology , Visual Perception/physiology , Young Adult , Electroencephalography , Acoustic Stimulation , Cues
3.
Eur J Neurosci; 59(12): 3203-3223, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38637993

ABSTRACT

Social communication draws on several cognitive functions such as perception, emotion recognition and attention. The association of audio-visual information is essential to the processing of species-specific communication signals. In this study, we used functional magnetic resonance imaging to identify the subcortical areas involved in the cross-modal association of visual and auditory information based on their common social meaning. We identified three subcortical regions involved in audio-visual processing of species-specific communicative signals: the dorsolateral amygdala, the claustrum and the pulvinar. These regions responded to visual, auditory congruent and audio-visual stimulations. However, none of them was significantly activated when the auditory stimuli were semantically incongruent with the visual context, thus showing an influence of visual context on auditory processing. For example, positive vocalizations (coos) activated the three subcortical regions when presented in the context of a positive facial expression (lipsmacks) but not when presented in the context of a negative facial expression (aggressive faces). In addition, the medial pulvinar and the amygdala showed multisensory integration, such that audiovisual stimuli resulted in activations that were significantly higher than those observed for the highest unimodal response. Lastly, the pulvinar responded in a task-dependent manner, along a specific spatial sensory gradient. We propose that the dorsolateral amygdala, the claustrum and the pulvinar belong to a multisensory network that modulates the perception of visual socioemotional information and vocalizations as a function of the relevance of the stimuli in the social context. SIGNIFICANCE STATEMENT: Understanding and correctly associating socioemotional information across sensory modalities, such that happy faces predict laughter and escape scenes predict screams, is essential when living in complex social groups. Using functional magnetic resonance imaging in the awake macaque, we identify three subcortical structures (the dorsolateral amygdala, claustrum and pulvinar) that only respond to auditory information that matches the ongoing visual socioemotional context, such as hearing positively valenced coo calls while seeing positively valenced images of monkeys grooming each other. We additionally describe task-dependent activations in the pulvinar, organized along a specific spatial sensory gradient, supporting its role as a network regulator.


Subjects
Amygdala , Auditory Perception , Claustrum , Magnetic Resonance Imaging , Pulvinar , Visual Perception , Pulvinar/physiology , Amygdala/physiology , Amygdala/diagnostic imaging , Male , Animals , Auditory Perception/physiology , Claustrum/physiology , Visual Perception/physiology , Female , Facial Expression , Macaca , Photic Stimulation/methods , Brain Mapping , Acoustic Stimulation , Vocalization, Animal/physiology , Social Perception
4.
Hum Brain Mapp; 45(11): e26797, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39041175

ABSTRACT

Speech comprehension is crucial for human social interaction, relying on the integration of auditory and visual cues across various levels of representation. While research has extensively studied multisensory integration (MSI) using idealised, well-controlled stimuli, there is a need to understand this process in response to complex, naturalistic stimuli encountered in everyday life. This study investigated behavioural and neural MSI in neurotypical adults experiencing audio-visual speech within a naturalistic, social context. Our novel paradigm incorporated a broader social situational context, complete words, and speech-supporting iconic gestures, allowing for context-based pragmatics and semantic priors. We investigated MSI in the presence of unimodal (auditory or visual) or complementary, bimodal speech signals. During audio-visual speech trials, compared to unimodal trials, participants more accurately recognised spoken words and showed a more pronounced suppression of alpha power, an indicator of heightened integration load. Importantly, on the neural level, these effects surpassed mere summation of unimodal responses, suggesting non-linear MSI mechanisms. Overall, our findings demonstrate that typically developing adults integrate audio-visual speech and gesture information to facilitate speech comprehension in noisy environments, highlighting the importance of studying MSI in ecologically valid contexts.


Subjects
Gestures , Speech Perception , Humans , Female , Male , Speech Perception/physiology , Young Adult , Adult , Visual Perception/physiology , Electroencephalography , Comprehension/physiology , Acoustic Stimulation , Speech/physiology , Brain/physiology , Photic Stimulation/methods
5.
Glob Chang Biol; 30(1): e17056, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38273542

ABSTRACT

Ecosystem functions and services are severely threatened by unprecedented global loss in biodiversity. To counteract these trends, it is essential to develop systems to monitor changes in biodiversity for planning, evaluating, and implementing conservation and mitigation actions. However, the implementation of monitoring systems suffers from a trade-off between grain (i.e., the level of detail), extent (i.e., the number of study sites), and temporal repetition. Here, as a solution to these challenges, we present a deployed networked sensor system for integrated biodiversity monitoring from the Nature 4.0 project, which treats plants and animals not only as targets of investigation but also, by carrying sensors, as parts of the modular sensor network. Our networked sensor system consists of three closely interlinked components with a modular structure: sensors, data transmission, and data storage, which are integrated into pipelines for automated biodiversity monitoring. We present real-world application examples, share our experiences in operating them, and provide the open data we collected. Our flexible, low-cost, and open-source solutions can be applied to monitoring individual and multiple terrestrial plants and animals as well as their interactions. Ultimately, our system can also be applied to area-wide ecosystem mapping tasks, thereby providing a cost-efficient and powerful solution for biodiversity monitoring. Building upon our experiences in the Nature 4.0 project, we identified ten key challenges that need to be addressed to better understand and counteract the ongoing loss of biodiversity using networked sensor systems. To tackle these challenges, interdisciplinary collaboration, additional research, and practical solutions are necessary to enhance the capability and applicability of networked sensor systems for researchers and practitioners, ultimately helping to ensure the sustainable management of ecosystems and the provision of ecosystem services.


Subjects
Conservation of Natural Resources , Ecosystem , Animals , Biodiversity , Plants
6.
Stat Med; 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39237100

ABSTRACT

From early in the coronavirus disease 2019 (COVID-19) pandemic, there was interest in using machine learning methods to predict COVID-19 infection status based on vocal audio signals, for example, cough recordings. However, early studies had limitations in terms of data collection and of how the performances of the proposed predictive models were assessed. This article describes how these limitations have been overcome in a study carried out by the Turing-RSS Health Data Laboratory and the UK Health Security Agency. As part of the study, the UK Health Security Agency collected a dataset of acoustic recordings, SARS-CoV-2 infection status and extensive study participant meta-data. This allowed us to rigorously assess state-of-the-art machine learning techniques to predict SARS-CoV-2 infection status based on vocal audio signals. The lessons learned from this project should inform future studies on statistical evaluation methods to assess the performance of machine learning techniques for public health tasks.
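
A central lesson of such studies is that performance estimates are only meaningful when recordings from the same participant never appear in both training and test data. Below is a minimal sketch of a participant-disjoint evaluation using scikit-learn; the classifier, features, and number of splits are illustrative assumptions, not the study's actual pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GroupKFold

def participant_disjoint_auc(X, y, groups, n_splits=5):
    """Cross-validated ROC AUC where `groups` (participant IDs) never
    straddle the train/test boundary, avoiding identity leakage."""
    aucs = []
    for train_idx, test_idx in GroupKFold(n_splits=n_splits).split(X, y, groups):
        clf = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
        scores = clf.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], scores))
    return float(np.mean(aucs))
```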

7.
Dev Sci; 27(2): e13436, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37551932

ABSTRACT

The environment in which infants learn language is multimodal and rich with social cues. Yet, the effects of such cues, such as eye contact, on early speech perception have not been closely examined. This study assessed the role of ostensive speech, signalled through the speaker's eye gaze direction, on infants' word segmentation abilities. A familiarisation-then-test paradigm was used while electroencephalography (EEG) was recorded. Ten-month-old Dutch-learning infants were familiarised with audio-visual stories in which a speaker recited four sentences with one repeated target word. The speaker addressed them either with direct or with averted gaze while speaking. In the test phase following each story, infants heard familiar and novel words presented via audio-only. Infants' familiarity with the words was assessed using event-related potentials (ERPs). As predicted, infants showed a negative-going ERP familiarity effect to the isolated familiarised words relative to the novel words over the left-frontal region of interest during the test phase. While the word familiarity effect did not differ as a function of the speaker's gaze over the left-frontal region of interest, there was also a (not predicted) positive-going early ERP familiarity effect over right fronto-central and central electrodes in the direct gaze condition only. This study provides electrophysiological evidence that infants can segment words from audio-visual speech, regardless of the ostensiveness of the speaker's communication. However, the speaker's gaze direction seems to influence the processing of familiar words. RESEARCH HIGHLIGHTS: We examined 10-month-old infants' ERP word familiarity response using audio-visual stories, in which a speaker addressed infants with direct or averted gaze while speaking. Ten-month-old infants can segment and recognise familiar words from audio-visual speech, indicated by their negative-going ERP response to familiar, relative to novel, words. This negative-going ERP word familiarity effect was present for isolated words over left-frontal electrodes regardless of whether the speaker offered eye contact while speaking. An additional positivity in response to familiar words was observed for direct gaze only, over right fronto-central and central electrodes.


Subjects
Speech Perception , Speech , Infant , Humans , Speech/physiology , Fixation, Ocular , Language , Evoked Potentials/physiology , Speech Perception/physiology
8.
Cereb Cortex; 33(8): 4740-4751, 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-36178127

ABSTRACT

Human language units are hierarchical, and reading acquisition involves integrating multisensory information (typically from auditory and visual modalities) to access meaning. However, it is unclear how the brain processes and integrates language information at different linguistic units (words, phrases, and sentences) provided simultaneously in auditory and visual modalities. To address the issue, we presented participants with sequences of short Chinese sentences through auditory, visual, or combined audio-visual modalities while electroencephalographic responses were recorded. With a frequency tagging approach, we analyzed the neural representations of basic linguistic units (i.e. characters/monosyllabic words) and higher-level linguistic structures (i.e. phrases and sentences) across the 3 modalities separately. We found that audio-visual integration occurs in all linguistic units, and the brain areas involved in the integration varied across different linguistic levels. In particular, the integration of sentences activated the local left prefrontal area. Therefore, we used continuous theta-burst stimulation to verify that the left prefrontal cortex plays a vital role in the audio-visual integration of sentence information. Our findings suggest the advantage of bimodal language comprehension at hierarchical stages in language-related information processing and provide evidence for the causal role of the left prefrontal regions in processing information of audio-visual sentences.


Subjects
Brain Mapping , Comprehension , Humans , Comprehension/physiology , Brain/physiology , Linguistics , Electroencephalography
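
For readers unfamiliar with the frequency tagging approach mentioned in the abstract above, the analysis boils down to averaging trials and reading out spectral power at the presentation rate of each linguistic unit. A minimal numpy sketch, assuming hypothetical tagging rates of 1 Hz (sentences), 2 Hz (phrases), and 4 Hz (characters/monosyllabic words); the actual rates used in the study may differ:

```python
import numpy as np

def tagged_power(eeg, fs, rates=(1.0, 2.0, 4.0)):
    """eeg: array of shape (n_trials, n_samples) for one channel.
    Averaging over trials keeps only phase-locked activity; power is
    then read out at each tagged presentation rate."""
    evoked = eeg.mean(axis=0)
    power = np.abs(np.fft.rfft(evoked)) ** 2
    freqs = np.fft.rfftfreq(evoked.size, d=1.0 / fs)
    return {rate: power[np.argmin(np.abs(freqs - rate))] for rate in rates}
```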
9.
BMC Nephrol; 25(1): 262, 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39143571

ABSTRACT

BACKGROUND: Adherence to diet is effective for metabolic control in patients on hemodialysis. Educational pamphlets and booklets exist to improve patients' knowledge about healthy diets. As video presentation is more engaging than readable materials, we designed an educational video on healthy diets for renal failure patients, which was played during several hemodialysis sessions. We compared patients' knowledge, attitudes, and metabolic control before and after the intervention. METHODS: In this interventional study, all patients referred to the hemodialysis ward at Ashrafi-Esfahani Medical Center (Tehran, Iran) between May 2018 and March 2019 were enrolled (N = 190); 130 of them met the inclusion criteria. An educational video about a healthy diet was shown seven times (once a week in the first month, once every two weeks in the second month, and once in the third month) during hemodialysis. The nephrologist prepared the 20-minute video as a lecture with graphic images, based on the Kidney Federation of Iran's healthy-nutrition guide for hemodialysis patients. A questionnaire on awareness and attitudes was completed, and blood and urine tests were performed in the 1st, 3rd, and 12th months. Serum parameters were examined, including electrolytes, lipid profile, CBC-diff, dialysis efficacy (Kt/V), and the urea reduction ratio (URR). Pre- and post-intervention values were compared using IBM SPSS; a P-value < 0.05 was considered significant. RESULTS: Data from 128 patients were analyzed at the end of the study. Fifty-five percent of patients were 10-40 years old and 60% were male; 56% were illiterate or had only an elementary-school education. The most common underlying diseases were hypertension and diabetes mellitus. Only 10-19% of participants had adequate knowledge about the various components of a healthy diet for patients on hemodialysis. Approximately 25%, 14%, and 45% of the participants consumed a healthy diet for breakfast, lunch, and dinner, respectively. A comparison of the mean serum values before and after the intervention revealed significant changes in phosphorus, blood urea nitrogen, and hemoglobin, with mean differences of -118.41 ± 22.84, 21.51 ± 10.38 (both P < 0.001), and 0.29 ± 1.18 (P = 0.044), respectively. The mean Kt/V was similar at all phases. CONCLUSION: The use of an educational video was effective for normalizing metabolic parameters in patients under hemodialysis and can be an appropriate option, especially for illiterate patients. TRIAL REGISTRATION: IRCT2016082229481N1.


Subjects
Diet, Healthy , Health Knowledge, Attitudes, Practice , Patient Education as Topic , Renal Dialysis , Video Recording , Humans , Male , Female , Patient Education as Topic/methods , Middle Aged , Follow-Up Studies , Adult , Kidney Failure, Chronic/therapy , Aged , Iran
10.
Paediatr Anaesth; 34(7): 665-670, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38661287

ABSTRACT

BACKGROUND: The purpose of this study is to provide comprehensive and efficient pre-anesthesia counseling utilizing audiovisual aids and to examine its effect on parental anxiety. METHODS: For this prospective, controlled study, 174 parents were recruited and randomized into three groups of 58 (Group A: video, Group B: brochure, and Group C: verbal). During pre-anesthesia counseling, each parent was given a detailed explanation of preoperative preparation, fasting instructions, transport to the operating room, induction of and emergence from anesthesia, and nursing in the post-anesthesia care unit, based on their assigned group. We evaluated parental anxiety using Spielberger's State-Trait Anxiety Inventory (STAI) before and after the pre-anesthesia counseling. RESULTS: The results of our study show a statistically significant difference in the final mean STAI scores among the three groups (Group A: 34.69 ± 5.31, Group B: 36.34 ± 8.59, and Group C: 43.59 ± 3.39; p < .001). Compared with the brochure and verbal groups, the parents in the video group had the greatest difference between mean baseline and final STAI scores (12.207 ± 5.291, p < .001). CONCLUSION: The results of our study suggest that pre-anesthesia counseling by video or brochure before the day of surgery is associated with a greater reduction in parental anxiety than verbal communication.


Subjects
Anxiety , Communication , Counseling , Pamphlets , Parents , Preoperative Care , Humans , Anxiety/prevention & control , Anxiety/psychology , Parents/psychology , Female , Preoperative Care/methods , Male , Prospective Studies , Counseling/methods , Anesthesia/methods , Video Recording , Audiovisual Aids , Adult , Child , Child, Preschool
11.
BMC Surg; 24(1): 167, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38807080

ABSTRACT

BACKGROUND: To explore the effect of 3D-printed surgical training models in the preoperative assessment of robot-assisted partial nephrectomy. METHODS: Eighty patients who underwent robot-assisted partial nephrectomy between January 2022 and December 2023 were selected and divided into two groups in chronological order. The control group (n = 40) received a preoperative assessment with verbal and video education from January 2022 to December 2022, while the observation group (n = 40) received a preoperative assessment with 3D-printed surgical training models combined with verbal and video education from January 2023 to December 2023. Preoperative anxiety, information demand scores, and surgical awareness were compared between the two groups. Physiological stress indicators, including interleukin-6 (IL-6), angiotensin II (AT II), adrenocorticotropic hormone (ACTH), cortisol (Cor), mean arterial pressure (MAP), and heart rate (HR), were also measured at 6:00 am on the day before surgery (T0), 6:00 am on the day of surgery (T1), 6:00 am on the first postoperative day (T2), and 6:00 am on the third postoperative day (T3). The rate of adequate preparation before surgery was compared between the two groups. RESULTS: Anxiety and surgical information demand scores before anesthesia induction were significantly lower in the observation group than in the control group (P < 0.001). In both groups, scores were significantly lower before anesthesia induction than before the preoperative assessment (P < 0.05). Physiological stress indicators at T1 were significantly lower in the observation group than in the control group (P < 0.05), and the overall means of the physiological stress indicators differed significantly between the two groups (P < 0.001). Compared with T0, values at T1, T2, and T3 were significantly lower in both groups (P < 0.05). Surgical awareness and the preoperative preparation rate were significantly higher in the observation group than in the control group (P < 0.05). CONCLUSION: A preoperative assessment mode using 3D-printed surgical training models combined with verbal and video education can effectively reduce patients' psychological and physiological stress responses, improve their surgical awareness, and enhance the preparation rate before surgery.


Subjects
Nephrectomy , Printing, Three-Dimensional , Robotic Surgical Procedures , Humans , Nephrectomy/methods , Nephrectomy/education , Robotic Surgical Procedures/education , Female , Male , Middle Aged , Preoperative Care/methods , Adult , Aged , Models, Anatomic
12.
BMC Med Educ; 24(1): 560, 2024 May 23.
Article in English | MEDLINE | ID: mdl-38783278

ABSTRACT

BACKGROUND: Cardiac auscultation is an efficient and effective diagnostic tool, especially in low-income countries where access to modern diagnostic methods remains difficult. This study aimed to evaluate the effect of a digitally enhanced cardiac auscultation learning method on medical students' performance and satisfaction. METHODS: We conducted a double-arm parallel controlled trial including newly admitted 4th-year medical students enrolled in two medical schools in Yaoundé, Cameroon, allocated into two groups: the intervention group (receiving theoretical lessons, a clinical internship, and listening sessions with audio recordings of heart sounds) and the control group (receiving theoretical lessons and a clinical internship). All participants took a pretest before the training began, evaluating theoretical knowledge and recognition of cardiac sounds, and a post-test in the eighth week of clinical training, combined with a satisfaction evaluation. The endpoints were the progression of the knowledge score, the skills score, the total (knowledge and skills) score, and participant satisfaction. RESULTS: Forty-nine participants (27 in the intervention group and 22 in the control group) completed the study. Knowledge progression (+26.7 versus +7.5; p < 0.01) and total progression (+22.5 versus +14.6; p < 0.01) were significantly higher in the intervention group than in the control group. There was no significant difference between the two groups in skills progression (+25 versus +17.5; p = 0.27). Satisfaction was generally higher in the intervention group (p < 0.01), which recommended the method, than in the control group. CONCLUSION: A cardiac auscultation learning method reinforced by listening sessions with audio recordings of heart sounds improves medical students' performance (knowledge and overall knowledge-and-skills scores), and students found it satisfactory and worth recommending. TRIAL REGISTRATION: This trial was registered on 29/11/2019 in the Pan African Clinical Trials Registry ( http://www.pactr.org ) under unique identification number PACTR202001504666847, and the protocol has been published in BMC Medical Education.


Subjects
Clinical Competence , Heart Auscultation , Students, Medical , Humans , Cameroon , Male , Female , Educational Measurement/methods , Education, Medical, Undergraduate/methods , Young Adult , Computer-Assisted Instruction/methods
13.
Sensors (Basel); 24(11), 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38894478

ABSTRACT

Identifying different animal species has become an important task in biology and ecology. Ornithology has allied with other disciplines to establish methods that play an important role in bird protection and in evaluating the environmental quality of different ecosystems. Machine learning and deep learning techniques have driven substantial progress in birdsong identification. Taking an AI-IoT approach, we used image-based methods that classify birdsong spectrograms with CNNs pretrained on ImageNet weights (such as EfficientNet or MobileNet), and we also evaluated deep CNNs (DCNNs) designed to reduce model size while maintaining good birdsong classification performance. A 5G IoT-based system for raw audio gathering was developed, and different CNNs were tested for bird identification from audio recordings. The comparison shows that the ImageNet-weighted CNNs achieve relatively high performance for most species, reaching 75% accuracy. However, these networks contain a large number of parameters, leading to less energy-efficient inference. We therefore designed two DCNNs to reduce the number of parameters while keeping accuracy at an acceptable level, allowing their integration into a single-board computer (SBC) or a microcontroller unit (MCU).


Subjects
Birds , Neural Networks, Computer , Vocalization, Animal , Animals , Birds/physiology , Birds/classification , Vocalization, Animal/physiology , Machine Learning , Internet of Things , Artificial Intelligence , Deep Learning , Algorithms
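
As a rough illustration of the ImageNet-weighted transfer-learning setup the abstract above describes, here is a hedged Keras sketch that puts a small classification head on a frozen MobileNetV2 backbone fed with spectrogram images. The input size, head, and species count are assumptions; the paper's exact architecture may differ:

```python
import tensorflow as tf

def birdsong_classifier(n_species, input_shape=(224, 224, 3)):
    # MobileNetV2 backbone with ImageNet weights; spectrograms are
    # rendered as RGB images so the pretrained filters can be reused.
    base = tf.keras.applications.MobileNetV2(
        weights="imagenet", include_top=False, input_shape=input_shape)
    base.trainable = False  # train only the classification head
    return tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(n_species, activation="softmax"),
    ])

model = birdsong_classifier(n_species=10)  # 10 species is a placeholder
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```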
14.
Sensors (Basel); 24(9), 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38732857

ABSTRACT

This study presents a pioneering approach that leverages advanced sensing technologies and data processing techniques to enhance the process of clinical documentation generation during medical consultations. By employing sophisticated sensors to capture and interpret various cues such as speech patterns, intonations, or pauses, the system aims to accurately perceive and understand patient-doctor interactions in real time. This sensing capability allows for the automation of transcription and summarization tasks, facilitating the creation of concise and informative clinical documents. Through the integration of automatic speech recognition sensors, spoken dialogue is seamlessly converted into text, enabling efficient data capture. Additionally, deep models such as Transformer models are utilized to extract and analyze crucial information from the dialogue, ensuring that the generated summaries encapsulate the essence of the consultations accurately. Despite encountering challenges during development, experimentation with these sensing technologies has yielded promising results. The system achieved a maximum ROUGE-1 metric score of 0.57, demonstrating its effectiveness in summarizing complex medical discussions. This sensor-based approach aims to alleviate the administrative burden on healthcare professionals by automating documentation tasks and safeguarding important patient information. Ultimately, by enhancing the efficiency and reliability of clinical documentation, this innovative method contributes to improving overall healthcare outcomes.


Subjects
Deep Learning , Humans , Speech Recognition Software
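
The ROUGE-1 score cited in the abstract above is simply an F1 measure over unigram overlap between a generated summary and a reference. A self-contained sketch of that computation (whitespace tokenization is an assumption; published ROUGE implementations apply additional normalization such as stemming):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    # Count unigram overlap, clipped per token type, then compute F1.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```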
15.
Sensors (Basel); 24(6), 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38544136

ABSTRACT

The ubiquity of smartphones today enables the widespread utilization of voice recording for diverse purposes. Consequently, the submission of voice recordings as digital evidence in legal proceedings has notably increased, alongside a rise in allegations of recording file forgery. This trend highlights the growing significance of audio file authentication. This study aims to develop a deep learning methodology capable of identifying forged files, particularly those altered using "Mixed Paste" commands, a technique not previously addressed. The proposed deep learning framework is a composite model, integrating a convolutional neural network and a long short-term memory model. It is designed based on the extraction of features from spectrograms and sequences of Korean consonant types. The training of this model utilizes an authentic dataset of forged audio recordings created on an iPhone, modified via "Mixed Paste", and encoded. This hybrid model demonstrates a high accuracy rate of 97.5%. To validate the model's efficacy, tests were conducted using various manipulated audio files. The findings reveal that the model's effectiveness is not contingent on the smartphone model or the audio editing software employed. We anticipate that this research will advance the field of audio forensics through a novel hybrid model approach.
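
The abstract describes a composite convolutional/recurrent model over spectrogram features without giving its exact layout. Here is a minimal Keras sketch of one plausible CNN + LSTM arrangement for binary forged-vs-genuine classification; the layer sizes and the 1-D convolution over spectrogram frames are assumptions, not the authors' published architecture:

```python
import tensorflow as tf

def forgery_detector(n_frames, n_mels):
    # Convolutions summarize local spectro-temporal patterns; the LSTM
    # then models longer-range structure such as splice boundaries.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_frames, n_mels)),
        tf.keras.layers.Conv1D(32, 5, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling1D(2),
        tf.keras.layers.Conv1D(64, 5, padding="same", activation="relu"),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(forged)
    ])

model = forgery_detector(n_frames=256, n_mels=64)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```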

16.
Sensors (Basel); 24(4), 2024 Feb 10.
Article in English | MEDLINE | ID: mdl-38400330

ABSTRACT

Respiratory diseases represent a significant global burden, necessitating efficient diagnostic methods for timely intervention. Digital biomarkers based on audio, acoustics, and sound from the upper and lower respiratory system, as well as the voice, have emerged as valuable indicators of respiratory functionality. Recent advancements in machine learning (ML) algorithms offer promising avenues for the identification and diagnosis of respiratory diseases through the analysis and processing of such audio-based biomarkers. An ever-increasing number of studies employ ML techniques to extract meaningful information from audio biomarkers. Beyond disease identification, these studies explore diverse aspects such as the recognition of cough sounds amidst environmental noise, the analysis of respiratory sounds to detect respiratory symptoms like wheezes and crackles, as well as the analysis of the voice/speech for the evaluation of human voice abnormalities. To provide a more in-depth analysis, this review examines 75 relevant audio analysis studies across three distinct areas of concern based on respiratory diseases' symptoms: (a) cough detection, (b) lower respiratory symptoms identification, and (c) diagnostics from the voice and speech. Furthermore, publicly available datasets commonly utilized in this domain are presented. It is observed that research trends are influenced by the pandemic, with a surge in studies on COVID-19 diagnosis, mobile data acquisition, and remote diagnosis systems.


Subjects
Artificial Intelligence , COVID-19 , Humans , COVID-19/diagnosis , Cough/diagnosis , Cough/physiopathology , Respiratory Sounds/diagnosis , Respiratory Sounds/physiopathology , Machine Learning , Respiratory Tract Diseases/diagnosis , SARS-CoV-2/isolation & purification , Algorithms , Voice/physiology
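
Many of the systems surveyed in the review above share a common front end: summarizing a recording with cepstral features before classification. A small sketch of that step using librosa; the sample rate, coefficient count, and mean/std pooling are illustrative choices, not prescriptions from the review:

```python
import librosa
import numpy as np

def mfcc_summary(path, sr=16000, n_mfcc=13):
    # Load the recording, compute MFCCs, and pool over time so that
    # clips of any length map to a fixed-size feature vector.
    y, sr = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
```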
17.
Sensors (Basel); 24(7), 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38610401

ABSTRACT

In recent years, headphones have become increasingly popular worldwide. Numerous models on the market today vary in their technical characteristics and offer different listening experiences. This article presents an application for simulating the sound response of a specific headphone model while physically wearing another. In the future, for example, this application could help guide people who already own a pair of headphones when deciding which new model to purchase; the potential fields of application, however, are much broader. An in-depth study of digital signal processing was carried out, and a computational model was implemented. Prior to this, impulse response measurements of specific headphones were analyzed, allowing a better understanding of each headphone's behavior. Finally, the entire system was evaluated through a listening test. The analysis of the results showed that the software works reasonably well in replicating the target headphones. We hope that this work will stimulate further efforts in the same direction.
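
Although the paper does not publish its algorithm, the standard DSP recipe for this kind of simulation is to divide out the worn headphone's measured impulse response and impose the target model's, via regularized division in the frequency domain. A sketch under that assumption:

```python
import numpy as np

def simulate_headphone(audio, ir_worn, ir_target, eps=1e-8):
    """Make `audio` heard through the worn headphones sound like the
    target model: Y = X * H_target * conj(H_worn) / (|H_worn|^2 + eps)."""
    n = len(audio) + max(len(ir_worn), len(ir_target)) - 1
    X = np.fft.rfft(audio, n)
    H_worn = np.fft.rfft(ir_worn, n)
    H_target = np.fft.rfft(ir_target, n)
    # Regularized inverse filtering avoids blowing up near spectral nulls.
    Y = X * H_target * np.conj(H_worn) / (np.abs(H_worn) ** 2 + eps)
    return np.fft.irfft(Y, n)[: len(audio)]
```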

18.
Sensors (Basel); 24(3), 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38339617

ABSTRACT

Across five studies, we present the preliminary technical validation of an infant-wearable platform, LittleBeats™, that integrates electrocardiogram (ECG), inertial measurement unit (IMU), and audio sensors. Each sensor modality is validated against data from gold-standard equipment using established algorithms and laboratory tasks. Interbeat interval (IBI) data obtained from the LittleBeats™ ECG sensor indicate acceptable mean absolute percent error rates for both adults (Study 1, N = 16) and infants (Study 2, N = 5) across low- and high-challenge sessions and expected patterns of change in respiratory sinus arrhythmia (RSA). For automated activity recognition (upright vs. walk vs. glide vs. squat) using accelerometer data from the LittleBeats™ IMU (Study 3, N = 12 adults), performance was good to excellent, with smartphone (industry standard) data outperforming LittleBeats™ by less than 4 percentage points. Speech emotion recognition (Study 4, N = 8 adults) applied to LittleBeats™ versus smartphone audio data indicated comparable performance, with no significant difference in error rates. On an automatic speech recognition task (Study 5, N = 12 adults), the best performing algorithm yielded relatively low word error rates, although LittleBeats™ (4.16%) versus smartphone (2.73%) error rates were somewhat higher. Together, these validation studies indicate that LittleBeats™ sensors yield data quality that is largely comparable to that obtained from gold-standard devices and established protocols used in prior research.


Subjects
Posture , Walking , Adult , Humans , Motion , Walking/physiology , Posture/physiology , Standing Position , Algorithms , Biomechanical Phenomena
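
The interbeat-interval validation above rests on a simple statistic: mean absolute percent error against the gold-standard ECG. A sketch of that computation (it assumes the two beat series have already been matched, which in practice is the hard part):

```python
import numpy as np

def ibi_mape(ibi_device, ibi_gold):
    # Mean absolute percent error between matched interbeat intervals
    # (e.g., wearable vs. a gold-standard ECG), in percent.
    ibi_device = np.asarray(ibi_device, dtype=float)
    ibi_gold = np.asarray(ibi_gold, dtype=float)
    return 100.0 * np.mean(np.abs(ibi_device - ibi_gold) / ibi_gold)
```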
19.
Sensors (Basel); 24(5), 2024 Feb 23.
Article in English | MEDLINE | ID: mdl-38474986

ABSTRACT

This paper presents a low-power, high-gain integrator design that uses a cascode operational transconductance amplifier (OTA) with floating inverter-amplifier (FIA) assistance. Compared to a traditional cascode, the proposed integrator achieves a gain of 80 dB while reducing power consumption by 30%. Analysis of the circuit yielded the value of the FIA drive capacitor and the clock scheme for the FIA-assisted OTA. To enhance the dynamic range (DR) and mitigate quantization noise, a tri-level quantizer was employed. The design of the feedback digital-to-analog converter (DAC) was simplified, as it does not use additional mismatch-shaping techniques. A third-order, discrete-time delta-sigma modulator was designed and fabricated in a 0.18 µm complementary metal-oxide semiconductor (CMOS) process. It operated on a 1.8 V supply, consuming 221 µW with a 24 kHz bandwidth. The measured SNDR and DR were 90.9 dB and 95.3 dB, respectively.
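
To see what a delta-sigma loop with a tri-level quantizer does, a behavioral simulation is often more instructive than the transistor-level description. The following first-order sketch only illustrates the noise-shaping principle; the paper's modulator is third-order with an FIA-assisted OTA, so this is a simplified stand-in, not the published design:

```python
import numpy as np

def tri_level_quantize(v):
    # Three output levels (-1, 0, +1) halve the quantization step
    # relative to a 1-bit comparator, improving dynamic range.
    return 1.0 if v > 0.5 else (-1.0 if v < -0.5 else 0.0)

def first_order_dsm(x):
    """Behavioral first-order delta-sigma modulator with a tri-level
    quantizer; the integrator accumulates the input-minus-feedback error,
    pushing quantization noise toward high frequencies."""
    integ, y_prev = 0.0, 0.0
    out = np.empty_like(x)
    for n, sample in enumerate(x):
        integ += sample - y_prev
        y_prev = tri_level_quantize(integ)
        out[n] = y_prev
    return out

# Example: a small in-band sine; the output spectrum shows shaped noise.
fs, f0 = 3.072e6, 10e3
t = np.arange(1 << 14) / fs
bits = first_order_dsm(0.4 * np.sin(2 * np.pi * f0 * t))
```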

20.
Sensors (Basel); 24(16), 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39204867

ABSTRACT

To address the difficulty of separating audio signals collected in pig-farming environments, this study proposes an underdetermined blind source separation (UBSS) method based on sparsity theory. Audio signals obtained by mixing recordings of pigs in different states with different coefficients are taken as the observation signals. The mixing matrix is first estimated from the observation signals using an improved AP clustering method within the "two-step" framework of sparse component analysis (SCA), and the pig audio signals are then reconstructed by L1-norm separation. Five different types of pig audio were selected for experiments exploring the effects of duration and mixing matrix on the blind source separation algorithm, by controlling the audio duration and the mixing matrix, respectively. With three source signals and two observed signals, the reconstructed signals score well on all metrics across different durations and mixing matrices: the similarity coefficient is above 0.8, the average recovered signal-to-noise ratio is above 8 dB, and the normalized mean square error is below 0.02. The experimental results show that audio duration and mixing matrix both affect the UBSS algorithm, so the recording duration and the spatial location of the recording device need to be considered in practical applications. The proposed algorithm outperforms the classical blind source separation algorithm in estimating the mixing matrix and separating the mixed audio, improving reconstruction quality.
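
The separation stage described above, recovering more sources than sensors once the mixing matrix is known, reduces to a per-sample L1 minimization that can be posed as a linear program. A sketch with SciPy, assuming a two-sensor, three-source setup as in the experiments; the clustering-based estimation of the mixing matrix A is outside this snippet, and the matrix values here are made up for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def l1_recover(A, x):
    """Solve min ||s||_1 subject to A s = x for one observation vector x,
    with A the (m x n, m < n) estimated mixing matrix. Standard LP trick:
    introduce bounds u with |s_i| <= u_i and minimize sum(u)."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])      # minimize sum(u)
    A_eq = np.hstack([A, np.zeros((m, n))])            # A s = x
    A_ub = np.block([[np.eye(n), -np.eye(n)],          #  s - u <= 0
                     [-np.eye(n), -np.eye(n)]])        # -s - u <= 0
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n),
                  A_eq=A_eq, b_eq=x,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]

# Example: 2 microphones, 3 sources, one time-frequency sample.
A = np.array([[0.9, 0.5, 0.1],
              [0.4, 0.8, 0.95]])
x = A @ np.array([1.0, 0.0, 0.2])   # synthetic sparse source vector
print(l1_recover(A, x))
```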
