Pesquisa | Portal Regional da BVS

1.

Video-based sympathetic arousal assessment via peripheral blood flow estimation.

Braun, Björn; McDuff, Daniel; Baltrusaitis, Tadas; Holz, Christian.

Biomed Opt Express ; 14(12): 6607-6628, 2023 Dec 01.

Artigo em Inglês | MEDLINE | ID: mdl-38420320

RESUMO

Electrodermal activity (EDA) is considered a standard marker of sympathetic activity. However, traditional EDA measurement requires electrodes in steady contact with the skin. Can sympathetic arousal be measured using only an optical sensor, such as an RGB camera? This paper presents a novel approach to infer sympathetic arousal by measuring the peripheral blood flow on the face or hand optically. We contribute a self-recorded dataset of 21 participants, comprising synchronized videos of participants' faces and palms and gold-standard EDA and photoplethysmography (PPG) signals. Our results show that we can measure peripheral sympathetic responses that closely correlate with the ground truth EDA. We obtain median correlations of 0.57 to 0.63 between our inferred signals and the ground truth EDA using only videos of the participants' palms or foreheads or PPG signals from the foreheads or fingers. We also show that sympathetic arousal is best inferred from the forehead, finger, or palm.

2.

Using High-Fidelity Avatars to Advance Camera-Based Cardiac Pulse Measurement.

McDuff, Daniel; Hernandez, Javier; Liu, Xin; Wood, Erroll; Baltrusaitis, Tadas.

IEEE Trans Biomed Eng ; 69(8): 2646-2656, 2022 08.

Artigo em Inglês | MEDLINE | ID: mdl-35171764

RESUMO

Non-contact physiological measurement has the potential to provide low-cost, non-invasive health monitoring. However, machine vision approaches are often limited by the availability and diversity of annotated video datasets resulting in poor generalization to complex real-life conditions. To address these challenges, this work proposes the use of synthetic avatars that display facial blood flow changes and allow for systematic generation of samples under a wide variety of conditions. Our results show that training on both simulated and real video data can lead to performance gains under challenging conditions. We show strong performance on three large benchmark datasets and improved robustness to skin type and motion. These results highlight the promise of synthetic data for training camera-based pulse measurement; however, further research and validation is needed to establish whether synthetic data alone could be sufficient for training models.

Assuntos

Frequência Cardíaca , Frequência Cardíaca/fisiologia , Movimento (Física)

3.

Synthetic Data for Multi-Parameter Camera-Based Physiological Sensing.

McDuff, Daniel; Liu, Xin; Hernandez, Javier; Wood, Erroll; Baltrusaitis, Tadas.

Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 3742-3748, 2021 11.

Artigo em Inglês | MEDLINE | ID: mdl-34892050

RESUMO

Synthetic data is a powerful tool in training data hungry deep learning algorithms. However, to date, camera-based physiological sensing has not taken full advantage of these techniques. In this work, we leverage a high-fidelity synthetics pipeline for generating videos of faces with faithful blood flow and breathing patterns. We present systematic experiments showing how physiologically-grounded synthetic data can be used in training camera-based multi-parameter cardiopulmonary sensing. We provide empirical evidence that heart and breathing rate measurement accuracy increases with the number of synthetic avatars in the training set. Furthermore, training with avatars with darker skin types leads to better overall performance than training with avatars with lighter skin types. Finally, we discuss the opportunities that synthetics present in the domain of camera-based physiological sensing and limitations that need to be overcome.

Assuntos

Algoritmos , Aprendizado Profundo , Circulação Sanguínea , Face , Respiração

4.

An Artificial Intelligence Approach to the Assessment of Abnormal Lid Position.

Thomas, Peter B M; Gunasekera, Chrishan D; Kang, Swan; Baltrusaitis, Tadas.

Plast Reconstr Surg Glob Open ; 8(10): e3089, 2020 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-33173665

RESUMO

New artificial intelligence (AI) approaches to facial analysis show promise in the clinical evaluation of abnormal lid position. This could allow more naturalistic, quantitative, and automated assessment of lid position. The aim of this article was to determine whether OpenFace, an AI approach to real-time facial landmarking and analysis, can extract clinically useful measurements from images of patients before and after ptosis correction. Manual and AI-automated approaches to vertical palpebral aperture measurement of 128 eyes in pre- and postoperative full-face images of ptosis patients were compared in this study. Agreement in interpupillary distance to vertical palpebral aperture ratio between clinicians and an AI-based system was assessed. Image quality varied highly with interpupillary distance defined by a mean of 143.4 pixels (min = 60, max = 328, SD = 80.3 pixels). A Bland-Altman analysis suggests a good agreement between manual and AI analysis of vertical palpebral aperture (94.4% of measurements falling within 2 SDs of the mean). Correlation between the 2 methods yielded a Pearson's r(126) = 0.87 (P < 0.01) and r2 = 0.76. This feasibility study suggests that existing, open-source approaches to facial analysis can be applied to the clinical assessment of patients with abnormal lid position. The approach could be extended to further quantify clinical assessment of oculoplastic conditions.

5.

Attended End-to-end Architecture for Age Estimation from Facial Expression Videos.

Pei, Wenjie; Dibeklioglu, Hamdi; Baltrusaitis, Tadas; Tax, David M J.

IEEE Trans Image Process ; 2019 Oct 24.

Artigo em Inglês | MEDLINE | ID: mdl-31670671

RESUMO

The main challenges of age estimation from facial expression videos lie not only in the modeling of the static facial appearance, but also in the capturing of the temporal facial dynamics. Traditional techniques to this problem focus on constructing handcrafted features to explore the discriminative information contained in facial appearance and dynamics separately. This relies on sophisticated feature-refinement and framework-design. In this paper, we present an end-toend architecture for age estimation, called Spatially-Indexed Attention Model (SIAM), which is able to simultaneously learn both the appearance and dynamics of age from raw videos of facial expressions. Specifically, we employ convolutional neural networks to extract effective latent appearance representations and feed them into recurrent networks to model the temporal dynamics. More importantly, we propose to leverage attention models for salience detection in both the spatial domain for each single image and the temporal domain for the whole video as well. We design a specific spatially-indexed attention mechanism among the convolutional layers to extract the salient facial regions in each individual image, and a temporal attention layer to assign attention weights to each frame. This two-pronged approach not only improves the performance by allowing the model to focus on informative frames and facial areas, but it also offers an interpretable correspondence between the spatial facial regions as well as temporal frames, and the task of age estimation. We demonstrate the strong performance of our model in experiments on a large, gender-balanced database with 400 subjects with ages spanning from 8 to 76 years. Experiments reveal that our model exhibits significant superiority over the state-of-the-art methods given sufficient training data.

6.

Multimodal Machine Learning: A Survey and Taxonomy.

Baltrusaitis, Tadas; Ahuja, Chaitanya; Morency, Louis-Philippe.

IEEE Trans Pattern Anal Mach Intell ; 41(2): 423-443, 2019 02.

Artigo em Inglês | MEDLINE | ID: mdl-29994351

RESUMO

Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell odors, and taste flavors. Modality refers to the way in which something happens or is experienced and a research problem is characterized as multimodal when it includes multiple such modalities. In order for Artificial Intelligence to make progress in understanding the world around us, it needs to be able to interpret such multimodal signals together. Multimodal machine learning aims to build models that can process and relate information from multiple modalities. It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Instead of focusing on specific multimodal applications, this paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy. We go beyond the typical early and late fusion categorization and identify broader challenges that are faced by multimodal machine learning, namely: representation, translation, alignment, fusion, and co-learning. This new taxonomy will enable researchers to better understand the state of the field and identify directions for future research.

7.

The Cambridge Face Tracker: Accurate, Low Cost Measurement of Head Posture Using Computer Vision and Face Recognition Software.

Thomas, Peter B M; Baltrusaitis, Tadas; Robinson, Peter; Vivian, Anthony J.

Transl Vis Sci Technol ; 5(5): 8, 2016 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-27730008

RESUMO

PURPOSE: We validate a video-based method of head posture measurement. METHODS: The Cambridge Face Tracker uses neural networks (constrained local neural fields) to recognize facial features in video. The relative position of these facial features is used to calculate head posture. First, we assess the accuracy of this approach against videos in three research databases where each frame is tagged with a precisely measured head posture. Second, we compare our method to a commercially available mechanical device, the Cervical Range of Motion device: four subjects each adopted 43 distinct head postures that were measured using both methods. RESULTS: The Cambridge Face Tracker achieved confident facial recognition in 92% of the approximately 38,000 frames of video from the three databases. The respective mean error in absolute head posture was 3.34°, 3.86°, and 2.81°, with a median error of 1.97°, 2.16°, and 1.96°. The accuracy decreased with more extreme head posture. Comparing The Cambridge Face Tracker to the Cervical Range of Motion Device gave correlation coefficients of 0.99 (P < 0.0001), 0.96 (P < 0.0001), and 0.99 (P < 0.0001) for yaw, pitch, and roll, respectively. CONCLUSIONS: The Cambridge Face Tracker performs well under real-world conditions and within the range of normally-encountered head posture. It allows useful quantification of head posture in real time or from precaptured video. Its performance is similar to that of a clinically validated mechanical device. It has significant advantages over other approaches in that subjects do not need to wear any apparatus, and it requires only low cost, easy-to-setup consumer electronics. TRANSLATIONAL RELEVANCE: Noncontact assessment of head posture allows more complete clinical assessment of patients, and could benefit surgical planning in future.

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA