Results 1 - 4 of 4

1.
IEEE Trans Image Process; 32: 964-979, 2023.
Article in English | MEDLINE | ID: mdl-37022006

ABSTRACT

Human-Object Interaction (HOI) detection recognizes how persons interact with objects, which is advantageous in autonomous systems such as self-driving vehicles and collaborative robots. However, current HOI detectors are often plagued by model inefficiency and unreliability when making a prediction, which limits their potential in real-world scenarios. In this paper, we address these challenges by proposing ERNet, an end-to-end trainable convolutional-transformer network for HOI detection. The proposed model employs efficient multi-scale deformable attention to capture vital HOI features. We also put forward a novel detection attention module to adaptively generate semantically rich instance and interaction tokens. These tokens undergo pre-emptive detections to produce initial region and vector proposals that also serve as queries, which enhances the feature refinement process in the transformer decoders. Several impactful enhancements are also applied to improve the HOI representation learning. Additionally, we utilize a predictive uncertainty estimation framework in the instance and interaction classification heads to quantify the uncertainty behind each prediction. This allows the model to predict HOIs accurately and reliably even under challenging scenarios. Experimental results on the HICO-Det, V-COCO, and HOI-A datasets demonstrate that the proposed model achieves state-of-the-art performance in detection accuracy and training efficiency. Code is publicly available at https://github.com/Monash-CyPhi-AI-Research-Lab/ernet.
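The abstract does not spell out which uncertainty framework the classification heads use; the sketch below is a minimal illustration of one common option, Monte Carlo dropout over a hypothetical interaction head. The class name, dimensions, and the MC-dropout choice are assumptions for illustration, not the paper's published design (117 is the HICO-Det verb count).

# Minimal sketch of predictive uncertainty via Monte Carlo dropout.
# Illustrative stand-in only; names and dimensions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InteractionHead(nn.Module):
    """Hypothetical interaction classification head with dropout."""
    def __init__(self, dim=256, num_classes=117):  # 117 = HICO-Det verbs
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)
        self.drop = nn.Dropout(p=0.1)
        self.fc2 = nn.Linear(dim, num_classes)

    def forward(self, tokens):
        return self.fc2(self.drop(F.relu(self.fc1(tokens))))

@torch.no_grad()
def predict_with_uncertainty(head, tokens, n_samples=20):
    head.train()  # keep dropout active at inference (MC dropout)
    probs = torch.stack(
        [head(tokens).softmax(dim=-1) for _ in range(n_samples)])
    mean = probs.mean(dim=0)  # averaged class probabilities
    entropy = -(mean * mean.clamp_min(1e-9).log()).sum(-1)  # predictive entropy
    return mean, entropy  # high entropy flags an unreliable prediction

# usage: mean, unc = predict_with_uncertainty(InteractionHead(), torch.randn(8, 256))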


Subject(s)
Attention; Humans; Uncertainty
2.
Soft Robot; 10(6): 1224-1240, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37590485

ABSTRACT

Data-driven methods with deep neural networks demonstrate promising results for accurate modeling of soft robots. However, deep neural network models rely on voluminous data to discover the complex and nonlinear representations inherent in soft robots. Consequently, substantial effort is required for data acquisition, labeling, and annotation, which is not always feasible. This article introduces a data-driven learning framework based on synthetic data to circumvent the exhaustive data collection process. More specifically, we propose a novel time series generative adversarial network with a self-attention mechanism, Transformer TimeGAN (TTGAN), to precisely learn the complex dynamics of a soft robot. The TTGAN is further combined with a conditioning network that enables it to produce synthetic data for specific soft robot behaviors. The proposed framework is verified on a widely used pneumatic-based soft gripper as an exemplary experimental setup. Experimental results demonstrate that the TTGAN generates synthetic time series data with realistic soft robot dynamics. Critically, a combination of the synthetic data and the only partially available original data produces a data-driven model with estimation accuracy comparable to models obtained from the complete original data.
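The TTGAN architecture itself is not reproduced here; the fragment below is a minimal sketch of the core idea the abstract describes — a self-attention generator that maps noise plus a behavior-condition embedding to a synthetic multichannel time series. All layer sizes, names, and the conditioning scheme are assumptions; the published model (and its adversarial training loop) is more elaborate.

# Minimal sketch of a conditional, self-attention time-series generator
# in the spirit of TTGAN. Sizes and conditioning are illustrative assumptions.
import torch
import torch.nn as nn

class ConditionalTSGenerator(nn.Module):
    def __init__(self, noise_dim=32, cond_classes=4, feat_dim=3,
                 d_model=64, seq_len=100):
        super().__init__()
        self.cond_emb = nn.Embedding(cond_classes, d_model)  # behavior label
        self.in_proj = nn.Linear(noise_dim, d_model)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.out_proj = nn.Linear(d_model, feat_dim)  # e.g. pressure, bend, ...

    def forward(self, noise, cond):
        # noise: (batch, seq_len, noise_dim); cond: (batch,) behavior ids
        h = self.in_proj(noise) + self.cond_emb(cond).unsqueeze(1)
        return self.out_proj(self.encoder(h))  # (batch, seq_len, feat_dim)

# usage: fake = ConditionalTSGenerator()(torch.randn(8, 100, 32),
#                                         torch.randint(0, 4, (8,)))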

3.
Soft Robot; 9(3): 591-612, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34171965

ABSTRACT

Sensory data are critical for soft robot perception. However, integrating sensors into soft robots remains challenging due to their inherent softness. An alternative approach is indirect sensing through an estimation scheme, which uses robot dynamics and available measurements to estimate variables that would otherwise have been measured by sensors. Nevertheless, developing an adequately effective estimation scheme for soft robots is not straightforward. First, it requires a mathematical model, and modeling soft robots is analytically demanding due to their complex dynamics. Second, it should perform multimodal sensing of both internal and external variables with minimal sensors, and finally, it must be robust against sensor faults. In this article, we propose a recurrent neural network-based adaptive unscented Kalman filter (RNN-AUKF) architecture to estimate the proprioceptive state and exteroceptive unknown input of a pneumatic-based soft finger. To address the challenge of modeling soft robots, we adopt a data-driven approach using RNNs. We then interconnect the AUKF with an unknown input estimator to perform multimodal sensing using a single embedded flex sensor. We also prove mathematically that the estimation error is bounded with respect to sensor degradation (noise and drift). Experimental results show that the RNN-AUKF achieves better overall accuracy and robustness than the benchmark method. The proposed scheme is also extended to a multifinger soft gripper and is robust against out-of-distribution sensor dynamics. The outcomes of this research have immense potential for realizing robust multimodal indirect sensing in soft robots.
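As a rough illustration of the estimation loop (not the paper's RNN-AUKF itself), the sketch below runs an unscented Kalman filter with the filterpy library. The state-transition function is a toy damped-oscillator stand-in where the paper would plug in learned RNN dynamics, and the innovation-based update of R is a simplified, assumed nod to the "adaptive" part; the unknown input estimator is omitted.

# Illustrative UKF loop using filterpy; dynamics and adaptation are toy stand-ins.
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

def fx(x, dt):
    # stand-in dynamics (damped oscillator); the paper would use an RNN here
    return np.array([x[0] + dt * x[1], x[1] - dt * (0.5 * x[0] + 0.1 * x[1])])

def hx(x):
    # the single embedded flex sensor observes only the first state component
    return x[:1]

points = MerweScaledSigmaPoints(n=2, alpha=1e-3, beta=2.0, kappa=0.0)
ukf = UnscentedKalmanFilter(dim_x=2, dim_z=1, dt=0.01,
                            fx=fx, hx=hx, points=points)
ukf.x = np.zeros(2)
ukf.P *= 0.1   # initial state covariance
ukf.Q *= 1e-4  # process noise
ukf.R *= 0.05  # measurement noise (adapted below)

for z in np.sin(np.linspace(0, 2 * np.pi, 200))[:, None]:  # fake sensor trace
    ukf.predict()
    ukf.update(z)
    # crude innovation-based adaptation of measurement noise (illustrative only)
    ukf.R = 0.99 * ukf.R + 0.01 * np.outer(ukf.y, ukf.y)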


Subject(s)
Robotics; Models, Theoretical; Neural Networks, Computer; Proprioception; Robotics/methods
4.
J Imaging; 6(12), 2020 Nov 30.
Article in English | MEDLINE | ID: mdl-34460527

ABSTRACT

Several studies on micro-expression recognition have contributed mainly to improving accuracy. Computational complexity, however, has received comparatively less attention, which raises the cost of micro-expression recognition in real-time applications. In addition, the majority of existing approaches require at least two frames (i.e., the onset and apex frames) to compute features for every sample. This paper puts forward new facial graph features based on 68-point landmarks using the Facial Action Coding System (FACS). The proposed feature extraction technique (FACS-based graph features) uses facial landmark points to build a graph for each Action Unit (AU), where the measured distance and gradient of every segment within an AU graph are taken as features. Moreover, the proposed technique performs micro-expression (ME) recognition from a single input frame. Results indicate that the proposed FACS-based graph features achieve up to 87.33% recognition accuracy with an F1-score of 0.87 using leave-one-subject-out cross-validation on the SAMM dataset. In addition, the proposed technique computes features in 2 ms per sample on a Xeon E5-2650 processor.
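The abstract does not enumerate the AU graphs, so the sketch below only shows the general recipe: each edge between two of the 68 landmarks contributes a per-segment length and a gradient. The edge lists are hypothetical examples, and the gradient is encoded here as an orientation angle (arctan2) to avoid division by zero; the paper may use the raw slope dy/dx.

# Minimal sketch of FACS-based graph features from one frame of 68 landmarks.
# AU edge lists below are assumed for illustration; the paper defines its own.
import numpy as np

AU_GRAPHS = {
    "AU4_brow_lowerer": [(17, 21), (21, 22), (22, 26)],  # assumed edges
    "AU12_lip_corner":  [(48, 51), (51, 54), (48, 54)],
}

def facs_graph_features(landmarks):
    """landmarks: (68, 2) array of (x, y) points from a single frame."""
    feats = []
    for edges in AU_GRAPHS.values():
        for i, j in edges:
            dx, dy = landmarks[j] - landmarks[i]
            feats.append(np.hypot(dx, dy))    # segment length
            feats.append(np.arctan2(dy, dx))  # segment orientation as "gradient"
    return np.asarray(feats)

# usage: vec = facs_graph_features(np.random.rand(68, 2))  # one frame -> one vector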
