Reliable object tracking by multimodal hybrid feature extraction and transformer-based fusion.
Sun, Hongze; Liu, Rui; Cai, Wuque; Wang, Jun; Wang, Yue; Tang, Huajin; Cui, Yan; Yao, Dezhong; Guo, Daqing.
Affiliation
  • Sun H; Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for NeuroInformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Liu R; Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for NeuroInformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Cai W; Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for NeuroInformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Wang J; Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for NeuroInformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Wang Y; Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for NeuroInformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Tang H; College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China.
  • Cui Y; Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for NeuroInformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China; Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu 611731
  • Yao D; Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for NeuroInformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China; Research Unit of NeuroInformation (2019RU035), Chinese Academy of Medical Sciences, Chengdu
  • Guo D; Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for NeuroInformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China. Electronic address: dqguo@uestc.edu.cn.
Neural Netw ; 178: 106493, 2024 Oct.
Article in En | MEDLINE | ID: mdl-38970946
ABSTRACT
Visual object tracking, which is primarily based on visible light image sequences, encounters numerous challenges in complicated scenarios, such as low light conditions, high dynamic ranges, and background clutter. To address these challenges, incorporating the advantages of multiple visual modalities is a promising solution for achieving reliable object tracking. However, the existing approaches usually integrate multimodal inputs through adaptive local feature interactions, which cannot leverage the full potential of visual cues, thus resulting in insufficient feature modeling. In this study, we propose a novel multimodal hybrid tracker (MMHT) that utilizes frame-event-based data for reliable single object tracking. The MMHT model employs a hybrid backbone consisting of an artificial neural network (ANN) and a spiking neural network (SNN) to extract dominant features from different visual modalities and then uses a unified encoder to align the features across different domains. Moreover, we propose an enhanced transformer-based module to fuse multimodal features using attention mechanisms. With these methods, the MMHT model can effectively construct a multiscale and multidimensional visual feature space and achieve discriminative feature modeling. Extensive experiments demonstrate that the MMHT model exhibits competitive performance in comparison with that of other state-of-the-art methods. Overall, our results highlight the effectiveness of the MMHT model in terms of addressing the challenges faced in visual object tracking tasks.
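The abstract does not give the details of the fusion module, so the following is only a generic illustration of attention-based fusion of two modalities, not the MMHT implementation. It is a minimal sketch that assumes each modality has already been encoded into a list of equal-width token vectors. Frame tokens act as queries, event tokens act as keys and values, and the attended result is added back to the frame tokens. The names `cross_attention`, `fuse`, `frame_tokens`, and `event_tokens` are hypothetical and chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query attends over all keys."""
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query to each key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def fuse(frame_tokens, event_tokens):
    """Assumed fusion rule: frame tokens plus what they attend to in the event tokens."""
    attended = cross_attention(frame_tokens, event_tokens, event_tokens)
    return [[f + a for f, a in zip(ft, at)] for ft, at in zip(frame_tokens, attended)]


# Toy inputs (made up for illustration): 2 frame tokens, 3 event tokens, width 2.
frame_tokens = [[1.0, 0.0], [0.0, 1.0]]
event_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = fuse(frame_tokens, event_tokens)
```

The output keeps one fused vector per frame token. A full transformer block would also add learned projections, multiple heads, and normalization, all of which are omitted here.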

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Neural Networks, Computer Limits: Humans Language: En Journal: Neural Netw Journal subject: NEUROLOGIA Year: 2024 Document type: Article Affiliation country:
