MDC-RHT: Multi-Modal Medical Image Fusion via Multi-Dimensional Dynamic Convolution and Residual Hybrid Transformer.
Wang, Wenqing; He, Ji; Liu, Han; Yuan, Wei.
Affiliation
  • Wang W; School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China.
  • He J; Shaanxi Key Laboratory of Complex System Control and Intelligent Information Processing, Xi'an University of Technology, Xi'an 710048, China.
  • Liu H; School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China.
  • Yuan W; School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China.
Sensors (Basel); 24(13), 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-39000834
ABSTRACT
The fusion of multi-modal medical images is of great significance for comprehensive diagnosis and treatment. However, the large differences between the various medical imaging modalities make multi-modal medical image fusion highly challenging. This paper proposes a novel multi-scale fusion network based on multi-dimensional dynamic convolution and a residual hybrid transformer, which has stronger feature extraction and context modeling capabilities and improves fusion performance. Specifically, the proposed network exploits multi-dimensional dynamic convolution, which introduces four attention mechanisms corresponding to four different dimensions of the convolutional kernel to extract more detailed information. Meanwhile, a residual hybrid transformer is designed that activates more pixels to participate in the fusion process through channel attention, window attention, and overlapping cross-attention, thereby strengthening the long-range dependencies between different modalities and enhancing the connection of global context information. A loss function combining perceptual loss and structural similarity loss is designed, where the former enhances the visual realism and perceptual detail of the fused image, and the latter enables the model to learn structural textures. The whole network adopts a multi-scale architecture and uses an unsupervised, end-to-end method to realize multi-modal image fusion. Finally, our method is evaluated qualitatively and quantitatively on mainstream datasets. The fusion results indicate that our method achieves high scores on most quantitative metrics and satisfactory performance in the visual qualitative analysis.
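The abstract's description of a dynamic convolution with four attention mechanisms over four kernel dimensions matches the ODConv family of designs, in which input-conditioned attention branches modulate a bank of kernels along the spatial, input-channel, output-channel, and kernel-number dimensions. The PyTorch sketch below illustrates that general idea only; it is not the paper's implementation, and the class name, reduction ratio, and kernel-bank size are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiDimDynamicConv(nn.Module):
    """Hypothetical ODConv-style multi-dimensional dynamic convolution:
    four attention branches, conditioned on the input, modulate a bank of
    kernels along its spatial, input-channel, output-channel, and
    kernel-number dimensions before the kernels are summed and applied."""
    def __init__(self, in_ch, out_ch, k=3, num_kernels=4, reduction=4):
        super().__init__()
        self.in_ch, self.out_ch, self.k, self.num_kernels = in_ch, out_ch, k, num_kernels
        # Bank of candidate kernels: (num_kernels, out_ch, in_ch, k, k).
        self.weight = nn.Parameter(torch.randn(num_kernels, out_ch, in_ch, k, k) * 0.02)
        hidden = max(in_ch // reduction, 4)
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(nn.Conv2d(in_ch, hidden, 1), nn.ReLU(inplace=True))
        # One attention head per kernel dimension.
        self.att_spatial = nn.Conv2d(hidden, k * k, 1)
        self.att_in = nn.Conv2d(hidden, in_ch, 1)
        self.att_out = nn.Conv2d(hidden, out_ch, 1)
        self.att_kernel = nn.Conv2d(hidden, num_kernels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        ctx = self.fc(self.gap(x))  # global context: (b, hidden, 1, 1)
        a_sp = torch.sigmoid(self.att_spatial(ctx)).view(b, 1, 1, 1, self.k, self.k)
        a_in = torch.sigmoid(self.att_in(ctx)).view(b, 1, 1, self.in_ch, 1, 1)
        a_out = torch.sigmoid(self.att_out(ctx)).view(b, 1, self.out_ch, 1, 1, 1)
        a_k = torch.softmax(self.att_kernel(ctx).view(b, self.num_kernels), dim=1) \
                   .view(b, self.num_kernels, 1, 1, 1, 1)
        # Modulate the kernel bank along all four dimensions, then sum the bank.
        w_dyn = (a_k * a_out * a_in * a_sp * self.weight.unsqueeze(0)).sum(dim=1)
        # Grouped-convolution trick: fold the batch into groups so each
        # sample is convolved with its own dynamically generated kernel.
        out = F.conv2d(x.reshape(1, b * c, h, w),
                       w_dyn.reshape(b * self.out_ch, self.in_ch, self.k, self.k),
                       padding=self.k // 2, groups=b)
        return out.view(b, self.out_ch, h, w)
```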
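Likewise, the combined loss (a perceptual term plus a structural similarity term, here applied against both source modalities) can be sketched as follows. This is a minimal illustration under assumed choices: VGG-16 features for the perceptual term, a uniform-window SSIM, an L1 feature distance, and equal weights. None of these specifics are stated in the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

def ssim_loss(x, y, c1=0.01 ** 2, c2=0.03 ** 2, win=11):
    """1 - SSIM with a uniform averaging window; inputs assumed in [0, 1]."""
    pad = win // 2
    mu_x, mu_y = F.avg_pool2d(x, win, 1, pad), F.avg_pool2d(y, win, 1, pad)
    var_x = F.avg_pool2d(x * x, win, 1, pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, 1, pad) - mu_y ** 2
    cov = F.avg_pool2d(x * y, win, 1, pad) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return 1 - ssim.mean()

class PerceptualLoss(nn.Module):
    """L1 distance between frozen VGG-16 feature maps of two images."""
    def __init__(self, layer=8):
        super().__init__()
        self.features = vgg16(weights="IMAGENET1K_V1").features[:layer].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)

    def forward(self, fused, src):
        # Replicate single-channel medical images to 3 channels for VGG.
        f3, s3 = fused.repeat(1, 3, 1, 1), src.repeat(1, 3, 1, 1)
        return F.l1_loss(self.features(f3), self.features(s3))

def fusion_loss(fused, src_a, src_b, perc, alpha=1.0, beta=1.0):
    """Total loss against both source modalities (weights are assumptions)."""
    l_perc = perc(fused, src_a) + perc(fused, src_b)
    l_ssim = ssim_loss(fused, src_a) + ssim_loss(fused, src_b)
    return alpha * l_perc + beta * l_ssim
```

In an unsupervised fusion setting like the one described, there is no ground-truth fused image, so both terms are measured against the two source modalities directly; the relative weights alpha and beta would be tuned empirically.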
Full text: 1 | Collections: 01-international | Database: MEDLINE | Language: English | Journal: Sensors (Basel) | Year of publication: 2024 | Document type: Article | Country of affiliation: China | Country of publication: Switzerland