Dual-scale shifted window attention network for medical image segmentation.
Han, De-Wei; Yin, Xiao-Lei; Xu, Jian; Li, Kang; Li, Jun-Jie; Wang, Lu; Ma, Zhao-Yuan.
Affiliation
  • Han DW; School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, 1088 Xueyuan Boulevard, Nanshan District, Shenzhen, 518055, China.
  • Yin XL; The Future Laboratory, Tsinghua University, 160 Chengfu Road, Haidian District, 100084, Beijing, China.
  • Xu J; School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, 1088 Xueyuan Boulevard, Nanshan District, Shenzhen, 518055, China.
  • Li K; School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, 1088 Xueyuan Boulevard, Nanshan District, Shenzhen, 518055, China.
  • Li JJ; School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, 1088 Xueyuan Boulevard, Nanshan District, Shenzhen, 518055, China.
  • Wang L; The Future Laboratory, Tsinghua University, 160 Chengfu Road, Haidian District, 100084, Beijing, China.
  • Ma ZY; School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, 1088 Xueyuan Boulevard, Nanshan District, Shenzhen, 518055, China. mazy_sustech@126.com.
Sci Rep ; 14(1): 17719, 2024 Jul 31.
Article in En | MEDLINE | ID: mdl-39085430
ABSTRACT
Swin Transformer is an important work among the attempts to reduce the computational complexity of Transformers while maintaining their excellent performance in computer vision. Window-based patch self-attention exploits the local connectivity of image features, while shifted-window patch self-attention enables communication of information between different patches across the entire image. Through in-depth research on how shifted windows of different sizes affect the efficiency of patch information communication, this article proposes a Dual-Scale Transformer with a double-sized shifted-window attention method. The proposed method surpasses CNN-based methods such as U-Net, AttenU-Net, ResU-Net, and CE-Net by a considerable margin (approximately a 3-6% increase), and outperforms the Transformer-based single-scale Swin Transformer (SwinT) (approximately a 1% increase) on the Kvasir-SEG, ISIC2017, MICCAI EndoVisSub-Instrument, and CadVesSet datasets. The experimental results verify that the proposed dual-scale shifted-window attention benefits the communication of patch information and can raise segmentation results to the state of the art. We also present an ablation study on the effect of shifted-window size on information-flow efficiency, verifying that dual-scale shifted-window attention is the optimal network design. Our study highlights the significant impact of network structure design on visual performance, providing valuable insights for the design of networks based on Transformer architectures.
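The core mechanism the abstract describes can be illustrated with a minimal sketch: a feature grid is cyclically shifted and then partitioned into non-overlapping windows, and in the dual-scale variant the same map feeds two branches with different window (and hence shift) sizes. This is a simplified, hypothetical illustration of Swin-style window partitioning in plain Python, not the authors' implementation; the window sizes (2 and 4) are chosen for demonstration only.

```python
def window_partition(x, ws):
    # Split an H x W grid (list of lists) into non-overlapping ws x ws
    # windows, row-major, so attention can run within each window.
    H, W = len(x), len(x[0])
    windows = []
    for wi in range(0, H, ws):
        for wj in range(0, W, ws):
            windows.append([row[wj:wj + ws] for row in x[wi:wi + ws]])
    return windows

def cyclic_shift(x, s):
    # Roll the grid up and left by s, the shifted-window step that lets
    # neighboring windows exchange information between layers.
    shifted_rows = x[s:] + x[:s]
    return [row[s:] + row[:s] for row in shifted_rows]

def shifted_windows(x, ws):
    # Swin uses shift = ws // 2, so window borders move by half a window.
    return window_partition(cyclic_shift(x, ws // 2), ws)

# Dual-scale idea: the same 8x8 feature grid viewed through two
# shifted-window branches with different window sizes.
feat = [[r * 8 + c for c in range(8)] for r in range(8)]
fine = shifted_windows(feat, ws=2)    # 16 windows of 2x2, fine-grained branch
coarse = shifted_windows(feat, ws=4)  # 4 windows of 4x4, coarse branch
print(len(fine), len(coarse))  # 16 4
```

Because the two branches use different shift distances, window borders fall at different image positions in each branch, which is the mechanism the paper credits for more efficient patch-to-patch information flow.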
Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Sci Rep Year: 2024 Type: Article Affiliation country: China
