TransUNet: Rethinking the U-Net architecture design for medical image segmentation through the lens of transformers.

Chen, Jieneng; Mei, Jieru; Li, Xianhang; Lu, Yongyi; Yu, Qihang; Wei, Qingyue; Luo, Xiangde; Xie, Yutong; Adeli, Ehsan; Wang, Yan; Lungren, Matthew P; Zhang, Shaoting; Xing, Lei; Lu, Le; Yuille, Alan; Zhou, Yuyin

Affiliation

Chen J; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
Mei J; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
Li X; Department of Computer Science and Engineering, University of California, Santa Cruz, CA 95064, USA.
Lu Y; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
Yu Q; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
Wei Q; Department of Radiation Oncology, Stanford University, Stanford, CA 94305, USA.
Luo X; Shanghai AI Lab, Xuhui District, Shanghai, 200000, China.
Xie Y; The Australian Institute for Machine Learning, University of Adelaide, Australia.
Adeli E; The School of Medicine, Stanford University, Stanford, CA 94305, USA.
Wang Y; The East China Normal University, Shanghai 200062, China.
Lungren MP; The School of Medicine, Stanford University, Stanford, CA 94305, USA.
Zhang S; Shanghai AI Lab, Xuhui District, Shanghai, 200000, China.
Xing L; Department of Radiation Oncology, Stanford University, Stanford, CA 94305, USA.
Lu L; DAMO Academy, Alibaba Group, New York, NY 10014, USA.
Yuille A; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
Zhou Y; Department of Computer Science and Engineering, University of California, Santa Cruz, CA 95064, USA. Electronic address: yzhou284@ucsc.edu.

Med Image Anal ; 97: 103280, 2024 Oct.

Article in English | MEDLINE | ID: mdl-39096845

ABSTRACT

Medical image segmentation is crucial for healthcare, yet convolution-based methods like U-Net face limitations in modeling long-range dependencies. To address this, Transformers designed for sequence-to-sequence prediction have been integrated into medical image segmentation. However, a comprehensive understanding of how Transformers' self-attention behaves within the individual U-Net components is lacking. TransUNet, first introduced in 2021, is widely recognized as one of the first models to integrate Transformers into medical image analysis. In this study, we present the versatile framework of TransUNet, which encapsulates Transformers' self-attention into two key modules: (1) a Transformer encoder that tokenizes image patches from a convolutional neural network (CNN) feature map, facilitating global context extraction, and (2) a Transformer decoder that refines candidate regions through cross-attention between proposals and U-Net features. These modules can be flexibly inserted into the U-Net backbone, resulting in three configurations: Encoder-only, Decoder-only, and Encoder+Decoder. TransUNet provides a library encompassing both 2D and 3D implementations, enabling users to easily tailor the chosen architecture. Our findings highlight the encoder's efficacy in modeling interactions among multiple abdominal organs and the decoder's strength in handling small targets like tumors. TransUNet excels in diverse medical applications, such as multi-organ segmentation, pancreatic tumor segmentation, and hepatic vessel segmentation. Notably, TransUNet achieves significant average Dice improvements of 1.06% and 4.30% for multi-organ segmentation and pancreatic tumor segmentation, respectively, compared with the highly competitive nnU-Net, and surpasses the top-1 solution in the BraTS2021 challenge. 2D and 3D code and models are available at https://github.com/Beckschen/TransUNet and https://github.com/Beckschen/TransUNet-3D, respectively.
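The results above are reported as average Dice scores, a metric the abstract does not define. As a minimal sketch of the standard definition, Dice = 2|A∩B| / (|A| + |B|) for a predicted mask A and a reference mask B, the following pure-Python example computes the score per class and averages it. The function names `dice_score` and `mean_dice` are hypothetical, the handling of empty masks is an assumed convention, and the paper's actual evaluation protocol (for example, how scores are averaged over cases) may differ.

```python
def dice_score(pred, target):
    """Dice coefficient between two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pred_labels, target_labels, classes):
    """Average the per-class Dice over the given class ids (illustrative only)."""
    scores = []
    for c in classes:
        # Turn the label maps into one binary mask per class.
        pred_mask = [int(v == c) for v in pred_labels]
        target_mask = [int(v == c) for v in target_labels]
        scores.append(dice_score(pred_mask, target_mask))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy label maps with background 0 and two foreground classes.
    prediction = [0, 1, 1, 2]
    reference = [0, 1, 2, 2]
    print(mean_dice(prediction, reference, classes=[1, 2]))
```

Under this definition, an average Dice improvement of about one percentage point means that the overlap score, averaged over the evaluated structures, rises by roughly 0.01 on a 0 to 1 scale.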

Subject(s)

Image Processing, Computer-Assisted; Neural Networks, Computer; Humans; Image Processing, Computer-Assisted/methods; Algorithms

Keywords

Medical image segmentation; U-Net; Vision Transformers

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Image Processing, Computer-Assisted / Neural Networks, Computer | Limit: Humans | Language: English | Journal: Med Image Anal | Journal subject: Diagnostic Imaging | Year: 2024 | Document type: Article | Country of affiliation: United States