PLG-ViT: Vision Transformer with Parallel Local and Global Self-Attention.
Ebert, Nikolas; Stricker, Didier; Wasenmüller, Oliver.
Affiliation
  • Ebert N; Research and Transfer Center CeMOS, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
  • Stricker D; Department of Computer Science, RPTU Kaiserslautern-Landau, 67663 Kaiserslautern, Germany.
  • Wasenmüller O; Department of Computer Science, RPTU Kaiserslautern-Landau, 67663 Kaiserslautern, Germany.
Sensors (Basel); 23(7), 2023 Mar 25.
Article in English | MEDLINE | ID: mdl-37050507
ABSTRACT
Recently, transformer architectures have shown superior performance compared to their CNN counterparts in many computer vision tasks. The self-attention mechanism enables transformer networks to connect visual dependencies over short as well as long distances, thus generating a large, sometimes even a global receptive field. In this paper, we propose our Parallel Local-Global Vision Transformer (PLG-ViT), a general backbone model that fuses local window self-attention with global self-attention. By merging these local and global features, short- and long-range spatial interactions can be effectively and efficiently represented without the need for costly computational operations such as shifted windows. In a comprehensive evaluation, we demonstrate that our PLG-ViT outperforms CNN-based as well as state-of-the-art transformer-based architectures in image classification and in complex downstream tasks such as object detection, instance segmentation, and semantic segmentation. In particular, our PLG-ViT models outperformed similarly sized networks like ConvNeXt and Swin Transformer, achieving Top-1 accuracy values of 83.4%, 84.0%, and 84.5% on ImageNet-1K with 27M, 52M, and 91M parameters, respectively.
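The idea of running local window self-attention and global self-attention in parallel and fusing the results can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes, purely for illustration, that the channels are split in half between the two branches, that the Q/K/V projections are identity (no learned weights), that the global branch attends over an average-pooled copy of the feature map and is upsampled by nearest-neighbor repetition, and that fusion is a plain channel concatenation. All function names and the `window` and `pool` parameters are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    """Single-head self-attention over tokens of shape (..., n, d).
    Assumption: identity Q/K/V projections, no learned weights."""
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)
    return softmax(scores) @ tokens

def local_window_attention(x, window):
    """Attention restricted to non-overlapping window x window patches."""
    H, W, C = x.shape
    w = x.reshape(H // window, window, W // window, window, C)
    w = w.transpose(0, 2, 1, 3, 4).reshape(-1, window * window, C)
    out = self_attention(w)  # each window attends only to itself
    out = out.reshape(H // window, W // window, window, window, C)
    return out.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def global_attention(x, pool):
    """Attention over the whole (downsampled) feature map.
    Assumption: average pooling down, nearest-neighbor repetition up."""
    H, W, C = x.shape
    p = x.reshape(H // pool, pool, W // pool, pool, C).mean(axis=(1, 3))
    h, w = p.shape[:2]
    out = self_attention(p.reshape(h * w, C)).reshape(h, w, C)
    return out.repeat(pool, axis=0).repeat(pool, axis=1)

def parallel_local_global(x, window=4, pool=2):
    """Run both branches in parallel on disjoint channel halves, then fuse
    by concatenation (an illustrative choice, not taken from the paper)."""
    C = x.shape[-1]
    local = local_window_attention(x[..., : C // 2], window)
    glob = global_attention(x[..., C // 2 :], pool)
    return np.concatenate([local, glob], axis=-1)

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 8, 16))  # (height, width, channels)
fused = parallel_local_global(feature_map)
print(fused.shape)  # (8, 8, 16)
```

In this sketch, a change at one pixel affects the local-branch output only inside that pixel's window, while the global-branch output changes across the whole map. No shifted-window step is needed to pass information between windows, which mirrors the motivation stated in the abstract.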
Full text: 1 Collection: 01-international Database: MEDLINE Language: English Journal: Sensors (Basel) Year: 2023 Document type: Article Country of affiliation: Germany