ST-YOLOA: a Swin-transformer-based YOLO model with an attention mechanism for SAR ship detection under complex background.
Zhao, Kai; Lu, Ruitao; Wang, Siyu; Yang, Xiaogang; Li, Qingge; Fan, Jiwei.
Affiliation
  • Zhao K; Department of Automation, Rocket Force University of Engineering, Xi'an, China.
  • Lu R; Department of Automation, Rocket Force University of Engineering, Xi'an, China.
  • Wang S; Department of Automation, Rocket Force University of Engineering, Xi'an, China.
  • Yang X; Department of Automation, Rocket Force University of Engineering, Xi'an, China.
  • Li Q; Department of Automation, Rocket Force University of Engineering, Xi'an, China.
  • Fan J; Department of Automation, Rocket Force University of Engineering, Xi'an, China.
Front Neurorobot ; 17: 1170163, 2023.
Article in En | MEDLINE | ID: mdl-37334169
ABSTRACT
Synthetic aperture radar (SAR) images are crucial for ship detection in computer vision. Due to background clutter, pose variations, and scale changes, it is challenging to construct a SAR ship detection model with low false-alarm rates and high accuracy. Therefore, this paper proposes a novel SAR ship detection model called ST-YOLOA. First, the Swin Transformer network architecture and a coordinate attention (CA) module are embedded in the STCNet backbone network to enhance feature extraction performance and capture global information. Second, we use the PANet path aggregation network with a residual structure to construct the feature pyramid and increase global feature extraction capability. Next, to cope with local interference and semantic information loss, a novel up/down-sampling method is proposed. Finally, a decoupled detection head is used to predict the target position and bounding box, improving convergence speed and detection accuracy. To demonstrate the efficiency of the proposed method, we constructed three SAR ship detection datasets: a norm test set (NTS), a complex test set (CTS), and a merged test set (MTS). The experimental results show that ST-YOLOA achieved accuracies of 97.37%, 75.69%, and 88.50% on the three datasets, respectively, outperforming other state-of-the-art methods. ST-YOLOA performs favorably in complex scenarios, with accuracy 4.83% higher than YOLOX on the CTS. Moreover, ST-YOLOA achieves real-time detection at 21.4 FPS.
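To illustrate the coordinate-attention idea the abstract says is embedded in the STCNet backbone, here is a minimal NumPy sketch, not the authors' implementation: CA pools the feature map along each spatial axis separately, so the resulting gates keep positional information in the other direction, then reweights the input. The learned 1x1 convolutions and nonlinearity of the full CA module are omitted for brevity, and all function names here are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(x):
    """Simplified coordinate attention on a (C, H, W) feature map.

    Direction-aware pooling produces one descriptor per row and one per
    column; sigmoid gates built from them reweight the input so the
    attention stays sensitive to position along both axes. A full CA
    block would pass the pooled descriptors through shared 1x1
    convolutions before gating (omitted here).
    """
    c, h, w = x.shape
    pool_h = x.mean(axis=2)             # average over W -> (C, H)
    pool_w = x.mean(axis=1)             # average over H -> (C, W)
    a_h = sigmoid(pool_h)[:, :, None]   # row gate, shape (C, H, 1)
    a_w = sigmoid(pool_w)[:, None, :]   # column gate, shape (C, 1, W)
    return x * a_h * a_w                # broadcast reweighting

feat = np.random.rand(4, 8, 8)          # toy feature map: 4 channels, 8x8
out = coordinate_attention(feat)
print(out.shape)
```

Because each gate lies in (0, 1), the output is an attenuated copy of the input whose per-position scaling differs along rows and columns, which is the property CA uses to localize elongated targets such as ships.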
Full text: 1 Collection: 01-internacional Database: MEDLINE Type of study: Diagnostic_studies / Prognostic_studies Language: En Journal: Front Neurorobot Year: 2023 Document type: Article Affiliation country: