Joint superpixel and Transformer for high resolution remote sensing image classification.

Dang, Guangpu; Mao, Zhongan; Zhang, Tingyu; Liu, Tao; Wang, Tao; Li, Liangzhi; Gao, Yu; Tian, Runqing; Wang, Kun; Han, Ling

Affiliation

Dang G; Shaanxi Provincial Land Engineering Construction Group Land Survey Planning and Design Institute, Xi'an, Shaanxi, China.
Mao Z; Shaanxi Provincial Land Engineering Construction Group, Xi'an, Shaanxi, China.
Zhang T; Key Laboratory of Degraded and Unused Land Consolidation Engineering, the Ministry of Natural Resources, Xi'an, Shaanxi, China. 2016027003@chd.edu.cn.
Liu T; Institute of Land Engineering and Technology, Shaanxi Provincial Land Engineering Construction Group, Xi'an, Shaanxi, China. 2016027003@chd.edu.cn.
Wang T; Land Reserve Center of High tech Development Zone, Xi'an, Shaanxi, China.
Li L; Shaanxi Provincial Land Engineering Construction Group Land Survey Planning and Design Institute, Xi'an, Shaanxi, China.
Gao Y; Chang'an University, Xi'an, Shaanxi, China.
Tian R; Shaanxi Provincial Land Engineering Construction Group Land Survey Planning and Design Institute, Xi'an, Shaanxi, China.
Wang K; Shaanxi Provincial Land Engineering Construction Group Land Survey Planning and Design Institute, Xi'an, Shaanxi, China.
Han L; Shaanxi Provincial Land Engineering Construction Group, Xi'an, Shaanxi, China.

Sci Rep ; 14(1): 5054, 2024 Mar 01.

Article in En | MEDLINE | ID: mdl-38424135

ABSTRACT

Deep neural networks combined with superpixel segmentation have proven superior for high-resolution remote sensing image (HRI) classification. Currently, most HRI classification methods that combine deep learning and superpixel segmentation stack inputs at multiple scales to extract contextual information from segmented objects. However, this approach does not take into account the contextual dependencies between segmented objects. To solve this problem, a joint superpixel and Transformer (JST) framework is proposed for HRI classification. In JST, the HRI is first segmented into superpixel objects, which serve as the input, and a Transformer is used to model the long-range dependencies among them. An encoder-decoder Transformer is designed to capture the contextual relationship between the input superpixel objects and to output the class of each analyzed object. Additionally, we explore the effect of semantic range on classification accuracy. JST is tested on two HRI datasets, achieving overall classification accuracy, average accuracy and Kappa coefficient of 0.79, 0.70 and 0.78 on the first and 0.91, 0.85 and 0.89 on the second. The proposed method is evaluated qualitatively and quantitatively, and its results are competitive with, and consistently better than, those of the benchmark comparison method.
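
The three scores reported in the abstract (overall accuracy, average accuracy and Kappa coefficient) are normally derived from a confusion matrix. The sketch below shows one common way to compute them. It is an illustration, not the authors' evaluation code: the function name `classification_metrics`, the toy matrix and the definition of average accuracy as the mean of per-class recalls are assumptions, since the abstract does not define the metrics.

```python
def classification_metrics(confusion):
    """Return (overall accuracy, average accuracy, Cohen's kappa).

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j (assumed layout, rows = reference).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(row[j] for row in confusion) for j in range(n_classes)]
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall accuracy: fraction of all samples classified correctly.
    overall = correct / total

    # Average accuracy (assumed definition): mean per-class recall,
    # ignoring classes that have no reference samples.
    recalls = [confusion[i][i] / row_sums[i]
               for i in range(n_classes) if row_sums[i] > 0]
    average = sum(recalls) / len(recalls)

    # Cohen's kappa: agreement corrected for the agreement expected by chance.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = (overall - expected) / (1 - expected) if expected < 1 else 0.0

    return overall, average, kappa


# Hypothetical two-class confusion matrix, for illustration only.
toy = [[50, 10],
       [5, 35]]
oa, aa, kappa = classification_metrics(toy)
print(f"OA={oa:.3f}  AA={aa:.3f}  Kappa={kappa:.3f}")
```

Because Kappa discounts chance agreement, it is usually lower than overall accuracy on the same matrix, which matches the pattern of the values quoted in the abstract.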

Key words

Deep learning; Image classification; Remote sensing image; Superpixel; Transformer

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Sci Rep Year: 2024 Document type: Article Affiliation country: China
