Cross-and-Diagonal Networks: An Indirect Self-Attention Mechanism for Image Classification.

Lyu, Jiahang; Zou, Rongxin; Wan, Qin; Xi, Wang; Yang, Qinglin; Kodagoda, Sarath; Wang, Shifeng

Affiliation

Lyu J; School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China.
Zou R; School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China.
Wan Q; School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China.
Xi W; School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China.
Yang Q; School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China.
Kodagoda S; Faculty of Engineering & Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia.
Wang S; School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China.

Sensors (Basel); 24(7), 2024 Mar 23.

Article in En | MEDLINE | ID: mdl-38610267

ABSTRACT

In recent years, computer vision has seen remarkable advances in image classification, particularly through fully convolutional neural networks (FCNs) and self-attention mechanisms. Both approaches nevertheless have limitations: FCNs tend to prioritize local information and can overlook crucial global context, whereas self-attention mechanisms are adaptable but computationally intensive. To address these challenges, this paper proposes cross-and-diagonal networks (CDNet), a network architecture that captures global information in images while preserving local details at a lower computational cost. CDNet achieves this by establishing long-range relationships between pixels within an image, so that contextual information is acquired indirectly. This indirect self-attention mechanism significantly enhances the network's capacity. CDNet introduces a new attention mechanism named "cross and diagonal attention", which takes an indirect approach by integrating two distinct components, cross attention and diagonal attention. By computing attention in different directions, specifically vertical and diagonal, CDNet establishes remote dependencies among pixels, which improves performance in image classification tasks. Experimental results highlight several advantages of CDNet. First, its indirect self-attention mechanism can be integrated as a module into any convolutional neural network (CNN). Second, the computational cost of the self-attention mechanism is reduced, which improves overall computational efficiency. Finally, CDNet attains state-of-the-art performance on three benchmark datasets among image classification networks of a similar type. In essence, CDNet addresses the constraints of conventional approaches and provides an efficient and effective solution for capturing global context in image classification tasks.
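The directional attention pattern described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes that "cross" means pixels sharing a row or column and that "diagonal" means pixels sharing a main or anti-diagonal, it uses the raw features as queries, keys, and values with no learned projections, and it combines the two branches by a residual sum, which the abstract does not specify. The names `direction_masks` and `masked_attention` are hypothetical.

```python
import numpy as np

def direction_masks(h, w):
    """Boolean (h*w, h*w) masks: which pixels each pixel may attend to."""
    rows, cols = np.divmod(np.arange(h * w), w)  # flat index -> (row, col)
    same_row = rows[:, None] == rows[None, :]
    same_col = cols[:, None] == cols[None, :]
    main_diag = (rows - cols)[:, None] == (rows - cols)[None, :]
    anti_diag = (rows + cols)[:, None] == (rows + cols)[None, :]
    # Assumption: cross = row or column, diagonal = either diagonal.
    return same_row | same_col, main_diag | anti_diag

def masked_attention(x, mask):
    """Scaled dot-product attention over flattened pixels, limited to mask."""
    n, c = x.shape
    scores = x @ x.T / np.sqrt(c)
    scores = np.where(mask, scores, -np.inf)      # block disallowed pairs
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x

h, w, c = 6, 8, 4
rng = np.random.default_rng(0)
x = rng.standard_normal((h * w, c))               # toy feature map, flattened

cross_mask, diag_mask = direction_masks(h, w)
# Assumption: branches are merged with a residual sum.
out = x + masked_attention(x, cross_mask) + masked_attention(x, diag_mask)

# Two hops along the cross pattern connect every pair of pixels.
two_hop = (cross_mask.astype(int) @ cross_mask.astype(int)) > 0
```

Each pixel attends to only `h + w - 1` positions in the cross branch instead of all `h * w`, yet two successive hops reach every pixel, which is one way to read "indirect" acquisition of global context. The dense masks here only illustrate the sparsity pattern; an efficient version would gather the row, column, and diagonal entries directly rather than build full `(h*w, h*w)` matrices.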

Key words

CNN; computer vision; image classification; self-attention mechanism

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Language: En | Journal: Sensors (Basel) | Year: 2024 | Document type: Article
