Spatial decomposition and aggregation for attention in convolutional neural networks

Published in International Journal of Pattern Recognition and Artificial Intelligence, 2024

Abstract

Channel attention has been shown to improve the performance of deep convolutional neural networks eﬃciently. Channel attention adaptively recalibrates the importance of each channel, determining what to attend to. However, channel attention only encodes inter-channel information but neglects the importance of positional information. Posi-tional information is crucial in determining where to attend to. To address this issue, we propose a novel channel-spatial attention method named Spatial-Decomposition-Aggregation Attention (SDAA) method. First, a high-axis spatial direction is decom-posed into multiple low-axis spatial directions. Then, a shared transformation sub-unit establishes attention in each low-axis space direction. Next, all the low-axis attention masks are aggregated into a high-axis attention mask. Finally, the generated high-axis attention mask is fused into the input features, thus enhancing the input features. Essen-tially, our method is a divide-and-conquer process. Experimental results demonstrate that our SDAA method outperforms the existing channel-spatial attention methods.

Recommended citation: Meng Zhu, Weidong Min*, Hongyue Xiang, et al. Spatial decomposition and aggregation for attention in convolutional neural networks. International Journal of Pattern Recognition and Artificial Intelligence, 2024, 38 (1): 1-15. DOI: 10.1142/S0218001423520195.
Download Paper | Download Slides

Share on

微博

微信

Bluesky Facebook LinkedIn X (formerly Twitter)