Keywords: self attention, attention mechanism, cross-correlation
TL;DR: A novel parameter-free self-attention with linear complexity is proposed to enhance convolution.
Abstract: Convolution and self-attention, with their complementary characteristics, are two powerful techniques in vision tasks. The ability of self-attention to capture long-range dependencies compensates for convolution's limited grasp of global feature information. However, the quadratic computational complexity of self-attention impedes the direct combination of the two. This paper proposes global spatial correlation attention (GSCA), a parameter-free approximation of self-attention with linear computational complexity. The aim is to adjust the attention distribution over the global space by exploiting the statistical relationships within the input feature maps themselves. We compress the key matrix into a vector, evaluate the pairwise affinity of each pixel with the key vector via the cross-correlation coefficient, and apply the resulting attention weights to the inputs with a Hadamard product. A multi-head attention form is further built to enhance the module's ability to capture feature subspaces. Based on these lightweight operations, the proposed method simply and effectively improves the ability of convolution to aggregate global information. We extensively evaluate the GSCA module on image classification, object detection, and instance segmentation tasks. Parameter-free GSCA is lighter than state-of-the-art methods while achieving very competitive performance. When combined with channel attention modules, it further outperforms the original methods. The experiments also demonstrate the generalizability and robustness of GSCA. The source code is available at GSCA.
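The abstract outlines the core operations (key compression, cross-correlation affinity, Hadamard-product weighting, multi-head splitting). Below is a minimal PyTorch sketch of one possible reading of those steps, not the authors' implementation: the use of global average pooling for the key compression, the per-head Pearson-style correlation, the epsilon value, and the default head count are all assumptions made for illustration.

```python
import torch

def gsca(x: torch.Tensor, num_heads: int = 4) -> torch.Tensor:
    """Hypothetical sketch of the GSCA operations described in the abstract.

    x: feature map of shape (B, C, H, W); assumes C is divisible by num_heads.
    The module is parameter-free and linear in the number of pixels.
    """
    b, c, h, w = x.shape
    # Split channels into heads: (B, heads, C//heads, H*W)
    xh = x.view(b, num_heads, c // num_heads, h * w)

    # "Compress the key matrix into a vector": here via global average pooling
    # over spatial positions (an assumption about the compression step).
    key = xh.mean(dim=-1, keepdim=True)                      # (B, heads, C//heads, 1)

    # Cross-correlation coefficient between each pixel's feature vector and
    # the key vector, computed along the per-head channel dimension.
    xc = xh - xh.mean(dim=2, keepdim=True)
    kc = key - key.mean(dim=2, keepdim=True)
    num = (xc * kc).sum(dim=2)                               # (B, heads, H*W)
    den = xc.norm(dim=2) * kc.norm(dim=2) + 1e-6
    attn = (num / den).view(b, num_heads, 1, h * w)          # per-pixel affinity

    # Apply the attention weights with a Hadamard product (no learnable
    # parameters anywhere) and restore the original layout.
    return (xh * attn).view(b, c, h, w)

# Example usage
feat = torch.randn(2, 64, 32, 32)
print(gsca(feat).shape)  # torch.Size([2, 64, 32, 32])
```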
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
Supplementary Material: zip