On the Integration of Self-Attention and Convolution

Pan, Xuran; Ge, Chunjiang; Lü, Rui; Song, Shiji; Chen, Guan-Fu; Huang, Zeyi; Huang, Gao

doi:10.1109/cvpr52688.2022.00089

Article · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Jun 1, 2022 · Closed access

Tsinghua University · Huawei Technologies (China) · +1 more institution

Indexed in Crossref

Abstract

Convolution and self-attention are two powerful techniques for representation learning, and they are usually considered as two peer approaches that are distinct from each other. In this paper, we show that there exists a strong underlying relation between them, in the sense that the bulk of computations of these two paradigms are in fact done with the same operation. Specifically, we first show that a traditional convolution with kernel size k × k can be decomposed into k² individual 1 × 1 convolutions, followed by shift and summation operations. Then, we interpret the projections of queries, keys, and values in the self-attention module as multiple 1 × 1 convolutions, followed by the computation of attention…
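The decomposition claimed in the abstract can be checked numerically. The block below is a minimal NumPy sketch, not the authors' implementation: the function names (conv2d_direct, shift, conv2d_decomposed) are hypothetical, and it assumes stride 1, an odd kernel size, and zero padding so the output keeps the input's spatial size. It compares a direct k × k convolution against the sum of k² shifted 1 × 1 convolutions.

```python
import numpy as np

def conv2d_direct(x, w):
    """Direct k x k convolution (cross-correlation), zero padded, stride 1.
    x: (C_in, H, W); w: (C_out, C_in, k, k) with k odd (assumption)."""
    c_out, _, k, _ = w.shape
    _, h, wd = x.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros((c_out, h, wd))
    for i in range(h):
        for j in range(wd):
            patch = xp[:, i:i + k, j:j + k]  # (C_in, k, k) window
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def shift(f, dy, dx):
    """Return g with g[:, i, j] = f[:, i + dy, j + dx], zero outside the map.
    Assumes |dy| < H and |dx| < W."""
    _, h, wd = f.shape
    g = np.zeros_like(f)
    i0, i1 = max(0, -dy), min(h, h - dy)
    j0, j1 = max(0, -dx), min(wd, wd - dx)
    g[:, i0:i1, j0:j1] = f[:, i0 + dy:i1 + dy, j0 + dx:j1 + dx]
    return g

def conv2d_decomposed(x, w):
    """Same result built from k*k separate 1 x 1 convolutions,
    each followed by a spatial shift, then summed."""
    c_out, _, k, _ = w.shape
    p = k // 2
    out = np.zeros((c_out,) + x.shape[1:])
    for a in range(k):
        for b in range(k):
            # 1 x 1 convolution: a per-pixel linear map using kernel slice (a, b)
            f = np.einsum("oc,chw->ohw", w[:, :, a, b], x)
            out += shift(f, a - p, b - p)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 6, 7))     # 3 input channels, 6 x 7 feature map
w = rng.standard_normal((4, 3, 3, 3))  # 4 output channels, 3 x 3 kernel
print(np.allclose(conv2d_direct(x, w), conv2d_decomposed(x, w)))
```

The check relies only on linearity: each kernel position contributes a per-pixel linear map of the input, which is a 1 × 1 convolution, and the kernel offset becomes a shift of that map before summation.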

Citation impact

Total citations: 521
FWCI: 28.50
Percentile: 100%
References: 79

Authors: 7

Topics & keywords

Keywords

Convolution (computer science)
Computer science
Kernel (algebra)
Computation
Overhead (engineering)
Representation (politics)
Code (set theory)
Theoretical computer science
