An Empirical Study of Spatial Attention Mechanisms in Deep Networks
University of Science and Technology of China · Microsoft Research Asia (China)
Abstract
Attention mechanisms have become a popular component in deep neural networks, yet there has been little examination of how different influencing factors and methods for computing attention from these factors affect performance. Toward a better general understanding of attention mechanisms, we present an empirical study that ablates various spatial attention elements within a generalized attention formulation, encompassing the dominant Transformer attention as well as the prevalent deformable convolution and dynamic convolution modules. Conducted on a variety of applications, the study yields significant findings about spatial attention in deep networks, some of which run counter to conventional understanding.…
Citation impact
- FWCI: 20.21
- Percentile: 100%
- References: 81
Authors: 5
Topics & keywords
- Computer science
- Artificial intelligence
- Convolution (computer science)
- Transformer
- Encoder
- Key (lock)
- Variety (cybernetics)
- Deep learning