SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
Mohamed bin Zayed University of Artificial Intelligence · University of California, Merced · +2 more institutions
Abstract
Self-attention has become the de facto choice for capturing global context in various vision applications. However, its quadratic computational complexity with respect to image resolution limits its use in real-time applications, especially for deployment on resource-constrained mobile devices. Although hybrid approaches have been proposed to combine the advantages of convolutions and self-attention for a better speed-accuracy trade-off, the expensive matrix multiplication operations in self-attention remain a bottleneck. In this work, we introduce a novel efficient additive attention mechanism that effectively replaces the quadratic matrix multiplication operations with linear element-wise multiplications. Our…
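The complexity contrast described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated here, so the function names, the softmax pooling of queries, and the weight vector `w` are all assumptions chosen for illustration. The first function builds the full n x n score matrix of standard self-attention; the second pools the queries into one global vector and then combines it with each key by element-wise multiplication, so its cost grows linearly with the number of tokens n.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(Q, K, V):
    """Standard scaled dot-product attention.

    Every query is compared with every key, so the work is O(n^2 * d)
    for n tokens of dimension d.
    """
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return out


def additive_attention_sketch(Q, K, w):
    """Illustrative linear-cost attention (an assumption, not the paper's code).

    1. Score each query against a single weight vector w (n scalars).
    2. Pool the queries into one global query using those scores.
    3. Multiply each key element-wise by the global query.
    No n x n matrix is formed, so the work is O(n * d).
    """
    d = len(Q[0])
    scores = [sum(a * b for a, b in zip(q, w)) / math.sqrt(d) for q in Q]
    alpha = softmax(scores)  # assumed normalization for this sketch
    global_q = [sum(a * q[j] for a, q in zip(alpha, Q)) for j in range(d)]
    return [[k[j] * global_q[j] for j in range(d)] for k in K]


if __name__ == "__main__":
    # Hypothetical toy input: 3 tokens, 2 channels.
    Q = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
    K = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    print(additive_attention_sketch(Q, K, w=[0.5, -0.5]))
    print(self_attention(Q, K, K))
```

In the sketch, doubling the number of tokens doubles the work of `additive_attention_sketch`, while it quadruples the number of query-key dot products in `self_attention`.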
Citation impact
- FWCI: 26.12
- Percentile: 100%
- References: 0
Authors (6)
- Abdelrahman Shaker (corresponding author)
Mohamed bin Zayed University of Artificial Intelligence
- Muhammad Maaz
Mohamed bin Zayed University of Artificial Intelligence
- Hanoona Rasheed
Mohamed bin Zayed University of Artificial Intelligence
- Salman Khan
Mohamed bin Zayed University of Artificial Intelligence
- Ming-Hsuan Yang
University of California, Merced; Google (United States); Yonsei University
Topics & keywords
- Computer science
- Bottleneck
- Matrix multiplication
- Latency (audio)
- Mobile device
- Computer engineering
- Inference
- Multiplication (music)
- Decent work and economic growth