EfficientViT: Lightweight Multi-Scale Attention for High-Resolution Dense Prediction
Massachusetts Institute of Technology · Zhejiang University · +2 more institutions
Abstract
High-resolution dense prediction enables many appealing real-world applications, such as computational photography and autonomous driving. However, its vast computational cost makes state-of-the-art high-resolution dense prediction models difficult to deploy on hardware devices. This work presents EfficientViT, a new family of high-resolution vision models built on a novel lightweight multi-scale attention. Unlike prior high-resolution dense prediction models that rely on heavy self-attention, hardware-inefficient large-kernel convolution, or complicated topology structures to obtain good performance, our lightweight multi-scale attention achieves a global receptive field and multi-scale learning (two critical…
Citation impact
- FWCI: 29.71
- Percentile: 100%
- References: 74
Authors
- 5
Topics & keywords
- Computer science
- Speedup
- Kernel (algebra)
- Field-programmable gate array
- Cloud computing
- High resolution
- Supercomputer
- Computational photography
- Sustainable cities and communities