Scaling Up Your Kernels to 31×31: Revisiting Large Kernel Design in CNNs
Tsinghua University · Vi Technology (United States) · +2 more institutions
Abstract
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by recent advances in vision transformers (ViTs), we demonstrate that using a few large convolutional kernels instead of a stack of small kernels could be a more powerful paradigm. We suggest five guidelines, e.g., applying re-parameterized large depthwise convolutions, for designing efficient, high-performance large-kernel CNNs. Following the guidelines, we propose RepLKNet, a pure CNN architecture whose kernel size is as large as 31×31, in contrast to the commonly used 3×3. RepLKNet greatly closes the performance gap between CNNs and ViTs, e.g., achieving results comparable or superior to Swin Transformer on…
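To make "re-parameterized large depthwise convolutions" concrete, the following is a minimal NumPy sketch of the general idea behind structural re-parameterization: a small-kernel branch running in parallel with a large-kernel branch can be folded into a single large kernel by zero-padding the small kernel and adding it. The helper names (`depthwise_conv_same`, `merge_small_into_large`), the 5×5 size of the small branch, and the tensor shapes are illustrative assumptions, not details taken from the abstract; any normalization layers a real model would fuse are omitted here.

```python
import numpy as np

def depthwise_conv_same(x, k):
    """Depthwise cross-correlation with zero 'same' padding.

    x: input of shape (C, H, W); k: one odd-sized kernel per channel, shape (C, K, K).
    """
    c, h, w = x.shape
    kc, kh, kw = k.shape
    assert c == kc and kh == kw and kh % 2 == 1
    p = kh // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros((c, h, w), dtype=x.dtype)
    for i in range(h):
        for j in range(w):
            patch = xp[:, i:i + kh, j:j + kw]            # (C, K, K) window
            out[:, i, j] = np.sum(patch * k, axis=(1, 2))  # per-channel dot product
    return out

def merge_small_into_large(k_large, k_small):
    """Zero-pad the small kernel to the large size (centered) and add the two."""
    pad = (k_large.shape[-1] - k_small.shape[-1]) // 2
    return k_large + np.pad(k_small, ((0, 0), (pad, pad), (pad, pad)))

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16, 16))         # hypothetical 4-channel feature map
k_large = rng.standard_normal((4, 31, 31))   # 31x31 depthwise kernels
k_small = rng.standard_normal((4, 5, 5))     # assumed size of the parallel small branch

# Two parallel branches, as one might use during training.
two_branch = depthwise_conv_same(x, k_large) + depthwise_conv_same(x, k_small)

# A single merged kernel, as one might use at inference time.
k_merged = merge_small_into_large(k_large, k_small)
merged = depthwise_conv_same(x, k_merged)

print(np.allclose(two_branch, merged))
```

The equivalence holds because convolution is linear in the kernel and the padded taps are zero, so the merged model computes the same function with a single depthwise convolution per layer.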
Citation impact
- FWCI: 70.93
- Percentile: 100%
- References: 171
Authors: 4
Topics & keywords
- Computer science
- Kernel (algebra)
- Convolutional neural network
- Parameterized complexity
- Tree kernel
- Scalability
- Artificial intelligence
- Scaling
- Industry, innovation and infrastructure