N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution
Sogang University · Korea Innotech (South Korea)
Abstract
While some studies have shown that the Swin Transformer (Swin) with window self-attention (WSA) is suitable for single image super-resolution (SR), plain WSA ignores broad regions when reconstructing high-resolution images because of its limited receptive field. In addition, many deep learning SR methods are computationally intensive. To address these problems, we introduce the N-Gram context to low-level vision with Transformers for the first time. We define an N-Gram as neighboring local windows in Swin, in contrast to text analysis, which views an N-Gram as consecutive characters or words. N-Grams interact with each other through sliding-WSA, expanding the region seen when restoring degraded pixels. Using the…
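As a rough illustration of the idea of an N-Gram as a group of neighboring local windows, the sketch below partitions a feature map into non-overlapping windows and lists, for each window, the windows that would fall in an N x N neighborhood sliding over the window grid. It is not the authors' implementation: the function names `window_partition` and `ngram_neighbors`, the top-left anchoring of each neighborhood, and the clipping at the border are all assumptions made for illustration, and no attention is computed.

```python
import numpy as np


def window_partition(x, m):
    """Split an (H, W, C) feature map into non-overlapping m x m windows.

    Returns an array of shape (H/m, W/m, m, m, C).
    """
    h, w, c = x.shape
    assert h % m == 0 and w % m == 0, "H and W must be divisible by m"
    x = x.reshape(h // m, m, w // m, m, c)
    return x.transpose(0, 2, 1, 3, 4)


def ngram_neighbors(grid_h, grid_w, n):
    """Map each window index (i, j) to the window indices in its N-Gram.

    Assumption (hypothetical, for illustration): the N-Gram of window (i, j)
    is the n x n block of windows whose top-left corner is (i, j), clipped
    at the border of the window grid.
    """
    neighbors = {}
    for i in range(grid_h):
        for j in range(grid_w):
            neighbors[(i, j)] = [
                (r, c)
                for r in range(i, min(i + n, grid_h))
                for c in range(j, min(j + n, grid_w))
            ]
    return neighbors


# Toy 8 x 8 single-channel feature map split into 4 x 4 windows.
feature_map = np.arange(8 * 8).reshape(8, 8, 1)
windows = window_partition(feature_map, 4)
grams = ngram_neighbors(windows.shape[0], windows.shape[1], 2)
```

With a 2 x 2 window grid and N = 2, the top-left window is grouped with all four windows, while the bottom-right window is grouped only with itself under the clipping assumption.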
Citation impact
- FWCI: 21.25
- Percentile: 100%
- References: 92
Authors: 3
Topics & keywords
- Computer science
- Encoder
- Transformer
- Bottleneck
- n-gram
- Pixel
- Artificial intelligence
- Pattern recognition (psychology)