Mobile-Former: Bridging MobileNet and Transformer
Microsoft Research (United Kingdom) · University of Science and Technology of China
Abstract
We present Mobile-Former, a parallel design of MobileNet and transformer with a two-way bridge in between. This structure leverages the advantages of MobileNet at local processing and of the transformer at global interaction, and the bridge enables bidirectional fusion of local and global features. Unlike recent work on vision transformers, the transformer in Mobile-Former contains very few tokens (e.g. 6 or fewer) that are randomly initialized to learn global priors, resulting in low computational cost. Combined with the proposed light-weight cross attention that models the bridge, Mobile-Former is not only computationally efficient but also has more representation power. It outperforms MobileNetV3 at…
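The two-way bridge described in the abstract can be illustrated with a minimal sketch, assuming plain dot-product cross attention with a residual update and no learned projections. The names `cross_attend`, `tokens`, and `features`, as well as the sizes (6 tokens, 49 pixel positions, 8 channels), are hypothetical choices for illustration and are not taken from the paper.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Each query row attends over all rows of keys_values.

    Assumption: no learned query/key/value projections; the attended
    mixture is added back to the query as a residual update.
    """
    d = len(queries[0])
    updated = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        mixed = [sum(w * kv[c] for w, kv in zip(weights, keys_values))
                 for c in range(d)]
        updated.append([qi + mi for qi, mi in zip(q, mixed)])
    return updated


random.seed(0)
d, num_tokens, num_pixels = 8, 6, 49  # hypothetical sizes

# A handful of randomly initialized global tokens (the "very few tokens").
tokens = [[random.gauss(0, 1) for _ in range(d)] for _ in range(num_tokens)]
# Flattened local feature map, one row per spatial position.
features = [[random.gauss(0, 1) for _ in range(d)] for _ in range(num_pixels)]

# Local -> global: tokens query the feature map.
tokens_updated = cross_attend(tokens, features)
# Global -> local: feature positions query the updated tokens.
features_updated = cross_attend(features, tokens_updated)
```

In this sketch each direction computes `num_tokens * num_pixels` attention scores, so the cost grows linearly with the number of spatial positions when the token count stays small, which is consistent with the low computational cost the abstract attributes to using few tokens.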
Citation impact
- FWCI: 34.22
- Percentile: 100%
- References: 86
Authors
- 7
Topics & keywords
- Computer science
- Bridging (networking)
- Encoder
- Transformer
- FLOPS
- Computation
- Mobile device
- Real-time computing