MAXIM: Multi-Axis MLP for Image Processing
Google (United States) · The University of Texas at Austin
Abstract
Recent progress on Transformers and multilayer perceptron (MLP) models provides new network architectural designs for computer vision tasks. Although these models have proved effective in many vision tasks such as image recognition, there remain challenges in adapting them for low-level vision. The inflexibility to support high-resolution images and the limitations of local attention are perhaps the main bottlenecks. In this work, we present a multi-axis MLP-based architecture called MAXIM that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks. MAXIM uses a UNet-shaped hierarchical structure and supports long-range interactions enabled by spatially-gated MLPs.…
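As a rough illustration of what "spatially-gated" means, the sketch below implements a generic gMLP-style spatial gating unit in plain Python. It is not MAXIM's multi-axis block: the function name, argument names, and the omission of normalization are assumptions made for this example. The input holds one feature vector per spatial position; the channels are split in half, one half is mixed across positions by a learned matrix, and the result gates the other half elementwise.

```python
def spatial_gating_unit(x, w, b):
    """Generic gMLP-style spatial gating (illustrative sketch, not MAXIM's code).

    x: list of n rows, one per spatial position, each with c channels (c even).
    w: n x n matrix mixing information across spatial positions (hypothetical name).
    b: list of n biases, one per spatial position (hypothetical name).
    Returns n rows with c // 2 channels each.
    """
    n = len(x)
    half = len(x[0]) // 2

    # Split the channels into the gated half (u) and the gating half (v).
    u = [row[:half] for row in x]
    v = [row[half:] for row in x]

    # Project v along the spatial axis: each position sees all positions.
    projected = [
        [sum(w[i][j] * v[j][k] for j in range(n)) + b[i] for k in range(half)]
        for i in range(n)
    ]

    # Gate u elementwise with the spatially mixed values.
    return [[u[i][k] * projected[i][k] for k in range(half)] for i in range(n)]


# Two spatial positions with four channels each.
features = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]

# Zero weights and unit biases make the gate an identity on u.
identity_gate = spatial_gating_unit(
    features, [[0.0, 0.0], [0.0, 0.0]], [1.0, 1.0]
)
print(identity_gate)
```

Because the mixing matrix spans every spatial position, each output value can depend on the whole input, which is the long-range interaction the abstract refers to. The matrix size is tied to the number of positions, so a plain unit like this one only works at a fixed resolution.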
Citation impact
- FWCI: 31.05
- Percentile: 100%
- References: 154
Authors
7
Topics & keywords
- Computer science
- Maxim
- Image processing
- Computer vision
- Artificial intelligence
- Image (mathematics)
- Computer graphics (images)
- Industry, innovation and infrastructure