Article | Jun 10, 2025 | Closed access
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Indexed in Crossref
Abstract
We propose a novel hybrid Mamba-Transformer backbone, MambaVision, specifically tailored for vision applications. Our core contribution includes redesigning the Mamba formulation to enhance its capability for efficient modeling of visual features. Through a comprehensive ablation study, we demonstrate the feasibility of integrating Vision Transformers (ViT) with Mamba. Our results show that equipping the Mamba architecture with self-attention blocks in the final layers greatly improves its capacity to capture long-range spatial dependencies. Based on these findings, we introduce a family of MambaVision models with a hierarchical architecture to meet various design criteria. For classification on the ImageNet-1K…
Citation impact
- Total citations: 179
- FWCI: 178.37
- Percentile: 100%
- References: 0
Authors: 2
Topics & keywords
Keywords
- Transformer
- Computer science
- Electrical engineering
- Engineering
- Voltage
UN Sustainable Development Goals
- Affordable and clean energy