Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation
Carnegie Mellon University · Dalian University of Technology
Abstract
Multi-modal semantic segmentation significantly enhances AI agents' perception and scene understanding, especially under adverse conditions like low-light or overexposed environments. Leveraging additional modalities (X-modality) like thermal and depth alongside traditional RGB provides complementary information, enabling more robust and reliable prediction. In this work, we introduce Sigma, a Siamese Mamba network for multi-modal semantic segmentation utilizing the advanced Mamba. Unlike conventional methods that rely on CNNs, with their limited local receptive fields, or Vision Transformers (ViTs), which offer global receptive fields at the cost of quadratic complexity, our model achieves global receptive…
Citation impact
- FWCI: 117.25
- Percentile: 100%
- References: 63
Authors: 7
Topics & keywords
- Modal
- Computer science
- Sigma
- Segmentation
- Artificial intelligence
- Natural language processing
- Physics
- Astronomy