article | Jul 10, 2023 | GOLD OA
ConceptFusion: Open-set multimodal 3D mapping
Indexed in Crossref
Abstract
Modalities such as natural language, images, and audio. We demonstrate that pixel-aligned open-set features can be fused into 3D maps via traditional SLAM and multi-view fusion approaches. This enables effective zero-shot spatial reasoning without any additional training or finetuning, and retains long-tailed concepts better than supervised approaches, outperforming them by a margin of more than 40% on 3D IoU. We extensively evaluate ConceptFusion on a number of real-world datasets, simulated home environments, a real-world tabletop manipulation task, and an autonomous driving platform. We showcase new avenues for blending foundation models with 3D open-set multimodal mapping. We encourage the reader to view the…
Citation impact
185 total citations
- FWCI: 185.96
- Percentile: 100%
- References: 100
Authors: 17
Topics & keywords
Topics
Keywords
- Computer science
- Set (abstract data type)
- Artificial intelligence
- Computer vision
- Programming language