Article · Jul 10, 2023 · Gold OA

ConceptFusion: Open-set multimodal 3D mapping

Indexed in Crossref

Abstract

…modalities such as natural language, images, and audio. We demonstrate that pixel-aligned open-set features can be fused into 3D maps via traditional SLAM and multi-view fusion approaches. This enables effective zero-shot spatial reasoning without any additional training or finetuning, and retains long-tailed concepts better than supervised approaches, outperforming them by a margin of more than 40% on 3D IoU. We extensively evaluate ConceptFusion on a number of real-world datasets, simulated home environments, a real-world tabletop manipulation task, and an autonomous driving platform. We showcase new avenues for blending foundation models with 3D open-set multimodal mapping. We encourage the reader to view the…
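The idea of fusing pixel-aligned features into a 3D map and then querying it zero-shot can be illustrated with a toy sketch. This is not the authors' implementation: the weighted running average, the dictionary-based map points, the helper names (`fuse`, `query`), and the 2-D feature vectors are all assumptions chosen for illustration. A real system would use high-dimensional foundation-model embeddings and a SLAM back end to associate pixels with map points.

```python
import math

def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def fuse(point, feature, weight=1.0):
    """Fold one pixel-aligned feature into a map point by weighted running average.

    Assumption: the pixel has already been associated with this map point
    (for example by projecting it with the camera pose estimated by SLAM).
    """
    total = point["weight"] + weight
    point["feature"] = [
        (point["weight"] * old + weight * new) / total
        for old, new in zip(point["feature"], feature)
    ]
    point["weight"] = total

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def query(points, embedding):
    """Score every map point against a query embedding; no training involved."""
    return [cosine(p["feature"], embedding) for p in points]

# Two hypothetical map points, each observed in two views.
points = [{"feature": [0.0, 0.0], "weight": 0.0} for _ in range(2)]
fuse(points[0], [1.0, 0.0])
fuse(points[0], [0.8, 0.2])
fuse(points[1], [0.0, 1.0])
fuse(points[1], [0.1, 0.9])

# A hypothetical query embedding (it would come from a text, image, or audio encoder).
scores = query(points, [1.0, 0.0])
best = max(range(len(points)), key=lambda i: scores[i])
```

Because the query is compared to the fused features by similarity alone, any modality whose encoder shares the embedding space could in principle be used without retraining the map, which is the property the abstract describes as zero-shot.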

Citation impact

  • Total citations: 185
  • FWCI: 185.96
  • Percentile: 100%
  • References: 100

Authors

17 authors

Topics & keywords

Keywords
  • Computer science
  • Set (abstract data type)
  • Artificial intelligence
  • Computer vision
  • Programming language

Funding