OpenScene: 3D Scene Understanding with Open Vocabularies

Peng, Songyou; Genova, Kyle; Jiang, Chiyu; Tagliasacchi, Andrea; Pollefeys, Marc; Funkhouser, Thomas

doi:10.1109/cvpr52729.2023.00085

Article · Jun 1, 2023 · Closed access

Max Planck Institute for Intelligent Systems · ETH Zurich · +3 more institutions

Indexed in Crossref

Abstract

Traditional 3D scene understanding approaches rely on labeled 3D datasets to train a model for a single task with supervision. We propose OpenScene, an alternative approach where a model predicts dense features for 3D scene points that are co-embedded with text and image pixels in CLIP feature space. This zero-shot approach enables task-agnostic training and open-vocabulary queries. For example, to perform SOTA zero-shot 3D semantic segmentation it first infers CLIP features for every 3D point and later classifies them based on similarities to embeddings of arbitrary class labels. More interestingly, it enables a suite of open-vocabulary scene understanding applications that have never been done before. For…
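The classification step the abstract describes (compare each point's feature with the embedding of every class label and pick the most similar) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `l2_normalize` and `classify_points` are hypothetical, and the small hand-written arrays stand in for real CLIP text embeddings and per-point features, which in practice would come from a CLIP text encoder and a 3D network.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def classify_points(point_feats, text_embeds):
    """Assign each point the label whose embedding is most similar.

    point_feats: (N, D) array, one feature vector per 3D point.
    text_embeds: (C, D) array, one embedding per class label.
    Returns an (N,) array of label indices.
    """
    sims = l2_normalize(point_feats) @ l2_normalize(text_embeds).T  # (N, C)
    return sims.argmax(axis=1)


# Toy stand-ins (assumption): real embeddings would come from CLIP.
labels = ["chair", "table", "floor"]
text_embeds = np.eye(3, 4)  # three orthogonal 4-D "label embeddings"
point_feats = np.array([
    [0.9, 0.1, 0.0, 0.0],   # closest to "chair"
    [0.0, 0.2, 2.0, 0.1],   # closest to "floor"
    [0.1, 3.0, 0.2, 0.0],   # closest to "table"
])

pred = classify_points(point_feats, text_embeds)
predicted_labels = [labels[i] for i in pred]
print(predicted_labels)
```

Because the label set enters only through `text_embeds`, changing the vocabulary means swapping in new text embeddings; the point features do not need to be recomputed.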

Citation impact

Total citations: 291

FWCI: 110.82
Percentile: 100%
References: 86

Authors: 6

Topics & keywords

Keywords

Computer science
Vocabulary
Task (project management)
Artificial intelligence
Suite
Feature (linguistics)
Class (philosophy)
Segmentation
