Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting

Indexed in Crossref

Abstract

Open-vocabulary 3D scene understanding presents a significant challenge in computer vision, with wide-ranging applications in embodied agents and augmented reality systems. Existing methods adopt neural rendering methods as 3D representations and jointly optimize color and semantic features to achieve rendering and scene understanding simultaneously. In this paper, we introduce Semantic Gaussians, a novel open-vocabulary scene understanding approach based on 3D Gaussian Splatting. Our key idea is to distill knowledge from 2D pretrained models to 3D Gaussians. Unlike existing methods, we design a versatile projection approach that maps various 2D semantic features from pre-trained image encoders into a novel…
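
The projection idea in the abstract, lifting 2D features from a pre-trained image encoder onto 3D Gaussians, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a pinhole camera, nearest-pixel lookup, no occlusion handling, and plain averaging across views. The function name `project_features_to_gaussians` and the `(K, R, t, feat)` view tuple are hypothetical and chosen only for illustration.

```python
import numpy as np


def project_features_to_gaussians(centers, views):
    """Average per-pixel 2D features onto 3D Gaussian centers.

    centers: (N, 3) world-space Gaussian means.
    views:   iterable of (K, R, t, feat), where K is the (3, 3) intrinsics
             matrix, R (3, 3) and t (3,) map world to camera coordinates,
             and feat is an (H, W, C) feature map from a 2D encoder.
    Returns (features of shape (N, C), hit counts of shape (N,)).
    Assumption: every center that projects inside the image is treated as
    visible; a real pipeline would also need an occlusion test.
    """
    views = list(views)
    n = centers.shape[0]
    c = views[0][3].shape[2]
    feat_sum = np.zeros((n, c), dtype=np.float64)
    hits = np.zeros(n, dtype=np.int64)

    for K, R, t, feat in views:
        h, w, _ = feat.shape
        cam = centers @ R.T + t              # world -> camera coordinates
        z = cam[:, 2]
        in_front = z > 1e-6                  # discard points behind the camera
        safe_z = np.where(in_front, z, 1.0)  # avoid division by zero
        pix = cam @ K.T                      # homogeneous pixel coordinates
        u = np.round(pix[:, 0] / safe_z).astype(np.int64)
        v = np.round(pix[:, 1] / safe_z).astype(np.int64)
        visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        feat_sum[visible] += feat[v[visible], u[visible]]
        hits[visible] += 1

    # Centers never seen in any view keep an all-zero feature vector.
    out = np.zeros_like(feat_sum)
    seen = hits > 0
    out[seen] = feat_sum[seen] / hits[seen, None]
    return out, hits
```

Given per-Gaussian features in the same embedding space as a text encoder, an open-vocabulary query would then reduce to comparing each feature vector against the text embedding, for example by cosine similarity.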

Citation impact

Total citations: 5
FWCI (Field-Weighted Citation Impact): 22.46
Percentile: 98%
References: 0

Authors

5 authors

Topics & keywords

Keywords
  • Rendering (computer graphics)
  • Segmentation
  • Component (thermodynamics)
  • Encoder
  • Embodied cognition
  • Object (grammar)
  • Semantic feature
  • Key (lock)
UN Sustainable Development Goals
  • Quality Education