Article | May 29, 2023 | Closed access

BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

Massachusetts Institute of Technology

Indexed in Crossref

Abstract

Multi-sensor fusion is essential for an accurate and reliable autonomous driving system. Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with camera features. However, the camera-to-LiDAR projection throws away the semantic density of camera features, hindering the effectiveness of such methods, especially for semantic-oriented tasks (such as 3D scene segmentation). In this paper, we propose BEVFusion, an efficient and generic multi-task multi-sensor fusion framework. It unifies multi-modal features in the shared bird's-eye view (BEV) representation space, which nicely preserves both geometric and semantic information. To achieve this, we diagnose and lift the key efficiency…

Citation impact

  • Total citations: 985
  • FWCI: 112.08
  • Percentile: 100%
  • References: 79

Authors

7 authors

Topics & keywords

Keywords
  • Computer science
  • Artificial intelligence
  • Point cloud
  • Computer vision
  • Segmentation
  • Lidar
  • Object detection
  • Pooling
