DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection

Johns Hopkins University · Google (United States)

Indexed in Crossref

Abstract

Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving. While prevalent multi-modal methods [34], [36] simply decorate raw lidar point clouds with camera features and feed them directly to existing 3D detection models, our study shows that fusing camera features with deep lidar features, instead of raw points, can lead to better performance. However, as those features are often augmented and aggregated, a key challenge in fusion is how to effectively align the transformed features from the two modalities. In this paper, we propose two novel techniques: InverseAug, which inverts geometry-related augmentations, e.g., rotation, to enable accurate geometric…
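The idea of inverting a geometric augmentation, as the abstract describes for rotation, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`rotate_z`, `augment`, `inverse_augment`) and the parameter dictionary are hypothetical, and it assumes the only augmentation applied is a rotation about the z-axis whose angle was recorded at augmentation time.

```python
import math

def rotate_z(points, angle):
    """Rotate (x, y, z) points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def augment(points, angle):
    """Apply a rotation augmentation and record its parameter (hypothetical helper)."""
    return rotate_z(points, angle), {"rotation_z": angle}

def inverse_augment(points, params):
    """Undo the recorded rotation by applying the negative angle."""
    return rotate_z(points, -params["rotation_z"])

# Two illustrative lidar points in the original sensor frame.
original = [(1.0, 0.0, 0.5), (2.0, 3.0, -1.0)]
augmented, params = augment(original, math.pi / 6)
recovered = inverse_augment(augmented, params)
```

Under these assumptions, `recovered` matches `original` up to floating-point error, so coordinates in the augmented frame can be mapped back to the original frame before looking up the corresponding camera pixels.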

Citation impact

Total citations: 511
FWCI: 28.21
Percentile: 100%
References: 64

Authors

12

Topics & keywords

Keywords
  • Lidar
  • Computer science
  • Artificial intelligence
  • Computer vision
  • Robustness (evolution)
  • Point cloud
  • Object detection
  • Pixel
UN Sustainable Development Goals
  • Peace, Justice and Strong Institutions