DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection

Li, Yingwei; Yu, Adams Wei; Meng, Tianjian; Caine, Ben; Ngiam, Jiquan; Peng, Daiyi; Shen, Junyang; Lu, Yifeng; Zhou, Denny; Le, Quoc V.; Yuille, Alan; Tan, Mingxing

doi:10.1109/cvpr52688.2022.01667

Article · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Jun 1, 2022 · Closed access

Johns Hopkins University · Google (United States)

Indexed in Crossref

Abstract

Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving. While prevalent multi-modal methods [34], [36] simply decorate raw lidar point clouds with camera features and feed them directly to existing 3D detection models, our study shows that fusing camera features with deep lidar features instead of raw points can lead to better performance. However, as those features are often augmented and aggregated, a key challenge in fusion is how to effectively align the transformed features from two modalities. In this paper, we propose two novel techniques: InverseAug that inverses geometric-related augmentations, e.g., rotation, to enable accurate geometric…
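
The idea of inverting a geometric augmentation can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated here and gives no code, so the function names (`rotate_z`, `augment`, `inverse_augment`), the choice of a rotation about the vertical axis, and the 30-degree angle are all assumptions made for illustration. The sketch only shows that, if the augmentation parameter is saved, applying the opposite rotation recovers the original lidar coordinates.

```python
import math


def rotate_z(point, angle):
    """Rotate an (x, y, z) point about the vertical z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def augment(points, angle):
    """Apply a rotation augmentation to every point (hypothetical helper)."""
    return [rotate_z(p, angle) for p in points]


def inverse_augment(points, angle):
    """Undo the rotation by rotating with the opposite angle (hypothetical helper)."""
    return [rotate_z(p, -angle) for p in points]


# Two made-up lidar points and an assumed, saved augmentation angle.
original = [(10.0, 2.0, 0.5), (-3.0, 7.5, 1.2)]
angle = math.radians(30.0)

augmented = augment(original, angle)
recovered = inverse_augment(augmented, angle)
```

Up to floating-point error, `recovered` matches `original`, while `augmented` does not. Other geometric augmentations would need their own inverses; this sketch covers rotation only.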

Citation impact

Total citations: 511

FWCI: 28.21
Percentile: 100%
References: 64

Authors: 12

Topics & keywords

Keywords

Lidar
Computer science
Artificial intelligence
Computer vision
Robustness (evolution)
Point cloud
Object detection
Pixel

UN Sustainable Development Goals

Peace, Justice and Strong Institutions
