DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection
Johns Hopkins University · Google (United States)
Abstract
Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving. While prevalent multi-modal methods [34], [36] simply decorate raw lidar point clouds with camera features and feed them directly to existing 3D detection models, our study shows that fusing camera features with deep lidar features, instead of raw points, can lead to better performance. However, as those features are often augmented and aggregated, a key challenge in fusion is how to effectively align the transformed features from the two modalities. In this paper, we propose two novel techniques: InverseAug, which inverts geometric-related augmentations, e.g., rotation, to enable accurate geometric…
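To make the idea of inverting a geometric augmentation concrete, here is a minimal sketch in plain Python. It is an illustration only, not the authors' implementation: the function names, the choice of a z-axis rotation followed by an optional flip, and the parameter dictionary are all assumptions. The abstract names only rotation as an example. The point it shows is that if the augmentation parameters are saved, applying the inverse operations in reverse order recovers the original lidar coordinates, which is what a later projection into the camera image would need.

```python
import math

def rotate_z(points, angle):
    """Rotate (x, y, z) points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def apply_augmentations(points, angle, flip_y):
    """Forward pass (hypothetical): rotate, then optionally negate y."""
    out = rotate_z(points, angle)
    if flip_y:
        out = [(x, -y, z) for x, y, z in out]
    return out

def inverse_augmentations(points, angle, flip_y):
    """Undo the saved augmentations in reverse order."""
    out = points
    if flip_y:
        # A flip is its own inverse.
        out = [(x, -y, z) for x, y, z in out]
    # A rotation is undone by rotating through the negated angle.
    return rotate_z(out, -angle)

# Hypothetical points and augmentation parameters, chosen for illustration.
original = [(1.0, 2.0, 0.5), (-3.0, 0.25, 1.0)]
params = {"angle": math.radians(30.0), "flip_y": True}

augmented = apply_augmentations(original, **params)
recovered = inverse_augmentations(augmented, **params)
```

Up to floating-point error, `recovered` matches `original`. The order matters: the flip was applied last, so it has to be undone first.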
Citation impact
- FWCI: 28.21
- Percentile: 100%
- References: 64
Authors
12
Topics & keywords
- Lidar
- Computer science
- Artificial intelligence
- Computer vision
- Robustness (evolution)
- Point cloud
- Object detection
- Pixel
- Peace, Justice and strong institutions