Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
American Society for Photogrammetry and Remote Sensing
Abstract
Monocular depth estimation is a fundamental computer vision task. Recovering 3D depth from a single image is geometrically ill-posed and requires scene understanding, so it is not surprising that the rise of deep learning has led to a breakthrough. The impressive progress of monocular depth estimators has mirrored the growth in model capacity, from relatively modest CNNs to large Transformer architectures. Still, monocular depth estimators tend to struggle when presented with images with unfamiliar content and layout, since their knowledge of the visual world is restricted by the data seen during training, and challenged by zero-shot generalization to new domains. This motivates us to explore whether the…
Citation impact
- FWCI: 64.42
- Percentile: 100%
- References: 87
Authors (6)
- Bingxin Ke (corresponding author)
American Society for Photogrammetry and Remote Sensing
- Anton Obukhov
American Society for Photogrammetry and Remote Sensing
- Shengyu Huang
American Society for Photogrammetry and Remote Sensing
- Nando Metzger
American Society for Photogrammetry and Remote Sensing
- Rodrigo Caye Daudt
American Society for Photogrammetry and Remote Sensing
Topics & keywords
- Repurposing
- Monocular
- Computer science
- Diffusion
- Estimation
- Image (mathematics)
- Artificial intelligence
- Computer vision
- Climate action