Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-Shot Metric Depth and Surface Normal Estimation
Hong Kong University of Science and Technology · The University of Adelaide · +6 more institutions
Abstract
We introduce Metric3D v2, a geometric foundation model designed for zero-shot metric depth and surface normal estimation from single images, critical for accurate 3D recovery. Depth and normal estimation, though complementary, present distinct challenges. State-of-the-art monocular depth methods achieve zero-shot generalization through affine-invariant depths, but fail to recover real-world metric scale. Conversely, current normal estimation techniques struggle with zero-shot performance due to insufficient labeled data. We propose targeted solutions for both metric depth and normal estimation. For metric depth, we present a canonical camera space transformation module that resolves metric ambiguity across…
Citation impact
- FWCI: 33.45
- Percentile: 100%
- References: 132
Authors: 10
Topics & keywords
- Artificial intelligence
- Metric (unit)
- Computer vision
- Computer science
- Affine transformation
- Monocular
- Mathematics
- Geometry