Unifying Flow, Stereo and Depth Estimation

ETH Zurich · The University of Sydney · +2 more institutions

Indexed in: Crossref, PubMed

Abstract

We present a unified formulation and model for three motion and 3D perception tasks: optical flow, rectified stereo matching and unrectified stereo depth estimation from posed images. Unlike previous specialized architectures for each specific task, we formulate all three tasks as a unified dense correspondence matching problem, which can be solved with a single model by directly comparing feature similarities. Such a formulation calls for discriminative feature representations, which we achieve using a Transformer, in particular the cross-attention mechanism. We demonstrate that cross-attention enables integration of knowledge from another image via cross-view interactions, which greatly improves the quality…
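
To make "directly comparing feature similarities" concrete, the following is a minimal sketch of dense correspondence matching for the optical flow case. It is an illustration under stated assumptions, not the paper's implementation: the function name `global_match_flow`, the `temperature` parameter, the cosine-similarity normalisation and the softmax-weighted average of candidate coordinates are all choices made here for the example. The features are assumed to be already extracted; in the paper they would come from the Transformer.

```python
import numpy as np

def global_match_flow(feat1, feat2, temperature=0.01):
    """Estimate flow by comparing every pixel in feat1 with every pixel in feat2.

    feat1, feat2: (H, W, C) feature maps (hypothetical inputs for this sketch).
    Returns an (H, W, 2) array of displacements in (dx, dy) order.
    """
    h, w, c = feat1.shape
    f1 = feat1.reshape(h * w, c)
    f2 = feat2.reshape(h * w, c)

    # L2-normalise so the dot product is a cosine similarity (an assumption here).
    eps = 1e-12
    f1 = f1 / (np.linalg.norm(f1, axis=1, keepdims=True) + eps)
    f2 = f2 / (np.linalg.norm(f2, axis=1, keepdims=True) + eps)

    # All-pairs similarity: row i holds the scores of pixel i against all candidates.
    sim = (f1 @ f2.T) / temperature              # (H*W, H*W)

    # Numerically stable softmax over the candidates in the second image.
    sim = sim - sim.max(axis=1, keepdims=True)
    prob = np.exp(sim)
    prob = prob / prob.sum(axis=1, keepdims=True)

    # Pixel coordinate grid in (x, y) order, flattened to match the features.
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    grid = np.stack([xs, ys], axis=-1).reshape(h * w, 2).astype(np.float64)

    # Expected matching position, then displacement relative to the source pixel.
    matched = prob @ grid                        # (H*W, 2)
    return (matched - grid).reshape(h, w, 2)
```

Calling `global_match_flow(f1, f2)` on two feature maps of equal shape returns one 2D displacement per pixel. Under the same assumptions, rectified stereo would restrict the candidates to the same image row, since correspondences there lie on a shared scanline, so the same similarity-and-softmax step applies to a 1D search instead of a 2D one.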

Citation impact

Total citations: 191
FWCI: 21.73
Percentile: 100%
References: 119

Authors

7 authors

Topics & keywords

Keywords
  • Computer science
  • Artificial intelligence
  • Optical flow
  • Discriminative model
  • Unified Model
  • Inference
  • Transformer
  • Feature (linguistics)
UN Sustainable Development Goals
  • Reduced inequalities