Abstract: Monocular depth estimation is a fundamental task for 3D scene geometry perception. Several state-of-the-art methods reformulate monocular depth estimation as a per-pixel classification-regression task to improve model performance, in which the depth map is predicted as a linear combination of the per-pixel probability distribution and the predicted depth candidates. However, we observe that depth distributions vary greatly across indoor images, yet some existing methods produce similar depth candidates for such images. In this paper, we propose a novel candidate prediction module that derives adaptive candidates from the depth distribution of the image. Specifically, we use a Gaussian mixture model for histogram prediction and apply histogram equalization to obtain depth candidates that adaptively concentrate in the densely populated intervals of the histogram. Furthermore, we present a framework built on the proposed module, together with a histogram loss function. Our framework achieves strong performance on the NYU-Depth-v2 dataset. Additionally, we assess its generalization ability through zero-shot testing on the SUN RGB-D dataset.
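To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the implementation used in the paper. It assumes a one-dimensional Gaussian mixture discretised on a fixed depth range, takes histogram equalization to mean sampling the inverse cumulative histogram at evenly spaced quantiles, and computes depth as the softmax-weighted sum of the candidates. The function names (`gmm_histogram`, `equalized_candidates`, `expected_depth`), the mixture parameters, the depth range, and the number of candidates are all hypothetical choices for illustration.

```python
import numpy as np

def gmm_histogram(weights, means, stds, bin_edges):
    """Discretise a 1-D Gaussian mixture into a normalised histogram."""
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    widths = np.diff(bin_edges)
    # Standardised distance of every bin centre to every component.
    z = (centers[:, None] - means[None, :]) / stds[None, :]
    comp_pdf = np.exp(-0.5 * z ** 2) / (stds[None, :] * np.sqrt(2.0 * np.pi))
    pdf = (weights[None, :] * comp_pdf).sum(axis=1)
    # Small constant keeps the cumulative histogram strictly increasing.
    hist = pdf * widths + 1e-12
    return hist / hist.sum()

def equalized_candidates(hist, bin_edges, n_candidates):
    """Sample the inverse cumulative histogram at evenly spaced quantiles.

    Dense histogram intervals receive more candidates than sparse ones.
    """
    cdf = np.concatenate([[0.0], np.cumsum(hist)])
    cdf = cdf / cdf[-1]
    quantiles = (np.arange(n_candidates) + 0.5) / n_candidates
    return np.interp(quantiles, cdf, bin_edges)

def expected_depth(logits, candidates):
    """Per-pixel depth as a probability-weighted sum of the candidates."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # stable softmax
    probs = np.exp(shifted)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return probs @ candidates

# Hypothetical mixture: most depth mass near 1.5 m, a wider mode near 4 m.
bin_edges = np.linspace(0.0, 10.0, 101)
hist = gmm_histogram(
    weights=np.array([0.7, 0.3]),
    means=np.array([1.5, 4.0]),
    stds=np.array([0.3, 1.0]),
    bin_edges=bin_edges,
)
candidates = equalized_candidates(hist, bin_edges, n_candidates=16)

# Random logits stand in for a network's per-pixel classification output.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 5, 16))
depth = expected_depth(logits, candidates)
```

Under these assumptions, the candidates cluster where the mixture places most of its mass, so two images with different predicted histograms would receive different candidate sets even with the same number of candidates.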