Abstract: Deep learning has recently driven significant progress in multi-view stereo. Because 3D ground-truth supervision is expensive to acquire, self-supervised methods hold greater potential. This work proposes a novel two-stage self-supervised learning framework for multi-view stereo that reduces the dependence on photometric consistency and mitigates the effect of foreshortening. Because accurate depth hypotheses play an important role in depth estimation, the work concentrates on an adaptive depth sampling module that propagates hypotheses across neighboring spatial patches, with the propagation determined by the normal maps. This motivates a two-stage process: the first stage produces coarse initial depth maps and normal maps, and the second-stage network refines the depth sampling by taking the influence of foreshortening into account. The loss functions include a feature-metric consistency term to overcome the photometric inconsistency caused by lighting variation, as well as a term enforcing consistency between the depth maps and normal maps. The proposed two-stage framework is evaluated on the DTU dataset, and the experimental results demonstrate that it performs well compared with the baseline methods.
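To make the idea of normal-guided depth propagation concrete, the following is a minimal sketch and not the paper's actual module. It assumes a pinhole camera without skew and a locally planar surface: the depth and normal at a pixel p define a tangent plane, and intersecting the viewing ray of a neighboring pixel q with that plane yields a depth hypothesis for q. The names `pixel_ray`, `propagate_depth`, and the `intrinsics` tuple `(fx, fy, cx, cy)` are hypothetical and chosen only for illustration.

```python
def pixel_ray(u, v, fx, fy, cx, cy):
    """Back-project pixel (u, v) to a camera ray with unit z (pinhole, no skew)."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def propagate_depth(depth_p, normal_p, pix_p, pix_q, intrinsics, eps=1e-8):
    """Depth hypothesis at pixel q from the tangent plane at pixel p.

    The 3D point at p is X_p = depth_p * ray_p. The plane through X_p with
    normal n satisfies n . X = n . X_p, so a point depth_q * ray_q on it has
    depth_q = depth_p * (n . ray_p) / (n . ray_q).
    Returns None when the ray through q is nearly parallel to the plane.
    """
    ray_p = pixel_ray(pix_p[0], pix_p[1], *intrinsics)
    ray_q = pixel_ray(pix_q[0], pix_q[1], *intrinsics)
    denom = dot(normal_p, ray_q)
    if abs(denom) < eps:
        return None  # hypothesis undefined for a grazing ray
    return depth_p * dot(normal_p, ray_p) / denom


if __name__ == "__main__":
    # Hypothetical intrinsics (fx, fy, cx, cy), for illustration only.
    intrinsics = (500.0, 500.0, 320.0, 240.0)
    slanted = (2 ** -0.5, 0.0, -(2 ** -0.5))  # plane tilted 45 degrees about the y axis
    print(propagate_depth(2.0, slanted, (320, 240), (370, 240), intrinsics))
```

On a fronto-parallel surface the propagated depth equals the source depth, while on a slanted surface it changes with the pixel offset. This is the foreshortening effect that a normal-aware sampling scheme can account for, whereas copying a neighbor's depth unchanged cannot.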