Abstract: In this work, we propose an approach for unsupervised depth and
camera motion estimation. We focus on two aspects of the view-synthesis
approach. First, we learn attention maps that increase the contribution
of meaningful regions to the error computed between the synthesized and
the real views. Second, we approximate the probability that a pixel is
visible by mapping its depth consistency to a probability with a Gaussian
function. The attention maps are learned jointly with the depth network
in an end-to-end manner. We evaluate our method on the depth estimation
and odometry tasks of the KITTI benchmark. Including the attention and
visibility maps improves both depth and camera motion estimation compared
with a competitive baseline method.
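The two weighting ideas in the abstract can be sketched numerically: map a per-pixel depth-consistency residual to a visibility probability with a Gaussian, then use that map (together with an attention map) to re-weight the photometric error between the real and synthesized views. This is a minimal NumPy sketch under stated assumptions; the function names, the L1 photometric error, and the `sigma` parameter are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def visibility_weights(depth_a, depth_b, sigma=0.5):
    """Map per-pixel depth consistency to a visibility probability
    using an (unnormalized) Gaussian: w = exp(-(d_a - d_b)^2 / (2 sigma^2)).
    Consistent depths (small residual) give weights near 1; large
    disagreements (e.g. occluded pixels) give weights near 0."""
    diff = depth_a - depth_b
    return np.exp(-(diff ** 2) / (2.0 * sigma ** 2))

def weighted_photometric_loss(real, synth, attention, visibility):
    """Per-pixel L1 photometric error, re-weighted by the attention
    and visibility maps before averaging."""
    err = np.abs(real - synth)
    return float(np.mean(attention * visibility * err))

# Toy example: a synthesized view offset from the real one by 0.1.
rng = np.random.default_rng(0)
real = rng.random((4, 4))
synth = real + 0.1
attention = np.ones((4, 4))          # uniform attention for illustration
visibility = visibility_weights(real, synth)
loss = weighted_photometric_loss(real, synth, attention, visibility)
```

In the full method the attention map would be predicted by a network and learned end-to-end with the depth network, rather than fixed as here.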