FisheyeDistanceNet++: Self-Supervised Fisheye Distance Estimation with Self-Attention, Robust Loss Function and Camera View Generalization

Published: 01 Jan 2021, Last Modified: 12 Nov 2024 · Autonomous Vehicles and Machines 2021 · CC BY-SA 4.0
Abstract: FisheyeDistanceNet [1] proposed a self-supervised monocular depth estimation method for fisheye cameras with a large field of view (> 180°). To achieve scale-invariant depth estimation, FisheyeDistanceNet supervises depth map predictions over multiple scales during training, which constitutes a training bottleneck. To overcome it, we incorporate self-attention layers and a robust loss function [2] into FisheyeDistanceNet. The general adaptive robust loss function yields sharp depth maps without the need to train over multiple scales, and its hyperparameters can be learned during training, improving both convergence speed and accuracy. We also ablate the importance of Instance Normalization over Batch Normalization in the network architecture. Finally, we generalize the network to be invariant to camera viewpoint by training on multiple perspectives from the front, rear, and side cameras. The proposed algorithmic improvements, FisheyeDistanceNet++, yield a 30% relative improvement in RMSE while reducing training time by 25% on the WoodScape dataset. We also obtain state-of-the-art results on the KITTI dataset in comparison to other self-supervised monocular methods.
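The general adaptive robust loss of [2] interpolates between common losses (L2, Charbonnier/pseudo-Huber, Cauchy, and heavier-tailed variants) via a shape parameter α and a scale c, which can be treated as learnable during training. The sketch below is a minimal scalar Python illustration of that loss family, not the authors' implementation; the function name and special-case handling are our own.

```python
import math

def adaptive_robust_loss(x: float, alpha: float, c: float) -> float:
    """General adaptive robust loss (Barron [2]) on a residual x.

    alpha controls robustness: alpha=2 gives 0.5*(x/c)^2 (L2-like),
    alpha=1 gives a pseudo-Huber shape, alpha=0 gives a Cauchy-like
    log loss; smaller alpha down-weights outliers more strongly.
    c > 0 sets the scale at which the loss transitions.
    """
    z = (x / c) ** 2
    if alpha == 2.0:          # removable singularity: quadratic limit
        return 0.5 * z
    if alpha == 0.0:          # removable singularity: log (Cauchy-like) limit
        return math.log(0.5 * z + 1.0)
    b = abs(alpha - 2.0)
    return (b / alpha) * ((z / b + 1.0) ** (alpha / 2.0) - 1.0)

# With c = 1: alpha = 2 recovers the quadratic, and lowering alpha
# reduces the penalty on the same large residual (outlier robustness).
print(adaptive_robust_loss(3.0, 2.0, 1.0))  # quadratic: 4.5
print(adaptive_robust_loss(3.0, 1.0, 1.0))  # pseudo-Huber: sqrt(10) - 1
print(adaptive_robust_loss(3.0, 0.0, 1.0))  # log loss, smaller still
```

In a training pipeline, α and c would be free parameters optimized jointly with the network weights (with α suitably bounded), which is what lets the loss adapt its robustness per task rather than being hand-tuned.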
