Abstract: Perspective distortion (PD) causes substantial changes in shape,
size, orientation, angles, and other spatial relationships of visual concepts in images.
Precisely estimating camera intrinsic and extrinsic parameters is a challenging
task that prevents synthesizing perspective distortion. Non-availability of
dedicated training data poses a critical barrier to developing robust computer vision
methods. Additionally, distortion correction methods turn other computer
vision tasks into multi-step pipelines and underperform. In this work, we propose
mitigating perspective distortion (MPD) by employing a fine-grained parameter
control on a specific family of Möbius transforms to model real-world
distortion without estimating camera intrinsic and extrinsic parameters and without
the need for actual distorted data. We also present ImageNet-PD, a dedicated
perspectively distorted benchmark dataset, to evaluate the robustness of deep
learning models. The proposed method achieves better results on the existing
benchmarks ImageNet-E and ImageNet-X. Additionally, it significantly
improves performance on ImageNet-PD while maintaining consistent performance on
the standard data distribution. Notably, our method shows improved performance on
three PD-affected real-world applications—crowd counting, fisheye image recognition,
and person re-identification—and one PD-affected challenging CV task:
object detection. The source code, dataset, and models are available on the project
webpage at https://prakashchhipa.github.io/projects/mpd.
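
To make the idea of a parameter-controlled Möbius transform concrete, the following is a minimal sketch and not the authors' implementation. It assumes the general form f(z) = (az + b)/(cz + d) with ad - bc ≠ 0, applied to pixel coordinates treated as complex numbers. The abstract does not state which family of transforms or which parameter ranges MPD uses, so the function names (`mobius`, `mobius_warp`) and the default parameters here are purely illustrative.

```python
import numpy as np


def mobius(z, a, b, c, d):
    """Evaluate the Möbius transform f(z) = (a*z + b) / (c*z + d)."""
    if np.isclose(a * d - b * c, 0):
        raise ValueError("ad - bc must be non-zero for a valid Möbius transform")
    return (a * z + b) / (c * z + d)


def mobius_warp(image, a=1.0, b=0.0, c=0.2j, d=1.0):
    """Warp an image with a Möbius transform (hypothetical helper).

    Each output pixel is filled by mapping its coordinate back through the
    inverse transform and sampling the nearest source pixel. The default
    parameters are illustrative assumptions, not values from the paper.
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Normalise pixel coordinates to [-1, 1] and treat them as complex numbers.
    z_out = (2 * xs / (w - 1) - 1) + 1j * (2 * ys / (h - 1) - 1)
    # The inverse of (a*z + b)/(c*z + d) is (d*z - b)/(-c*z + a).
    with np.errstate(divide="ignore", invalid="ignore"):
        z_src = mobius(z_out, d, -b, -c, a)
    finite = np.isfinite(z_src)
    z_src = np.where(finite, z_src, 0)
    # Convert back to integer pixel indices (nearest neighbour).
    src_x = np.rint((z_src.real + 1) * (w - 1) / 2).astype(int)
    src_y = np.rint((z_src.imag + 1) * (h - 1) / 2).astype(int)
    valid = finite & (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    # Pixels that map outside the source image are left as zeros.
    out = np.zeros_like(image)
    out[valid] = image[src_y[valid], src_x[valid]]
    return out
```

Varying `c` (for example, a small real versus a small imaginary value) changes the direction in which the image appears to tilt, which is one way fine-grained parameter control could synthesize different distortion views without any camera parameters.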