Abstract: Perspective distortion (PD) leads to substantial alterations
in the shape, size, orientation, angles, and spatial relationships of visual
elements in images. Accurately determining camera intrinsic and extrinsic
parameters is challenging, making it hard to synthesize perspective
distortion effectively. Current distortion correction methods first
remove the distortion and then learn the vision task, a multistep
process that often compromises performance. Recent work on mitigating
perspective distortion (MPD) leverages the Möbius transform to
synthesize perspective distortions without estimating camera parameters.
A key downside of the Möbius transform is that it requires
tuning multiple interdependent parameters and
involves complex arithmetic operations, leading to substantial computational
complexity. To address these challenges, we propose Log Conformal
Maps (LCM), a method leveraging the logarithmic function to
approximate perspective distortions with fewer parameters and reduced
computational complexity. We provide a theoretical foundation, complemented
by experiments, to demonstrate that LCM approximates MPD with fewer
parameters. We show that LCM integrates well with
supervised and self-supervised representation learning, outperforms standard
models, and matches state-of-the-art performance in mitigating
perspective distortion on multiple benchmarks, namely ImageNet-PD,
ImageNet-E, and ImageNet-X. Furthermore, LCM integrates seamlessly
with person re-identification and improves its performance.
Source code will be released to foster further research.