Abstract: Omnidirectional cameras can capture a 360° field of view in a single shot. This comprehensive view makes them
preferable for many computer vision applications. An omnidirectional view is generally represented as a panoramic image
with equirectangular projection, which suffers from distortions. Thus, approaches designed for standard cameras must be mathematically
modified to be used effectively with panoramic images. In this work, we built a semantic segmentation CNN model that
handles distortions in panoramic images using equirectangular convolutions. The proposed model, which we call UNet-equiconv,
outperforms an equivalent CNN model with standard convolutions. To the best of our knowledge, ours is the first work on
the semantic segmentation of real outdoor panoramic images. Experimental results show that using a distortion-aware CNN
with equirectangular convolution improves semantic segmentation performance (a 4% increase in mIoU). We also released
a pixel-level annotated outdoor panoramic image dataset which can be used for various computer vision applications such
as autonomous driving and visual localization. The source code and the dataset are available on the project
page.
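
The distortion of the equirectangular projection can be illustrated with a minimal sketch, assuming the usual convention in which image columns map linearly to longitude and image rows map linearly to latitude. The function names below are hypothetical and this is not the UNet-equiconv implementation; it only shows how strongly a row is stretched horizontally relative to the equator, which is the effect a distortion-aware convolution has to compensate for.

```python
import math


def pixel_to_lonlat(u, v, width, height):
    """Map the centre of pixel (u, v) to (longitude, latitude) in radians.

    Assumed convention: longitude spans [-pi, pi) from left to right,
    latitude spans (pi/2, -pi/2) from top to bottom.
    """
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi
    return lon, lat


def horizontal_stretch(v, height):
    """Horizontal stretch of image row v relative to the equator.

    A circle of latitude has circumference proportional to cos(latitude),
    yet every row of the panorama has the same number of pixels, so content
    in row v is stretched by a factor of 1 / cos(latitude).
    """
    _, lat = pixel_to_lonlat(0, v, 1, height)
    return 1.0 / math.cos(lat)


if __name__ == "__main__":
    height = 512  # hypothetical panorama height in pixels
    for row in (0, 64, 128, 255):
        print(row, horizontal_stretch(row, height))
```

Under these assumptions the stretch is close to 1 near the equator, exactly 2 at 60° latitude, and grows without bound toward the poles, so a fixed square kernel covers very different portions of the scene depending on the image row.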