Abstract: Vision-based solutions for vehicle localization have become popular recently. In this study, we employ an image-retrieval-based visual localization approach, in which database images are stored with GPS coordinates and the location of the retrieved database image serves as the position estimate of the query image in a city-scale driving scenario. Most existing studies of this approach use only descriptors extracted from RGB images and do not exploit semantic content. We show that localization can be improved via descriptors extracted from semantically segmented images, especially when the
environment is subjected to severe illumination, seasonal or other long-term changes. We worked on two
separate visual localization datasets, one of which (Malaga Streetview Challenge) has been generated by
us and made publicly available. Following the extraction of semantic labels in images, we trained a CNN
model for localization in a weakly-supervised fashion with triplet ranking loss. The optimized semantic
descriptor can be used on its own for localization or, preferably, combined with a state-of-the-art RGB-image-based descriptor in a hybrid fashion to improve accuracy. Our experiments reveal that the proposed hybrid method increases the localization performance of the standard (RGB-image-based) approach by up to 7.7% in Top-1 recall.
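To make the weakly supervised training step concrete, the sketch below shows how a triplet ranking loss can be applied to descriptors of semantically segmented images, assuming PyTorch. The network architecture, the embedding size, and the name `SemanticDescriptorNet` are illustrative assumptions, not the paper's exact model; only the overall scheme (GPS tags defining positive and negative pairs, no per-pixel or per-pose supervision) follows the abstract.

```python
# A minimal sketch of weakly supervised triplet training, assuming PyTorch.
# `SemanticDescriptorNet`, the layer sizes, and the margin are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticDescriptorNet(nn.Module):
    """Hypothetical CNN mapping a semantically segmented image
    (a one-hot label map with `num_classes` channels) to a
    fixed-size, L2-normalized descriptor."""
    def __init__(self, num_classes=19, dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(num_classes, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, dim)

    def forward(self, x):
        z = self.features(x).flatten(1)
        return F.normalize(self.fc(z), dim=1)  # unit-norm descriptor

model = SemanticDescriptorNet()
criterion = nn.TripletMarginLoss(margin=0.1, p=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Weak supervision: GPS tags define positives (geographically close
# database images) and negatives (distant ones) for each query.
anchor = torch.rand(8, 19, 128, 128)    # query segmentations (dummy data)
positive = torch.rand(8, 19, 128, 128)  # nearby images
negative = torch.rand(8, 19, 128, 128)  # far-away images

loss = criterion(model(anchor), model(positive), model(negative))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```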
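The retrieval step and the hybrid descriptor can likewise be sketched in a few lines. The weighted-concatenation fusion below is an assumption for illustration (the paper does not specify its fusion scheme here), as are the descriptor dimensions; the retrieval logic itself, returning the GPS tag of the Top-1 database match as the position estimate, follows the approach described in the abstract.

```python
# A minimal sketch of retrieval-based localization with a hybrid
# descriptor, assuming NumPy. The fusion weight `w` and the descriptor
# sizes are illustrative, not the paper's exact configuration.
import numpy as np

def l2norm(x, axis=-1):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + 1e-12)

def hybrid(rgb_desc, sem_desc, w=0.5):
    """Concatenate L2-normalized RGB and semantic descriptors with a
    modality weight, then renormalize the result."""
    h = np.concatenate([w * l2norm(rgb_desc),
                        (1 - w) * l2norm(sem_desc)], axis=-1)
    return l2norm(h)

# Database: one hybrid descriptor and one GPS coordinate per image.
db_rgb = np.random.rand(1000, 4096)   # RGB-image-based descriptors
db_sem = np.random.rand(1000, 64)     # semantic descriptors from the CNN
db_gps = np.random.rand(1000, 2)      # (latitude, longitude) per image
db = hybrid(db_rgb, db_sem)

def localize(q_rgb, q_sem):
    """Return the GPS tag of the nearest database image: the position
    estimate of the query under the image-retrieval approach."""
    q = hybrid(q_rgb, q_sem)
    sims = db @ q                      # cosine similarity (unit vectors)
    return db_gps[np.argmax(sims)]     # Top-1 retrieval -> GPS estimate

estimate = localize(np.random.rand(4096), np.random.rand(64))
```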