Abstract: Overhead imagery plays a crucial role in many applications such as urban planning,
crop yield forecasting, mapping, and policy making. Semantic segmentation could enable automatic, efficient, and large-scale understanding of overhead imagery for these
applications. However, semantic segmentation of overhead imagery is a challenging
task, primarily due to the large domain gap from existing research on ground imagery,
the unavailability of large-scale datasets with pixel-level annotations, and the inherent complexity of the task. The vast amount of readily available unlabeled overhead imagery shares
more common structures and patterns than ground imagery, so its
large-scale analysis could benefit from unsupervised feature learning techniques.
In this work, we study self-supervised feature learning techniques for semantic segmentation of overhead imagery. We choose semantic image inpainting as
the self-supervised pretext task [36] due to its proximity to the semantic segmentation task. We (i) show that existing approaches are inefficient for semantic segmentation, (ii) propose architectural changes towards self-supervised learning for semantic
segmentation, (iii) propose an adversarial training scheme for self-supervised learning that
gradually increases the pretext task's difficulty and show that it leads to learning better
features, and (iv) propose a unified approach for overhead scene parsing, road network
extraction, and land cover estimation. Our approach improves over training from scratch
by more than 10% mIoU and over an ImageNet pre-trained network by more than 5% mIoU.
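To make the inpainting pretext task and the difficulty curriculum concrete, below is a minimal, hypothetical PyTorch sketch. It is not the paper's architecture or its adversarial scheme: the encoder-decoder is a toy stand-in, and the adversarially increased difficulty is replaced by a simple linear schedule on the masked-region size. All names (`EncoderDecoder`, `random_mask`) are illustrative assumptions.

```python
# Hypothetical sketch: inpainting-based self-supervised pretraining with a
# difficulty curriculum (the masked region grows over training).
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Toy fully convolutional encoder-decoder; a segmentation-style
    architecture, so the pretrained encoder transfers to dense prediction."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def random_mask(images, mask_ratio):
    """Zero out a random square covering roughly mask_ratio of each image."""
    b, _, h, w = images.shape
    side = max(1, int((mask_ratio * h * w) ** 0.5))
    masked = images.clone()
    for i in range(b):
        top = torch.randint(0, h - side + 1, (1,)).item()
        left = torch.randint(0, w - side + 1, (1,)).item()
        masked[i, :, top:top + side, left:left + side] = 0.0
    return masked

model = EncoderDecoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()

num_steps = 1000
for step in range(num_steps):
    images = torch.rand(8, 3, 64, 64)      # stand-in for unlabeled overhead tiles
    ratio = 0.1 + 0.4 * step / num_steps   # curriculum: mask grows from 10% to 50%
    recon = model(random_mask(images, ratio))
    loss = loss_fn(recon, images)           # reconstruct the full image
    opt.zero_grad()
    loss.backward()
    opt.step()
# After pretraining, the encoder would be reused and fine-tuned with
# pixel-level labels for the downstream segmentation task.
```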