Abstract: This paper presents a fast and robust architecture for scene understanding in aerial images recorded from an Unmanned Aerial Vehicle. The architecture uses a Deep Wavelet Scattering Network to extract translation- and rotation-invariant features, which a Conditional Random Field then uses to perform scene segmentation. Experiments with the proposed framework are conducted on two annotated datasets introduced in this paper, one of 1277 images and one of 300 aerial images. Overall pixel accuracies of 81% and 78%, respectively, are achieved on the two datasets. A comparison with a similar framework is also presented.
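For readers unfamiliar with the reported metric, overall pixel accuracy is commonly taken to be the fraction of pixels whose predicted class label matches the annotation. The following is a minimal sketch under that assumption; the abstract does not state the exact definition used, nor whether accuracy is pooled over all pixels or averaged per image. The function name `overall_pixel_accuracy` and the toy label maps are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def overall_pixel_accuracy(
    predicted: Sequence[Sequence[int]],
    ground_truth: Sequence[Sequence[int]],
) -> float:
    """Return the fraction of pixels whose predicted label equals the annotated label.

    Assumption: both inputs are 2-D label maps of identical shape, with one
    integer class label per pixel.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("label maps must have the same number of rows")

    correct = 0
    total = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        if len(pred_row) != len(true_row):
            raise ValueError("label maps must have the same number of columns")
        for pred_label, true_label in zip(pred_row, true_row):
            correct += int(pred_label == true_label)
            total += 1

    if total == 0:
        raise ValueError("label maps must contain at least one pixel")
    return correct / total


# Toy 3x4 label maps with hypothetical class ids (0, 1, 2).
predicted = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [2, 2, 2, 1],
]
ground_truth = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 1, 1],
]

accuracy = overall_pixel_accuracy(predicted, ground_truth)
print(f"overall pixel accuracy: {accuracy:.3f}")
```

In this toy example 10 of the 12 pixels agree, so the sketch reports an accuracy of about 0.833. A dataset-level figure such as those quoted in the abstract would be obtained by applying the same ratio across all annotated images.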