Rotation-aware representation learning for remote sensing image retrieval

Published: 2021 · Last Modified: 31 Jan 2025 · Inf. Sci. 2021 · CC BY-SA 4.0
Abstract: The rising number and size of remote sensing (RS) image archives make content-based RS image retrieval (CBRSIR) increasingly important. Convolutional neural networks (CNNs) offer good CBRSIR performance, but the features they extract are not rotation-invariant. This is problematic because objects in RS images appear at arbitrary rotation angles. We develop and investigate two new rotation-aware CNN-based CBRSIR methods: 1) In the Feature Map Transformation Based Rotation-Aware Network (FMT-RAN), the output of the last pooling layer is rotated to four different angles during training. Each rotated copy is passed through the same fully connected, coding, and classification layers, and the resulting losses are summed. 2) The Spatial Transformer-based Rotation-Aware Network (ST-RAN) combines a spatial transformer network (STN) with a rotation-aware network (RAN). For training, the original image and a randomly rotated version of it are fed into the ST-RAN. The STN generates a transformed version of the original to match the rotated image, and the RAN extracts the features of all three images. We apply two-stage training, which first optimizes the STN and then the RAN. Both methods are efficient in terms of retrieval accuracy and time, but ST-RAN performs best overall and outperforms state-of-the-art CBRSIR methods.
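The FMT-RAN training step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`shared_head`, `loss_fn`) are hypothetical stand-ins for the shared fully connected, coding, and classification layers, and NumPy's `rot90` stands in for the feature-map rotation inside the network.

```python
import numpy as np

def fmt_ran_loss(feature_map, shared_head, loss_fn, target):
    """Sketch of one FMT-RAN training step: rotate the pooled feature
    map by 0/90/180/270 degrees, pass every copy through the SAME
    shared head, and sum the four resulting losses.
    (shared_head and loss_fn are hypothetical stand-ins.)"""
    total = 0.0
    for k in range(4):  # the four rotation angles
        rotated = np.rot90(feature_map, k, axes=(0, 1))  # rotate spatial dims
        prediction = shared_head(rotated)                # shared FC/coding/classification
        total += loss_fn(prediction, target)             # accumulate the loss
    return total

# Toy demo: a 4x4 single-channel "feature map" with trivial head and loss.
fm = np.arange(16, dtype=float).reshape(4, 4)
head = lambda x: x.sum()            # stand-in for the shared layers
loss = lambda y, t: (y - t) ** 2    # stand-in squared-error loss
print(fmt_ran_loss(fm, head, loss, target=100.0))
```

Because the same head sees all four rotated copies and their losses are added, the gradient pushes the shared layers toward predictions that are stable under rotation, which is the rotation-invariance property the abstract motivates.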