OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images

Published: 2024, Last Modified: 25 Mar 2026. Venue: IEEE Trans. Geosci. Remote. Sens. 2024. License: CC BY-SA 4.0
Abstract: Oriented object detection in remote sensing images is challenging because objects appear in arbitrary orientations. Recently, end-to-end transformer-based methods have achieved success by eliminating the post-processing operators required by traditional convolutional neural network (CNN)-based methods. However, directly extending transformers to oriented object detection presents three main issues: 1) objects rotate arbitrarily, necessitating the encoding of angles along with position and size; 2) self-attention lacks the geometric relations of oriented objects because content and positional queries do not interact; and 3) oriented objects cause misalignment, mainly between values and positional queries in cross-attention, making accurate classification and localization difficult. In this article, we propose OrientedFormer, an end-to-end transformer-based oriented object detector consisting of three dedicated modules that address these issues. First, Gaussian positional encoding (PE) is proposed to encode the angle, position, and size of oriented boxes using Gaussian distributions. Second, Wasserstein self-attention is proposed to introduce geometric relations and facilitate interaction between content and positional queries by utilizing Gaussian Wasserstein distance scores. Third, oriented cross-attention is proposed to align values and positional queries by rotating sampling points around the positional query according to their angles. Experiments on six datasets (DIOR-R, the DOTA series, HRSC2016, and ICDAR2015) show the effectiveness of our approach. Compared with previous end-to-end detectors, OrientedFormer gains 1.16 and 1.21 AP50 on DIOR-R and DOTA-v1.0, respectively, while reducing training epochs from $3\times$ to $1\times$. The code is available at https://github.com/wokaikaixinxin/OrientedFormer.
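The geometric ideas named in the abstract can be illustrated with a small, self-contained sketch. It is not taken from the OrientedFormer repository: the function names are hypothetical, and it assumes the commonly used convention that an oriented box (cx, cy, w, h, theta) maps to a 2-D Gaussian with mean (cx, cy) and covariance R diag(w^2/4, h^2/4) R^T. The distance is the standard closed-form squared 2-Wasserstein distance between two Gaussians, specialized to 2x2 covariances; how the paper turns such distances into attention scores is not shown here.

```python
import math

def box_to_gaussian(cx, cy, w, h, theta):
    """Map an oriented box (theta in radians) to a 2-D Gaussian.

    Assumed convention: mean = box center,
    covariance = R diag(w^2/4, h^2/4) R^T, returned as (sxx, sxy, syy).
    """
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    sxx = a * c * c + b * s * s
    sxy = (a - b) * c * s
    syy = a * s * s + b * c * c
    return (cx, cy), (sxx, sxy, syy)

def wasserstein2_squared(g1, g2):
    """Squared 2-Wasserstein distance between two 2-D Gaussians.

    For 2x2 symmetric positive semi-definite S1, S2:
    Tr((S1^(1/2) S2 S1^(1/2))^(1/2)) = sqrt(Tr(S1 S2) + 2 sqrt(det S1 det S2)).
    """
    (mx1, my1), (a1, b1, d1) = g1
    (mx2, my2), (a2, b2, d2) = g2
    center_term = (mx1 - mx2) ** 2 + (my1 - my2) ** 2
    trace_product = a1 * a2 + 2.0 * b1 * b2 + d1 * d2  # Tr(S1 S2)
    det_product = (a1 * d1 - b1 * b1) * (a2 * d2 - b2 * b2)
    # max(..., 0.0) guards against tiny negative values from rounding.
    inner = trace_product + 2.0 * math.sqrt(max(det_product, 0.0))
    cross = math.sqrt(max(inner, 0.0))
    return center_term + (a1 + d1) + (a2 + d2) - 2.0 * cross

def rotate_points(offsets, center, theta):
    """Rotate (dx, dy) sampling offsets by theta and place them around center."""
    cx, cy = center
    c, s = math.cos(theta), math.sin(theta)
    return [(cx + c * dx - s * dy, cy + s * dx + c * dy) for dx, dy in offsets]

# Usage: two boxes of equal shape and angle differ only in their centers.
box_a = box_to_gaussian(0.0, 0.0, 4.0, 2.0, 0.0)
box_b = box_to_gaussian(3.0, 4.0, 4.0, 2.0, 0.0)
distance_ab = wasserstein2_squared(box_a, box_b)

# Usage: axis-aligned sampling offsets rotated to follow a box tilted by 30 degrees.
samples = rotate_points([(1.0, 0.0), (0.0, 1.0)], (3.0, 4.0), math.radians(30.0))
```

One consequence of the Gaussian representation under this convention is that a box rotated by 180 degrees maps to the same distribution, so the distance does not depend on which of the two equivalent angles is used.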