GRRSIS: Generalized Referring Remote Sensing Image Segmentation

Wenyu Mi, Jianji Wang, Fuzhen Zhuang, Nanning Zheng

Published: 01 Jan 2025, Last Modified: 15 Jan 2026, Venue: IEEE Transactions on Geoscience and Remote Sensing, Readers: Everyone, License: CC BY-SA 4.0
Abstract: Referring remote sensing image segmentation (RRSIS) is a challenging task that involves segmenting target instances within a top-view image guided by a natural language expression. Existing classic RRSIS methods commonly support only target expressions, i.e., expressions whose described target is present in the image; no-target expressions are excluded. Under this constraint, a model can fail completely on a small error in the expression, such as a typographical mistake. To overcome this issue, we introduce a new benchmark called generalized RRSIS (GRRSIS), which extends classic RRSIS by allowing expressions whose target is absent from the image. To this end, we construct the first large-scale dataset for GRRSIS, called GRRSIS-D, which includes multitarget, single-target, and no-target expressions. The core challenges in GRRSIS stem from the fact that objects in aerial images often occupy only a small number of pixels, exhibit significant orientation variations, and present varying levels of recognition difficulty. To tackle these challenges, we propose an oriented-aware multiscale network with an adaptive angle sensing module that integrates adaptive rotated convolution and a gating mechanism to capture diverse object orientations while suppressing irrelevant features, yielding more accurate representations. In addition, we introduce a novel online hard case mining loss, which allocates different levels of attention to foreground and background regions and reshapes the standard loss by downweighting well-segmented examples, addressing the issues caused by low pixel occupancy and uneven sample difficulty. The proposed approach achieves state-of-the-art performance on both the newly introduced GRRSIS task and the classic RRSIS task.
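The abstract does not give the form of the online hard case mining loss. As a rough illustration of the general idea it describes (separate weights for foreground and background, and downweighting of well-segmented pixels), the sketch below uses a focal-loss-style binary cross-entropy. This is a stand-in technique and not the authors' loss; the function name `focal_style_loss` and the parameters `alpha_fg`, `alpha_bg`, and `gamma` are assumptions made for this example.

```python
import math


def focal_style_loss(probs, labels, alpha_fg=0.75, alpha_bg=0.25, gamma=2.0, eps=1e-7):
    """Mean focal-style binary loss over a flat list of pixels.

    probs  : predicted foreground probabilities in [0, 1]
    labels : ground-truth labels, 1 = foreground, 0 = background
    NOTE: illustrative stand-in, not the loss defined in the paper.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p            # probability of the true class
        alpha = alpha_fg if y == 1 else alpha_bg  # foreground/background weight
        # (1 - p_t) ** gamma shrinks the loss of confidently correct pixels
        total += -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    easy = focal_style_loss([0.9], [1])  # well-segmented foreground pixel
    hard = focal_style_loss([0.3], [1])  # poorly segmented foreground pixel
    print(easy < hard)
```

With `gamma=0` and both weights set to 1, the function reduces to plain binary cross-entropy, which makes the effect of the modulating factor easy to isolate.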