Text-Guided Knowledge Transfer for Remote Sensing Image-Text Retrieval

An-An Liu, Bo Yang, Wenhui Li, Dan Song, Zhengya Sun, Tongwei Ren, Zhiqiang Wei

Published: 2024, Last Modified: 23 Jan 2026. IEEE Geosci. Remote. Sens. Lett. 2024. License: CC BY-SA 4.0

Abstract: Remote sensing image-text retrieval aims to retrieve valuable information from diverse and complex remote sensing data and has attracted significant attention. However, performance is limited by the complexity of remote sensing scenes and by their substantial content differences from natural-domain images. To address these issues, we propose a simple but effective text-guided knowledge transfer (TGKT) method for remote sensing image-text retrieval. TGKT uses CLIP to encode remote sensing data and transfers CLIP's rich semantic knowledge from the natural domain to the remote sensing domain. Because textual information does not differ significantly across domains, it is used to bridge the semantic gap between the two domains, thereby enhancing image features. Extensive experiments on the RSICD and RSITMD datasets demonstrate the effectiveness of our method.
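
For readers unfamiliar with the retrieval setting, the sketch below illustrates only the generic ranking step that CLIP-style image-text retrieval typically relies on: embeddings from the two modalities are compared by cosine similarity and candidates are sorted by score. It is not the TGKT method, whose text-guided transfer mechanism is not detailed in the abstract. The function names (`l2_normalize`, `cosine_similarity`, `rank_images`) and the toy three-dimensional vectors are hypothetical placeholders standing in for real encoder outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_images(text_embedding, image_embeddings):
    """Return image indices sorted from most to least similar to the text query."""
    scores = [cosine_similarity(text_embedding, img) for img in image_embeddings]
    return sorted(range(len(image_embeddings)), key=lambda i: scores[i], reverse=True)


# Toy embeddings (assumption: stand-ins for text and image encoder outputs).
text_query = [1.0, 0.0, 0.5]
images = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.1, 1.0],  # nearly parallel to the query
    [0.5, 0.5, 0.5],  # partially aligned with the query
]

ranking = rank_images(text_query, images)
print(ranking)
```

In this toy setup the second image ranks first because its direction nearly matches the query. Image-to-text retrieval follows the same pattern with the roles of the two modalities swapped.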