ChangeChat: An Interactive Model for Remote Sensing Change Analysis via Multimodal Instruction Tuning

Pei Deng, Wenqian Zhou, Hanlin Wu

Published: 2025, Last Modified: 04 Apr 2026 | ICASSP 2025 | CC BY-SA 4.0
Abstract: Remote sensing (RS) change analysis is vital for monitoring Earth’s dynamic processes by detecting alterations in images over time. Traditional change detection methods excel at identifying pixel-level changes but lack the ability to contextualize them. While recent advances in change captioning offer natural language descriptions of changes, they do not support interactive, user-specific queries. To address these limitations, we introduce ChangeChat, the first bitemporal vision-language model (VLM) specifically designed for interactive RS change analysis. ChangeChat leverages multimodal instruction tuning to handle complex queries such as change captioning, category-specific quantification, and change localization. To further enhance the model’s capabilities, we developed the ChangeChat-87k dataset, created using a combination of rule-based methods and GPT-assisted techniques. Experimental results demonstrate that ChangeChat provides a comprehensive, interactive solution for RS change analysis. It achieves performance comparable to or surpassing state-of-the-art (SOTA) methods on specific tasks, while significantly outperforming the latest general-domain model, GPT-4. Code and pre-trained weights are available at https://github.com/hanlinwu/ChangeChat.