Instruct2See: Learning to Remove Any Obstructions Across Distributions

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Images are often obstructed by various obstacles due to capture limitations, hindering the observation of objects of interest. Most existing methods address occlusions from specific elements such as fences or raindrops, but the wide range of real-world obstructions makes comprehensive data collection impractical and limits their applicability. To overcome these challenges, we propose Instruct2See, a novel zero-shot framework capable of handling both seen and unseen obstacles. The core idea of our approach is to unify obstruction removal by treating it as a soft-hard mask restoration problem, where any obstruction can be represented using multi-modal prompts, such as visual semantics and textual instructions, processed through a cross-attention unit to enhance contextual understanding and improve mode control. Additionally, a tunable mask adapter allows for dynamic soft masking, enabling real-time adjustment of inaccurate masks. Extensive experiments on both in-distribution and out-of-distribution obstacles show that Instruct2See consistently achieves strong performance and generalization in obstruction removal, regardless of whether the obstacles were present during the training phase. Code and dataset are available at https://jhscut.github.io/Instruct2See.
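The soft-hard masking idea in the abstract can be illustrated with a minimal sketch: a hard obstruction-confidence map is relaxed into a soft mask via a tunable sigmoid (a hypothetical stand-in for the paper's mask adapter, with assumed parameter names `tau` and `sharpness`), which then blends original and restored content per pixel. This is not the authors' implementation, only a toy illustration of the blending principle.

```python
import math

def soft_mask(confidence, tau=0.5, sharpness=10.0):
    """Relax an obstruction-confidence map into a soft mask with a
    tunable sigmoid; as sharpness grows, this approaches a hard 0/1 mask.
    (Hypothetical stand-in for the paper's tunable mask adapter.)"""
    return [[1.0 / (1.0 + math.exp(-sharpness * (c - tau))) for c in row]
            for row in confidence]

def blend(original, restored, mask):
    """Composite the two images: keep original pixels where mask ~ 0,
    use restored (inpainted) content where mask ~ 1."""
    return [[m * r + (1.0 - m) * o
             for o, r, m in zip(orow, rrow, mrow)]
            for orow, rrow, mrow in zip(original, restored, mask)]

# Toy 1x3 grayscale strip: only the middle pixel is confidently obstructed.
conf = [[0.1, 0.9, 0.2]]
mask = soft_mask(conf)
out = blend([[0.5, 0.5, 0.5]], [[0.0, 0.0, 0.0]], mask)
```

Raising `sharpness` makes the transition between kept and restored regions abrupt (a hard mask); lowering it gives smoother seams, which is the kind of real-time adjustment the abstract attributes to the mask adapter when the initial mask is inaccurate.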
Lay Summary: Images are often obstructed by various obstacles due to capture limitations, but existing methods cannot handle all types of obstruction. We create Instruct2See, which uses multi-modal prompts to automatically remove unwanted obstructions. Unlike traditional methods, it removes any obstruction without requiring extra training for specific types, making image restoration universal and effortless.
Primary Area: Applications->Computer Vision
Keywords: computer vision, image restoration
Submission Number: 2731