Abstract: Existing All-in-One image restoration methods often fail to perceive degradation types and severity levels simultaneously, overlooking the importance of fine-grained quality perception. Moreover, these methods often utilize highly customized backbones, which hinder their adaptability and integration into more advanced restoration networks. To address these limitations, we propose Perceive-IR, a novel backbone-agnostic All-in-One image restoration framework designed for fine-grained quality control across various degradation types and severity levels. Its modular structure allows core components to function independently of specific backbones, enabling seamless integration into advanced restoration models without significant modifications. Specifically, Perceive-IR operates in two key stages: 1) multi-level quality-driven prompt learning stage, where a fine-grained quality perceiver is meticulously trained to discern three-tier quality levels by optimizing the alignment between prompts and images within the CLIP perception space. This stage ensures a nuanced understanding of image quality, laying the groundwork for subsequent restoration; 2) restoration stage, where the quality perceiver is seamlessly integrated with a difficulty-adaptive perceptual loss, forming a quality-aware learning strategy. This strategy not only dynamically differentiates sample learning difficulty but also achieves fine-grained quality control by driving the restored image toward the ground truth while pulling it away from both low- and medium-quality samples. Furthermore, Perceive-IR incorporates a Semantic Guidance Module (SGM) and Compact Feature Extraction (CFE). The SGM leverages semantic information from pre-trained vision models to provide high-level contextual guidance, while the CFE focuses on extracting degradation-specific features, ensuring accurate handling of diverse image degradations. Extensive experiments demonstrate that Perceive-IR not only surpasses state-of-the-art methods but also generalizes reliably to zero-shot real-world and unknown degraded scenes, while adapting seamlessly to different backbone networks. This versatility underscores the framework’s robustness and backbone-agnostic design. Project page at https://house-yuyu.github.io/Perceive-IR/.
External IDs:doi:10.1109/tip.2025.3566300
Loading