DisIR: Disentangled Learning of Controllable All-in-One Image Restoration under Composite Degradations
Keywords: Image restoration, All-in-One
Abstract: Composite degradation scenarios, in which multiple types of degradation are mixed together, have attracted increasing interest in the development of restoration models. Even when the degradation types are known in advance, precise image restoration remains difficult, particularly when multiple degradations are intricately mixed and individual degradations must be handled selectively. To tackle this challenge, we propose DisIR, a novel disentangled framework that learns controllable representations for composite image restoration through four distinct training objectives. First, we introduce an identity embedding as a prompt, along with an identity loss that guides the model to reproduce the input without modification. Second, we design a ratio control mechanism in which the identity embedding is linearly combined with degradation-specific embeddings at controllable ratios; a dedicated ratio control loss trains the model to follow these ratios, enabling fine-grained control of restoration intensity. Third, to disentangle multiple degradations, we incorporate an intermediate loss that supervises intermediate outputs, each of which removes only one type of degradation from the composite. Fourth, we apply a permutation-invariant loss that enforces consistent restoration results regardless of the order in which the degradations are removed. Because it changes only the training pipeline, our method can be integrated into existing controllable architectures without structural redesign. Experimental results demonstrate that DisIR achieves state-of-the-art performance on composite degradation benchmarks while enabling flexible and selective removal of multiple degradations, either sequentially or in a single step through a fused embedding with user-controlled intensity ratios.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 17806
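As an illustration of two ideas in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the ratio control mechanism is a convex combination of the identity embedding and a single degradation-specific embedding, and it measures order dependence as the mean squared difference between the outputs of different removal orders. The names `blend_embedding`, `permutation_gap`, and the toy `restore` callables are hypothetical and chosen only for this example.

```python
from itertools import permutations
from typing import Callable, List, Sequence

Vector = List[float]


def blend_embedding(identity: Vector, degradation: Vector, ratio: float) -> Vector:
    """Linearly mix the identity embedding with one degradation embedding.

    Assumed form: (1 - ratio) * identity + ratio * degradation.
    ratio = 0 gives the identity prompt (leave the input unchanged);
    ratio = 1 gives the full degradation-specific prompt.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    return [(1.0 - ratio) * i + ratio * d for i, d in zip(identity, degradation)]


def permutation_gap(
    restore: Callable[[Vector, Vector], Vector],
    image: Vector,
    embeddings: Sequence[Vector],
) -> float:
    """Mean squared difference between restoration results across removal orders.

    Applies `restore` once per embedding, in every possible order, and compares
    each final output with the output of the first order. A value of 0 means
    the result does not depend on the order (a stand-in for the
    permutation-invariant objective; the paper's exact loss is not given here).
    """
    outputs = []
    for order in permutations(embeddings):
        current = list(image)
        for emb in order:
            current = restore(current, emb)
        outputs.append(current)
    reference = outputs[0]
    gaps = [
        sum((a - b) ** 2 for a, b in zip(out, reference)) / len(reference)
        for out in outputs[1:]
    ]
    return sum(gaps) / len(gaps) if gaps else 0.0


# Toy restorers (hypothetical): one is order-independent, one is not.
def subtractive_restore(x: Vector, e: Vector) -> Vector:
    return [a - b for a, b in zip(x, e)]


def affine_restore(x: Vector, e: Vector) -> Vector:
    return [a * e[0] + e[1] for a in x]
```

With these toy restorers, `permutation_gap` is zero for `subtractive_restore` and positive for `affine_restore`, which is the kind of discrepancy a permutation-invariant training term would penalize.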