Prompt-Guided Dynamic Network for Image Super Resolution

20 Sept 2023 (modified: 11 Feb 2024) Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: single image super resolution
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Existing single image super-resolution (SISR) methods learn convolutional kernels solely from a single image modality. However, SR performance is limited by the limited diversity of the input modality and by the insufficient image-level information in low-resolution images. In this paper, we use multi-modal prompts (texts or images) to help existing SR networks learn more discriminative features, leading to superior SR performance. To this end, we develop the Dynamic Correlation Module, a plug-and-play component for existing SR networks that learns meaningful semantic and textural information from multi-modal prompt embeddings extracted from a large-scale vision-language model (such as CLIP). Specifically, we propose a Spatially Multi-Modal Attention Module that generates a pixel-wise cross-modal attention mask highlighting the regions of interest for a given prompt. Moreover, to the best of our knowledge, we are the first to introduce multi-modal prompts into convolutional kernel estimation, which better handles spatial variations and retains cross-modal relevance. Extensive experiments and ablation studies demonstrate the effectiveness of the proposed Dynamic Correlation Module, which exploits discriminative prompt features to recover realistic high-resolution images and improves the performance of existing SR networks by a notable margin.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2204
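
The pixel-wise cross-modal attention mask mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cosine-similarity scoring, the sigmoid with a temperature, and the residual re-weighting are all assumptions made for illustration. It also assumes the image features and the prompt embedding (for example, one produced by CLIP) have already been projected to the same channel dimension. The prompt-guided kernel estimation is not shown.

```python
import numpy as np

def cross_modal_attention_mask(feats, prompt, temperature=0.1):
    """Hypothetical helper: score every pixel against one prompt embedding.

    feats:  (H, W, C) image feature map (assumed to come from an SR backbone).
    prompt: (C,) prompt embedding (assumed to be projected to C channels).
    Returns an (H, W) mask with values in (0, 1).
    """
    # L2-normalise so the dot product is a cosine similarity.
    f = feats / (np.linalg.norm(feats, axis=-1, keepdims=True) + 1e-8)
    p = prompt / (np.linalg.norm(prompt) + 1e-8)
    sim = f @ p  # (H, W), values in [-1, 1]
    # Sigmoid with a temperature (an assumed choice) maps similarity to (0, 1).
    return 1.0 / (1.0 + np.exp(-sim / temperature))

def apply_mask(feats, mask):
    """Hypothetical residual re-weighting: emphasise prompt-relevant pixels."""
    return feats * (1.0 + mask[..., None])

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 4, 8))
prompt = feats[0, 0].copy()      # a prompt aligned with pixel (0, 0)
feats[1, 1] = -prompt            # a pixel pointing the opposite way
mask = cross_modal_attention_mask(feats, prompt)
weighted = apply_mask(feats, mask)
```

In this toy setup the pixel aligned with the prompt receives the largest mask value and the opposed pixel receives a value below 0.5, which is the "highlight the regions of interest" behaviour the abstract describes.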