Beautifying Diffusion Models: Learning Context-Aware Filters for Robust Dense Prediction on Test-Time Corrupted Images

16 Sept 2024 (modified: 13 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Diffusion, Test-time adaptation, Dense Prediction, Frequency-Aware Modeling
TL;DR: Conditioning diffusion models on learned, spatially adaptive frequency filters for robust test-time dense prediction
Abstract: Diffusion models have enabled input-level domain adaptation to unseen test-time corruptions for image classification. However, while dense prediction tasks share similar robustness issues with image-level classification, previous input adaptation work may fail to preserve the semantic information necessary for robust pixel-level prediction. To address this issue, we propose a novel diffusion-driven strategy that translates corrupted inputs back to the source domain (i.e., the training data domain) while preserving the semantic information (i.e., high-frequency shape information and low-frequency color information). We first study how frequency filtering can guide the diffusion generation process and analyze the influence of different filters. Our experiments show that utilizing both high and low spatial-frequency information during diffusion-driven denoising substantially improves the adaptation performance of dense prediction. This observation motivates a novel framework, predictive frequency filtering-driven diffusion (FDD) adaptation, in which we predict the filters from the corrupted test-time inputs and use them to guide the diffusion process. We design a Y-shaped frequency prediction network to predict context-aware low-pass and high-pass filters. To train this network, we propose a novel data augmentation method, FrequencyMix, which generates pairs of clean and corrupted images. We validate our method through extensive experiments on two semantic segmentation datasets and two depth estimation datasets. Across a broad range of common corruptions, our method is competitive with state-of-the-art approaches.
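The abstract describes guiding diffusion denoising with low-pass and high-pass spatial-frequency filters. As a rough illustration of the frequency-domain filtering such guidance relies on (the paper's filters are learned and spatially adaptive; the ideal radial filter and cutoff below are simplifying assumptions, not the authors' implementation), here is a minimal sketch:

```python
import numpy as np

def frequency_filter(image, cutoff=0.2, mode="low"):
    """Apply an ideal low-pass or high-pass filter in the 2-D Fourier domain.

    image: 2-D grayscale array; cutoff: fraction of the normalized
    frequency radius. An illustrative stand-in for the learned,
    context-aware filters described in the abstract.
    """
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # Radial distance of each frequency bin from the spectrum centre,
    # normalized so the image edge midpoint sits at radius 1.
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.sqrt(((yy - h / 2) / (h / 2)) ** 2
                     + ((xx - w / 2) / (w / 2)) ** 2)
    mask = radius <= cutoff if mode == "low" else radius > cutoff
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

# The low-pass output keeps coarse color/intensity structure; the
# high-pass output keeps edges and shape detail -- the two semantic
# cues the abstract says must survive diffusion-driven denoising.
img = np.random.rand(64, 64)
low = frequency_filter(img, mode="low")
high = frequency_filter(img, mode="high")
# Complementary masks at the same cutoff partition the spectrum,
# so the two bands sum back to the original image.
assert np.allclose(low + high, img)
```

Because the two masks are complementary, the low- and high-frequency bands together carry all of the image content; the design question the paper studies is how much of each band to feed back as guidance during denoising.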
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1020