Keywords: Real-World Image Super-Resolution, Direct Preference Optimization
Abstract: Recent advances in diffusion models have improved Real-World Image Super-Resolution (Real-ISR), but existing methods lack human feedback integration, risking misalignment with human preferences and potentially leading to artifacts, hallucinations, and harmful content generation. To address this, we are the first to introduce human preference alignment into Real-ISR, a technique that has been successfully applied in Large Language Models and Text-to-Image tasks to effectively align generated outputs with human preferences. Specifically, we adopt Direct Preference Optimization (DPO), a general alignment technique that optimizes directly on human preference data. Nevertheless, the pixel-level reconstruction objectives of Real-ISR are difficult to reconcile with the image-level preferences of DPO, which can make DPO overly sensitive to local anomalies and degrade generation quality. To resolve this challenge, we propose Direct Semantic Preference Optimization (DSPO), which aligns instance-level human preferences by incorporating semantic guidance through two strategies: (a) a semantic instance alignment strategy, which performs alignment at the instance level to ensure fine-grained perceptual consistency, and (b) a user description feedback strategy, which mitigates hallucinations by injecting users' semantic textual feedback on instance images as prompt guidance. Our method surpasses both Real-ISR and preference-alignment baselines. As a plug-and-play solution, DSPO performs consistently across one-step and multi-step SR frameworks, demonstrating strong generalizability.
Primary Area: generative models
Submission Number: 6195
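For readers unfamiliar with the mechanism, below is a minimal sketch of the standard DPO objective, combined with a hypothetical instance-masked aggregation to illustrate the instance-level alignment idea described in the abstract. The per-pixel log-likelihood maps, the `instance_masked_logp` helper, and the `mask` tensor are illustrative assumptions for exposition, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO objective:
    -log sigmoid(beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))),
    where *_w is the preferred (winning) sample and *_l the rejected one."""
    margin_w = logp_w - ref_logp_w  # how much the policy upweights the winner vs. the frozen reference
    margin_l = logp_l - ref_logp_l  # same for the loser
    return -F.logsigmoid(beta * (margin_w - margin_l)).mean()

def instance_masked_logp(per_pixel_logp, instance_mask):
    """Hypothetical helper: average per-pixel log-likelihoods inside a semantic
    instance mask, so the preference signal is computed per instance rather than
    over the whole image."""
    mask = instance_mask.float()
    return (per_pixel_logp * mask).sum(dim=(-2, -1)) / mask.sum(dim=(-2, -1)).clamp(min=1.0)

# Toy usage: a batch of per-pixel log-likelihood maps and a binary instance mask.
B, H, W = 4, 64, 64
pp_w, pp_l = torch.randn(B, H, W), torch.randn(B, H, W)  # policy model outputs
rp_w, rp_l = torch.randn(B, H, W), torch.randn(B, H, W)  # frozen reference model outputs
mask = torch.rand(B, H, W) > 0.5                         # illustrative instance mask

loss = dpo_loss(
    instance_masked_logp(pp_w, mask), instance_masked_logp(pp_l, mask),
    instance_masked_logp(rp_w, mask), instance_masked_logp(rp_l, mask),
)
```

Restricting the preference comparison to pixels inside a semantic instance mask keeps a single local anomaly from dominating an image-level preference signal, which is the failure mode the abstract attributes to applying vanilla DPO to pixel-level reconstruction objectives.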