Abstract: In this paper, we propose FreqFormer, a frequency-enhanced framework for face super-resolution (FSR) that combines spectral decomposition with dynamic feature modulation to address Transformers' inherent low-frequency bias. Unlike existing methods, FreqFormer leverages Empirical Mode Decomposition (EMD) to decompose high-resolution (HR) images into hierarchical Intrinsic Mode Functions (IMFs), providing explicit frequency anchors for progressive high-frequency recovery. Simultaneously, a lightweight prompt module dynamically injects degradation-aware textures into Transformer blocks through learnable feature interaction, compensating for real-world distortions. This work establishes a novel framework that bridges spectral fidelity and adaptive learning, advancing FSR toward high-frequency-accurate restoration. Experiments on the CelebA and Helen datasets demonstrate FreqFormer's superiority: it achieves 22.42 dB PSNR on 16$\times$ SR (8$\times$8 $\rightarrow$ 128$\times$128) with higher high-frequency energy preservation than state-of-the-art methods. The framework's parameter efficiency and real-time performance enable practical deployment in security and biometric systems.
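The EMD step described above can be illustrated with a minimal 1-D sketch. This is not the paper's implementation (FreqFormer applies EMD to HR face images); the function names are illustrative, and this naive version uses linear-interpolated extrema envelopes rather than the cubic-spline envelopes of standard EMD. By construction, the extracted IMFs plus the residue sum back to the input, with earlier IMFs capturing faster oscillations:

```python
import numpy as np

def sift(x, n_iter=10):
    """Naive sifting: repeatedly subtract the mean of the upper/lower
    extrema envelopes (linear interpolation) to extract one IMF."""
    h = x.copy()
    t = np.arange(len(x))
    for _ in range(n_iter):
        # interior local maxima / minima
        mx = np.where((h[1:-1] > h[:-2]) & (h[1:-1] > h[2:]))[0] + 1
        mn = np.where((h[1:-1] < h[:-2]) & (h[1:-1] < h[2:]))[0] + 1
        if len(mx) < 2 or len(mn) < 2:
            break  # too few extrema: treat as a residue/trend
        upper = np.interp(t, mx, h[mx])
        lower = np.interp(t, mn, h[mn])
        h = h - (upper + lower) / 2.0
    return h

def emd(x, n_imfs=3):
    """Decompose x into n_imfs IMFs plus a residue (they sum to x)."""
    imfs, residue = [], x.copy()
    for _ in range(n_imfs):
        imf = sift(residue)
        imfs.append(imf)
        residue = residue - imf
    return imfs, residue

# toy signal: a fast oscillation (high-frequency detail) on a slow trend
t = np.linspace(0.0, 1.0, 512)
x = np.sin(2 * np.pi * 40 * t) + 0.5 * np.sin(2 * np.pi * 3 * t)
imfs, residue = emd(x)
```

In the paper's setting, the early IMFs would serve as explicit high-frequency anchors that supervise progressive detail recovery, while the residue carries the low-frequency content Transformers already model well.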
DOI: 10.1109/LSP.2025.3567809