Abstract: Face super-resolution aims to reconstruct a high-resolution face image from a low-resolution input. Previous methods typically employ an encoder-decoder structure to extract facial structural features, in which direct downsampling inevitably introduces distortions, especially in high-frequency features such as edges. To address this issue, we propose a wavelet-based feature enhancement network, which mitigates feature distortion by using the wavelet transform to losslessly decompose the input facial feature into high-frequency and low-frequency components and then processing the two separately. To improve the efficiency of facial feature extraction, we further propose a full-domain Transformer that enhances local, regional, and global low-frequency facial features. These designs allow our method to perform better without stacking many network modules, as previous methods did. Extensive experiments demonstrate that our method effectively balances performance, model size, and inference speed. All code and data will be released upon acceptance.
Primary Subject Area: [Content] Media Interpretation
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: Face super-resolution (FSR) aims to reconstruct a high-resolution face image from a low-resolution face image. This technology can effectively improve the quality of multimedia content involving face images. In contrast to existing encoder-decoder-based FSR methods, our method mitigates the corruption of facial features induced by the downsampling process. Consequently, it can handle the FSR task more efficiently than existing methods.
Supplementary Material: zip
Submission Number: 2504
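Illustrative sketch: The abstract's claim that a wavelet transform decomposes a feature losslessly, whereas direct downsampling discards information, can be checked on a small example. The code below is a minimal sketch and not the authors' implementation: it assumes a one-level 2D Haar wavelet (the abstract does not name the wavelet family), operates on a plain 2D list rather than a network feature map, and uses hypothetical function names (haar_dwt2, haar_idwt2, strided_downsample). Sub-band naming (LH vs. HL) varies between libraries.

```python
def haar_dwt2(x):
    """One-level 2D Haar transform of a 2D list with even height and width.

    Returns four half-resolution sub-bands: one low-frequency (ll) and
    three high-frequency (lh, hl, hh). Haar is an assumption made here.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll = [[0.0] * (w // 2) for _ in range(h // 2)]
    lh = [[0.0] * (w // 2) for _ in range(h // 2)]
    hl = [[0.0] * (w // 2) for _ in range(h // 2)]
    hh = [[0.0] * (w // 2) for _ in range(h // 2)]
    for i in range(h // 2):
        for j in range(w // 2):
            # The 2x2 block: a b on the top row, c d on the bottom row.
            a, b = x[2 * i][2 * j], x[2 * i][2 * j + 1]
            c, d = x[2 * i + 1][2 * j], x[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2  # scaled block sum (low frequency)
            lh[i][j] = (a - b + c - d) / 2  # left column minus right column
            hl[i][j] = (a + b - c - d) / 2  # top row minus bottom row
            hh[i][j] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2, recovering the original 2D list."""
    h2, w2 = len(ll), len(ll[0])
    x = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for i in range(h2):
        for j in range(w2):
            s, p, q, r = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            x[2 * i][2 * j] = (s + p + q + r) / 2
            x[2 * i][2 * j + 1] = (s - p + q - r) / 2
            x[2 * i + 1][2 * j] = (s + p - q - r) / 2
            x[2 * i + 1][2 * j + 1] = (s - p - q + r) / 2
    return x


def strided_downsample(x):
    """Direct downsampling for comparison: keep every second row and column."""
    return [row[::2] for row in x[::2]]
```

Both routes halve the spatial resolution, but the four Haar sub-bands together hold as many values as the input and can be inverted with haar_idwt2, while strided_downsample keeps only a quarter of the values and cannot be inverted. An edge that falls inside a 2x2 block appears as a nonzero entry in a high-frequency sub-band, which is the kind of information the abstract says is processed separately.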