Keywords: vision-language model, video face restoration, video processing, language prompt
Verify Author List: I have double-checked the author list and understand that additions and removals will not be allowed after the submission deadline.
Abstract: Video face restoration aims to recover high-quality face video from low-quality face video, but most existing methods focus on a single, specific degradation scenario, such as denoising or deblurring. Universal video face restoration, by contrast, should restore face video across various degradation scenarios. In this paper, we use a language prompt that describes face information, including gender, appearance, and expression, to guide video face restoration. To enhance applicability, we use ControlNet to remove the language prompt and incorporate human-level knowledge from vision-language models into general networks, which improves video face restoration performance and enables universal video face restoration. In addition, we construct a degradation dataset that contains multiple degradations of the same scene, together with captions that describe the face information. Extensive experiments show that our approach achieves highly competitive performance in universal video face restoration.
A Signed Permission To Publish Form In Pdf: pdf
Primary Area: Applications (bioinformatics, biomedical informatics, climate science, collaborative filtering, computer vision, healthcare, human activity recognition, information retrieval, natural language processing, social networks, etc.)
Paper Checklist Guidelines: I certify that all co-authors of this work have read and commit to adhering to the guidelines in Call for Papers.
Student Author: Yes
Submission Number: 123