Enhancing Photo Animation: Augmented Stylistic Modules and Prior Knowledge Integration

Zhanyi Lu, Ao Chen, Yue Zhou

Published: 30 Sept 2024, Last Modified: 11 Mar 2025, OpenReview Archive Direct Upload, CC BY 4.0

Abstract: Photo-to-animation translation presents a practical and captivating task within image style transfer. However, existing methods often fall short of achieving satisfactory results in cartoonization. This inadequacy primarily stems from two key factors: the absence of dedicated network architectures tailored for anime-style transfer and the inadequate incorporation of pertinent prior knowledge specific to cartoons. In response to these limitations, this paper introduces a novel deep neural network architecture designed to optimize photo-to-animation translation. Specifically, the proposed framework consists of two pivotal modules: the SCAN module and the Ada-CTSS module, operating at the feature and image levels, respectively, to enhance the desired anime-style effects. We also leverage prior knowledge, encompassing color, texture, and surface aspects, by integrating refined color preservation loss, grayscale style loss, and region smoothness loss. Moreover, to assess the efficacy of our approach, we devise a specialized style evaluation network, circumventing the reliance on conventional evaluation metrics. Through an extensive array of experiments, we demonstrate the superior capabilities of our method in generating high-quality cartoonized images, surpassing the performance of state-of-the-art methods.