Keywords: Style Transfer, Continual Learning, Vector Quantization.
Abstract: While existing artistic style transfer methods enable cross-domain image synthesis, they often struggle to balance stylistic realism, inference efficiency, and geometric consistency. To address this limitation, we propose a test-time refinement (TTR) framework that universally enhances stylistic fidelity through a self-supervised VQ-GAN, without requiring any gradient updates to the pre-trained generator. Our primary contribution is a continual learning framework for VQ-GAN that combines Low-Rank Adaptation (LoRA) with incremental codebook expansion. This design enables efficient adaptation to diverse artistic styles while preserving previously learned knowledge, significantly reducing the computational and memory overhead of deploying models across multiple domains. Notably, our approach reduces the number of trainable parameters by up to 94% compared to full-model fine-tuning, offering a highly parameter-efficient solution for test-time refinement. Furthermore, we introduce positional embeddings into the latent embedding space, which strengthens the model's geometry awareness and improves structural coherence in the generated results. We name our approach CLoSeR (Continual Learning in VQ-GAN for Style Refinement) and evaluate it across multiple style transfer benchmarks under a test-time adaptation setting. Experimental results show that CLoSeR improves style fidelity and structural consistency, achieving a relative reduction of up to 44% in Fréchet Inception Distance (FID) and demonstrating significant gains in generation quality. The code will be released.
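To make the abstract's three ingredients concrete, the sketch below illustrates (in PyTorch) what a LoRA adapter on a frozen layer, an incrementally expandable VQ codebook, and positional embeddings added to the latent grid could look like. All class names, signatures, and hyperparameters here are illustrative assumptions for exposition, not the authors' released implementation.

```python
# Minimal sketch (assumed names/shapes) of the three components named in the abstract:
# LoRA on a frozen layer, an expandable VQ codebook, and latent positional embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update (LoRA)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # keep pre-trained weights fixed
            p.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        low_rank = F.linear(F.linear(x, self.lora_a), self.lora_b)
        return self.base(x) + low_rank * self.scale


class ExpandableQuantizer(nn.Module):
    """VQ codebook that can grow when a new artistic style/domain is added."""

    def __init__(self, num_codes: int, dim: int):
        super().__init__()
        self.codebook = nn.Parameter(torch.randn(num_codes, dim) * 0.02)

    @torch.no_grad()
    def expand(self, extra_codes: int):
        # Append newly initialized code vectors for the new domain,
        # keeping the previously learned codes in place.
        new = torch.randn(extra_codes, self.codebook.shape[1],
                          device=self.codebook.device) * 0.02
        self.codebook = nn.Parameter(torch.cat([self.codebook.data, new], dim=0))

    def forward(self, z):                      # z: (B, C, H, W) latent feature map
        b, c, h, w = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, c)
        idx = torch.cdist(flat, self.codebook).argmin(dim=1)   # nearest-code lookup
        z_q = self.codebook[idx].view(b, h, w, c).permute(0, 3, 1, 2)
        return z + (z_q - z).detach(), idx     # straight-through estimator


def add_positional_embedding(z, pos_emb):
    """Add a learned (C, H, W) positional grid to the latent features
    before quantization, for geometry awareness."""
    return z + pos_emb.unsqueeze(0)
```

In a setup like this, only the LoRA matrices, the newly appended code vectors, and the positional grid would be trainable per style, which is consistent with the large reduction in trainable parameters reported in the abstract.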
Primary Area: generative models
Submission Number: 16729