Z-SASLM: Zero-Shot Style-Aligned SLI Blending Latent Manipulation

Published: 20 Dec 2025, Last Modified: 20 Dec 2025 | CVPR 2025 | Readers: Everyone | License: CC BY 4.0
Keywords: Style, Alignment, zero-shot, spherical, interpolation, multi-style, blending, latent, space, manipulation, weighted, DINO VIT-B/8
TL;DR: A zero-shot pipeline leveraging spherical linear interpolation for robust style alignment and multi-reference blending in non-Euclidean latent spaces.
Abstract: We introduce Z-SASLM, a Zero-Shot Style-Aligned SLI (Spherical Linear Interpolation) Blending Latent Manipulation pipeline that overcomes the limitations of current multi-style blending methods. Conventional approaches rely on linear blending, which assumes a flat latent space and leads to suboptimal results when integrating multiple reference styles. In contrast, our framework leverages the non-linear geometry of the latent space by using SLI Blending to combine weighted style representations. By interpolating along the geodesic on the hypersphere, Z-SASLM preserves the intrinsic structure of the latent space, ensuring high-fidelity and coherent blending of diverse styles, all without the need for fine-tuning. We further propose a new metric, Weighted Multi-Style DINO VIT-B/8, designed to quantitatively evaluate the consistency of the blended styles. While our primary focus is on the theoretical and practical advantages of SLI Blending for style manipulation, we also demonstrate its effectiveness in a multi-modal content fusion setting through comprehensive experimental studies. Experimental results show that Z-SASLM achieves enhanced and robust style alignment. The code will be made publicly available upon completion of the review process.
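Illustration: the abstract contrasts linear blending with interpolation along the geodesic of a hypersphere. The sketch below is not the authors' implementation. It shows the standard spherical linear interpolation formula in NumPy, plus one possible way to extend it to several weighted references by applying it sequentially. The function names `slerp` and `blend_styles`, the sequential weighting scheme, and the use of plain vectors as stand-ins for style latents are all assumptions made for illustration.

```python
import numpy as np

def slerp(a, b, t, eps=1e-7):
    """Spherical linear interpolation between vectors a and b at fraction t in [0, 1]."""
    a_unit = a / np.linalg.norm(a)
    b_unit = b / np.linalg.norm(b)
    # Angle between the two directions on the unit hypersphere.
    cos_omega = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    omega = np.arccos(cos_omega)
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Nearly parallel (or antipodal) vectors: fall back to linear interpolation.
        return (1.0 - t) * a + t * b
    return (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / sin_omega

def blend_styles(latents, weights):
    """Hypothetical weighted blend of several style latents via sequential slerp.

    Assumes positive weights; this scheme is an illustration, not the paper's method.
    """
    weights = np.asarray(weights, dtype=float)
    blended = latents[0]
    total = weights[0]
    for latent, w in zip(latents[1:], weights[1:]):
        total += w
        # Move toward the new latent by its share of the accumulated weight.
        blended = slerp(blended, latent, w / total)
    return blended

if __name__ == "__main__":
    e1 = np.array([1.0, 0.0, 0.0])
    e2 = np.array([0.0, 1.0, 0.0])
    e3 = np.array([0.0, 0.0, 1.0])
    linear_mid = 0.5 * e1 + 0.5 * e2
    spherical_mid = slerp(e1, e2, 0.5)
    # The linear midpoint leaves the unit sphere; the spherical midpoint stays on it.
    print(np.linalg.norm(linear_mid), np.linalg.norm(spherical_mid))
    print(blend_styles([e1, e2, e3], [1.0, 1.0, 1.0]))
```

For unit-norm inputs, the linear midpoint of two orthogonal vectors has a norm of about 0.707, while the spherical midpoint keeps norm 1. This is the geometric property the abstract refers to when it says geodesic interpolation preserves the structure of the latent space.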
Submission Number: 17