CASteer: Steering Diffusion Models for Controllable Generation

Published: 17 Mar 2025 · Last Modified: 09 Apr 2025 · arXiv · CC BY 4.0
Abstract: Diffusion models have transformed image generation, yet controlling their outputs for diverse applications, including content moderation and creative customization, remains challenging. Existing approaches usually require task-specific training and struggle to generalize across both concrete (e.g., objects) and abstract (e.g., styles) concepts. We propose CASteer (Cross-Attention Steering), a training-free framework for controllable image generation that uses steering vectors to dynamically influence a diffusion model's hidden representations. CASteer computes these vectors offline by averaging activations from concept-specific generated images, then applies them during inference via a dynamic heuristic that activates modifications only when necessary, removing concepts from affected images or adding them to unaffected ones. This approach enables precise control over a wide range of tasks, including removing harmful content, adding desired attributes, replacing objects, or altering styles, all without model retraining. CASteer handles both concrete and abstract concepts, outperforming state-of-the-art techniques across multiple diffusion models while preserving unrelated content and minimizing unintended effects. The code is available at https://github.com/Atmyre/CASteer.
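The core recipe the abstract describes, averaging activations from concept-specific generations to obtain a steering vector, then conditionally adding or subtracting it at inference, can be sketched as follows. This is a simplified illustration, not the paper's implementation: the function names, the thresholded-projection heuristic, and the use of plain arrays in place of cross-attention activations inside a diffusion model are all assumptions made for clarity.

```python
import numpy as np

def compute_steering_vector(acts_with_concept, acts_without_concept):
    """Offline step (simplified sketch): average activations collected from
    images generated with and without the target concept; their difference
    approximates a direction encoding the concept."""
    return acts_with_concept.mean(axis=0) - acts_without_concept.mean(axis=0)

def steer(hidden, steering_vec, strength=1.0, threshold=0.0):
    """Inference-time step (simplified sketch of a 'dynamic heuristic'):
    modify only activations whose projection onto the steering direction
    exceeds a threshold, i.e. where the concept appears present.
    Subtracting removes the concept; a negative strength would inject it."""
    v_hat = steering_vec / np.linalg.norm(steering_vec)
    proj = hidden @ v_hat                 # per-activation concept score
    mask = proj > threshold               # steer only "affected" activations
    out = hidden.copy()
    out[mask] -= strength * proj[mask, None] * v_hat
    return out

# Toy usage with synthetic activations (hypothetical data for illustration):
rng = np.random.default_rng(0)
acts_with = rng.normal(1.0, 0.1, size=(8, 4))     # concept present
acts_without = rng.normal(0.0, 0.1, size=(8, 4))  # concept absent
v = compute_steering_vector(acts_with, acts_without)
steered = steer(rng.normal(1.0, 0.1, size=(5, 4)), v)
```

With `strength=1.0` the projection onto the concept direction is fully removed from each selected activation; in the real method the vectors are computed from cross-attention activations and applied inside the denoising loop.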