Abstract: Diffusion models have transformed image generation, yet controlling their outputs for diverse applications, including content moderation and creative customization, remains challenging. Existing approaches usually require task-specific training and struggle to generalize across both concrete (e.g., objects) and abstract (e.g., styles) concepts. We propose CASteer (Cross-Attention Steering), a training-free framework for controllable image generation that uses steering vectors to dynamically influence a diffusion model's hidden representations. CASteer computes these vectors offline by averaging activations from concept-specific generated images, then applies them during inference via a dynamic heuristic that activates modifications only when necessary, removing concepts from affected images or adding them to unaffected ones. This approach enables precise control over a wide range of tasks, including removing harmful content, adding desired attributes, replacing objects, and altering styles, all without model retraining. CASteer handles both concrete and abstract concepts, outperforming state-of-the-art techniques across multiple diffusion models while preserving unrelated content and minimizing unintended effects. The code is available at
https://github.com/Atmyre/CASteer.
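To make the abstract's two-stage mechanism concrete (offline steering-vector computation by activation averaging, then conditional application at inference), the following is a minimal, hypothetical Python/PyTorch sketch. It is not the authors' implementation: the function names, the projection-based trigger, and the alpha/tau parameters are all illustrative assumptions.

    # Sketch of steering-vector control, assuming activations are hidden
    # states collected from a diffusion model's cross-attention layers.
    import torch

    def compute_steering_vector(acts_with, acts_without):
        # acts_*: lists of same-shape (tokens, dim) activation tensors from
        # generations with and without the target concept (offline stage).
        # The steering vector is the difference of the per-group means.
        return torch.stack(acts_with).mean(dim=0) - torch.stack(acts_without).mean(dim=0)

    def apply_steering(hidden, steer, alpha=1.0, tau=0.1):
        # hidden: (tokens, dim) activation at inference time.
        # Hypothetical dynamic heuristic: modify only tokens whose projection
        # onto the steering direction exceeds a threshold, i.e. where the
        # concept appears present; leave unaffected activations untouched.
        v = steer / steer.norm()                    # unit steering direction
        proj = (hidden * v).sum(-1, keepdim=True)   # per-token alignment score
        mask = (proj > tau).float()                 # fire only when necessary
        # Subtracting removes the concept; adding (negative alpha) injects it.
        return hidden - alpha * mask * v

    # Usage example with dummy activations (77 tokens, 640-dim states).
    h = torch.randn(77, 640)
    vec = compute_steering_vector([torch.randn(77, 640) for _ in range(4)],
                                  [torch.randn(77, 640) for _ in range(4)])
    h_steered = apply_steering(h, vec, alpha=2.0)

Because both stages operate purely on activations, no retraining or fine-tuning of the diffusion model is needed, which is consistent with the training-free claim above.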