Directional Textual Inversion for Personalized Text-to-Image Generation

Published: 01 Mar 2026, Last Modified: 05 Apr 2026 | TTU at ICLR 2026 (Main) | CC BY 4.0
Abstract: Textual Inversion (TI) is efficient for text-to-image personalization but often fails on complex prompts. We identify embedding norm inflation as a key cause and show that token semantics are primarily encoded by embedding direction. We propose Directional Textual Inversion (DTI), which fixes the embedding magnitude to an in-distribution scale and optimizes only direction with a simple MAP objective using a von Mises-Fisher prior. DTI improves prompt fidelity over existing embedding optimization baselines while maintaining competitive subject similarity. Furthermore, we demonstrate its ease of integration and creative applications.
Submission Number: 55
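
The idea summarized in the abstract (hold the embedding norm fixed, optimize only its direction, and regularize that direction with a von Mises-Fisher prior) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `scale`, `mu`, and `kappa` parameters, and the `data_loss` callable (a stand-in for whatever reconstruction loss the personalization method uses) are all assumptions introduced here for illustration. The only fact relied on is that the vMF density on the unit sphere is proportional to exp(kappa * mu . v), so its negative log is -kappa * (mu . v) up to a constant.

```python
import math


def normalize(v):
    """Return v rescaled to unit Euclidean length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def directional_embedding(v, scale):
    """Build an embedding whose norm is fixed to `scale` (assumed to be an
    in-distribution magnitude) and whose direction comes from v."""
    return [scale * x for x in normalize(v)]


def vmf_neg_log_prior(v, mu, kappa):
    """Negative vMF log-density of the direction of v, dropping the
    normalizing constant: -kappa * (mu . v) on unit vectors."""
    u = normalize(v)
    m = normalize(mu)
    return -kappa * sum(a * b for a, b in zip(u, m))


def map_objective(v, mu, kappa, scale, data_loss):
    """Hypothetical MAP objective: a data term evaluated on the fixed-norm
    embedding plus the vMF prior term on its direction."""
    embedding = directional_embedding(v, scale)
    return data_loss(embedding) + vmf_neg_log_prior(v, mu, kappa)
```

In this sketch only the direction of `v` affects the objective, so rescaling `v` leaves both the embedding and the prior term unchanged; an optimizer would update `v` and re-project it onto the unit sphere after each step.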