Keywords: Diffusion Models, Generative Models, Personalized Text-to-Image
TL;DR: LayerComposer enables Photoshop-like control for multi-subject text-to-image generation, allowing users to compose scenes by placing, resizing, and locking elements in a layered canvas with high fidelity.
Abstract: Despite their impressive visual fidelity, existing personalized generative models lack interactive control over spatial composition and scale poorly to multiple subjects. To address these limitations, we present \textit{LayerComposer}, an interactive framework for personalized, multi-subject text-to-image generation. Our approach introduces two main contributions: (1) a \textit{layered canvas}, a novel representation in which each subject is placed on a distinct layer, enabling occlusion-free composition; and (2) a \textit{locking mechanism} that preserves selected layers with high fidelity while allowing the remaining layers to adapt flexibly to the surrounding context. As in professional image-editing software, the layered canvas allows users to \textit{place}, \textit{resize}, or \textit{lock} input subjects through intuitive layer manipulation. The locking mechanism is versatile and requires no architectural changes, relying instead on the model's inherent positional embeddings combined with a complementary data-sampling strategy.
Extensive experiments demonstrate that \textit{LayerComposer} achieves superior spatial control and identity preservation compared to state-of-the-art methods in human-centric personalized image generation.
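To make the abstract's interaction model concrete, below is a minimal sketch of what a layered-canvas interface with place/resize/lock operations might look like. All names here (`Layer`, `LayeredCanvas`, `place`, `resize`, `lock`) are illustrative assumptions, not the paper's actual API; the sketch only models the user-facing canvas state, not the generative model or positional-embedding mechanism.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Layer:
    """One subject on the canvas: an identifier plus placement and lock state.
    (Hypothetical structure; the paper does not specify its internal representation.)"""
    subject_id: str
    bbox: Tuple[int, int, int, int]  # (x, y, width, height) in canvas pixels
    locked: bool = False             # locked layers are to be preserved with high fidelity

@dataclass
class LayeredCanvas:
    """Ordered stack of subject layers; later layers occlude earlier ones."""
    width: int
    height: int
    layers: List[Layer] = field(default_factory=list)

    def place(self, subject_id: str, bbox: Tuple[int, int, int, int]) -> Layer:
        """Add a subject on its own layer at the given position and size."""
        layer = Layer(subject_id, bbox)
        self.layers.append(layer)
        return layer

    def resize(self, layer: Layer, new_size: Tuple[int, int]) -> None:
        """Change a layer's size while keeping its top-left anchor fixed."""
        x, y, _, _ = layer.bbox
        layer.bbox = (x, y, *new_size)

    def lock(self, layer: Layer) -> None:
        """Mark a layer as locked so downstream generation preserves it."""
        layer.locked = True

# Usage: compose a two-subject scene, lock one subject, resize the other.
canvas = LayeredCanvas(1024, 1024)
person = canvas.place("person", (100, 200, 300, 600))
pet = canvas.place("pet", (500, 500, 250, 250))
canvas.lock(person)            # this layer should be kept with high fidelity
canvas.resize(pet, (300, 300))  # unlocked layers remain free to adapt to context
```

Keeping each subject on its own layer, rather than flattening them into a single conditioning image, is what allows occlusion-free composition in this scheme: overlap is resolved by stacking order rather than by destructive pixel merging.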
Supplementary Material: zip
Primary Area: generative models
Submission Number: 3808