Keywords: image generation, identity preservation, controllable generation
TL;DR: SIGMA-Gen is a method for controllable, identity-preserving text-to-image generation with multiple subjects in one diffusion loop.
Abstract: We present SIGMA-Gen, a unified framework for multi-identity preserving image generation. SIGMA-Gen is the first approach to enable single-pass, multi-subject, identity-preserved generation guided by both structural and spatial constraints. A key strength of our method is that a single model supports user guidance at various levels of precision, from coarse 2D or 3D boxes to pixel-level segmentations and depth. To enable this, we introduce SIGMA-Set27K, a novel synthetic dataset that provides identity, structure, and spatial information for over 100k unique subjects across 27k images. Through extensive evaluation, we demonstrate that SIGMA-Gen achieves state-of-the-art performance in identity preservation, image generation quality, and speed.
Supplementary Material: pdf
Primary Area: generative models
Submission Number: 4664