Panorama Generation with Multiple Custom Background Objects

Jing Shen; Guo HaoDong; Shiming Xiang; Chunlei Huo

19 Sept 2025 (modified: 12 Nov 2025), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0

Keywords: multi-view image generation, diffusion model

TL;DR: A framework for generating high-quality panoramas with seamlessly integrated multiple custom background objects.

Abstract: While text-to-image and customized generation methods demonstrate strong capabilities in single-image generation, they fall short in supporting immersive applications that require coherent 360° panoramas. Conversely, existing panorama generation models lack customization capabilities. In panoramic scenes, there are often several reference objects, they tend to appear as minor background elements, and the reference images for different views are only weakly correlated. To address these challenges, we propose the first diffusion-based framework for customized multi-view image generation. Our approach introduces a decoupled feature injection mechanism within a dual-UNet architecture to handle weakly correlated reference images; it integrates spatial information by feeding both the reference images and the noise into the denoising branch. A hybrid attention mechanism enables deep fusion of reference features and multi-view representations. Furthermore, a data augmentation strategy facilitates viewpoint-adaptive pose adjustments, and panoramic coordinates are employed to guide multi-view attention. Experimental results demonstrate our model's effectiveness in generating coherent, high-quality customized multi-view images.

Primary Area: generative models

Submission Number: 18479
