Keywords: Generative Artificial Intelligence, Image Generation, Panorama Generation
TL;DR: A diffusion-transformer approach that produces high-quality panorama images by generating a grid of tangent-plane views and stitching them into a panorama.
Abstract: Generating 360° panoramas from text is challenging due to the inherent difficulty of mapping a 2D diffusion process to a spherical representation without introducing visual artifacts, inconsistencies, or a lack of global coherence. We present TanDiT, a tangent-plane diffusion transformer that factorizes the sphere into locally planar patches, providing a geometry-aligned representation where a pretrained DiT backbone operates without architectural changes. A lightweight ERP-conditioned refinement stage harmonizes overlaps and improves global coherence. To better evaluate panorama quality, we introduce TangentFID and TangentIS, distortion-aware metrics that capture pole and seam degradations, and align closely with human preference. Experiments across multiple benchmarks show that TanDiT outperforms prior work in both perceptual quality and distortion-sensitive fidelity, while scaling efficiently to 4K resolution. Ablations confirm that the main gains arise from the representational choice, establishing TanDiT as a simple and principled framework for high-fidelity panorama generation.
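The following is a minimal sketch (not the authors' implementation) of the geometric idea described in the abstract: tangent-plane images placed at a grid of tangent points are resampled back into an equirectangular (ERP) panorama via the gnomonic projection. The tangent-point layout, field of view, sampling, and border-weighted blending are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def gnomonic_forward(lat, lon, lat0, lon0):
    """Project sphere points (lat, lon) onto the plane tangent at (lat0, lon0)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        cos_c = np.sin(lat0) * np.sin(lat) + np.cos(lat0) * np.cos(lat) * np.cos(lon - lon0)
        x = np.cos(lat) * np.sin(lon - lon0) / cos_c
        y = (np.cos(lat0) * np.sin(lat) - np.sin(lat0) * np.cos(lat) * np.cos(lon - lon0)) / cos_c
    return x, y, cos_c  # cos_c <= 0 means the point lies on the back hemisphere

def stitch_tangent_planes(tangent_images, tangent_points, fov_deg, erp_hw):
    """Resample square tangent-plane images into one ERP panorama.

    tangent_images: list of (S, S, 3) float arrays
    tangent_points: list of (lat0, lon0) in radians, one per image
    fov_deg:        per-plane field of view (assumed square)
    erp_hw:         (H, W) of the output equirectangular image
    """
    H, W = erp_hw
    # Latitude/longitude of every ERP pixel center.
    lat = np.linspace(np.pi / 2, -np.pi / 2, H)[:, None] * np.ones((1, W))
    lon = np.ones((H, 1)) * np.linspace(-np.pi, np.pi, W)[None, :]

    half_extent = np.tan(np.radians(fov_deg) / 2)  # plane half-size at unit focal length
    acc = np.zeros((H, W, 3))
    weight = np.zeros((H, W, 1))

    for img, (lat0, lon0) in zip(tangent_images, tangent_points):
        S = img.shape[0]
        x, y, cos_c = gnomonic_forward(lat, lon, lat0, lon0)
        inside = (cos_c > 0) & (np.abs(x) <= half_extent) & (np.abs(y) <= half_extent)
        x = np.where(inside, x, 0.0)
        y = np.where(inside, y, 0.0)

        # Nearest-neighbour sampling of the tangent image (bilinear would be smoother).
        u = np.clip(((x / half_extent + 1) / 2 * (S - 1)).round().astype(int), 0, S - 1)
        v = np.clip(((1 - y / half_extent) / 2 * (S - 1)).round().astype(int), 0, S - 1)

        # Downweight pixels near the plane border so overlapping planes blend smoothly.
        w = inside * (1 - np.maximum(np.abs(x), np.abs(y)) / half_extent)
        acc += w[..., None] * img[v, u]
        weight += w[..., None]

    return acc / np.clip(weight, 1e-8, None)
```

With a grid of tangent points that covers the sphere (for instance, an icosahedral layout), calling stitch_tangent_planes on the per-plane outputs yields an ERP image of the kind the abstract's ERP-conditioned refinement stage would then harmonize.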
Supplementary Material: zip
Primary Area: generative models
Submission Number: 8056