DiverseAR: Boosting Diversity in Bitwise Autoregressive Image Generation

ICLR 2026 Conference Submission 12852 Authors

18 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, License: CC BY 4.0
Keywords: Generative Model, Autoregressive Image Generation, Image Synthesis
Abstract: In this paper, we investigate the underexplored challenge of sample diversity in autoregressive (AR) generative models with bitwise visual tokenizers. We first analyze the factors limiting diversity in bitwise AR models and identify two key issues: 1) the binary classification nature of bitwise modeling, which restricts the prediction space, and 2) the overly sharp logits distribution, which causes sampling collapse and reduces diversity. Building on these insights, we propose DiverseAR, a principled and effective method that enhances image diversity without sacrificing visual quality. Specifically, we introduce an adaptive logits distribution scaling mechanism that dynamically adjusts the sharpness of the binary output distribution across sampling steps, resulting in a smoother prediction distribution and improved diversity. To mitigate the potential fidelity loss caused by distribution smoothing, we further develop an energy-based generation path search algorithm that avoids sampling low-confidence tokens, thereby preserving high visual quality. Extensive experiments show that DiverseAR unlocks greater diversity in bitwise autoregressive image generation.
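The two ideas named in the abstract can be illustrated with a toy sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: the linear temperature schedule (`step_temperature`), the energy defined as negative log-likelihood under the unscaled logits, and the best-of-N selection (`sample_with_path_search`) are hypothetical stand-ins, not the authors' method.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def step_temperature(step, num_steps, t_start=2.0, t_end=1.0):
    """Hypothetical schedule: linearly anneal the temperature over sampling steps.

    A temperature above 1 flattens each per-bit Bernoulli distribution.
    """
    if num_steps <= 1:
        return t_end
    frac = step / (num_steps - 1)
    return t_start + (t_end - t_start) * frac


def sample_bits(logits, temperature, rng):
    """Sample one bit per logit from the temperature-scaled distribution.

    Returns the bits and an assumed 'energy': the negative log-likelihood of
    the sampled bits under the original (unscaled) logits. Lower energy means
    the model was more confident in the sampled bits.
    """
    bits, energy = [], 0.0
    for z in logits:
        p_one_scaled = sigmoid(z / temperature)
        bit = 1 if rng.random() < p_one_scaled else 0
        p_orig = sigmoid(z) if bit == 1 else 1.0 - sigmoid(z)
        energy += -math.log(max(p_orig, 1e-12))
        bits.append(bit)
    return bits, energy


def sample_with_path_search(logits, temperature, num_candidates, rng):
    """Assumed best-of-N search: draw several candidates, keep the lowest energy."""
    candidates = [sample_bits(logits, temperature, rng) for _ in range(num_candidates)]
    return min(candidates, key=lambda c: c[1])


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [3.0, -3.0, 0.5, -0.2]  # made-up per-bit logits for one token
    for step in range(4):
        t = step_temperature(step, 4)
        bits, energy = sample_with_path_search(logits, t, num_candidates=8, rng=rng)
        print(step, round(t, 2), bits, round(energy, 3))
```

In this sketch, scaling widens the set of bit patterns that can be drawn, while the energy filter discards candidates containing bits the unscaled model considered unlikely. The trade-off between the two is governed by the hypothetical `t_start` and `num_candidates` parameters.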
Supplementary Material: zip
Primary Area: generative models
Submission Number: 12852