Local Autoregression with Finite-Support Random Variables for Image Generation

09 Sept 2025 (modified: 26 Jan 2026) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: Local Autoregressive Model, Finite-Support Random Variables, Image Generation
TL;DR: We propose the Finite-Support Local Autoregressive (FS-LAR) model, a novel approach to image generation based on finite-support random variables that capture local pixel dependencies.
Abstract: We propose the Finite-Support Local Autoregressive (FS-LAR) model, a novel approach to image generation based on finite-support random variables that capture local pixel dependencies. Our approach adopts a frequentist perspective: instead of imposing priors on the target distribution, we make assumptions in the data processing procedure, which uses an autoencoder. We observe that pixel dependencies are decoupled after reconstruction, even though the reconstruction error is negligible. In reconstructed images, pixel dependencies are determined entirely by the latent representations and the decoder architecture. By designing the decoder architecture, we can therefore control the range of pixel dependencies, which are then modeled by finite-support random variables. The generation process performs global sampling over random variables whose dependencies are controllable, enabling exponentially many reorganizations of the local features in reconstructed images. The proposed approach has several interesting properties. Theoretically, we embrace the empirical distribution, which eliminates the need to prevent overfitting. Because the support of the random variables is finite, it is possible to exhaustively search all possible generated images and thereby certify the model. Because no prior is imposed on the target distribution, that distribution is explicitly known and can be fully characterized. Practically, the generation quality is promising compared to state-of-the-art methods, even though no network is used in the generation process. Moreover, the proposed approach can perform generation from a limited number of images. Finally, the generated images are inherently interpretable, as they are reorganizations of locally independent pixels or patches.
Primary Area: generative models
Submission Number: 3511
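
The abstract's claims about finite support (exponentially many reorganizations of local features, and the possibility of exhaustively enumerating every generated image) can be illustrated with a toy sketch. This is not the FS-LAR model: it assumes, purely for illustration, that an image is a tuple of patch IDs and that patch positions are fully independent, whereas the paper models local dependencies controlled by the decoder. All names here (`fit_supports`, `sample_image`, `enumerate_images`, `count_images`) are hypothetical.

```python
import itertools
import random
from collections import Counter


def fit_supports(images):
    """Empirical distribution per patch position (finite support).

    `images` is a list of equal-length tuples of hashable patch IDs.
    Assumption for this toy: positions are independent of each other.
    """
    n_positions = len(images[0])
    return [Counter(img[p] for img in images) for p in range(n_positions)]


def sample_image(supports, rng):
    """Draw one patch per position from its empirical distribution."""
    return tuple(
        rng.choices(list(c.keys()), weights=list(c.values()), k=1)[0]
        for c in supports
    )


def enumerate_images(supports):
    """Exhaustively list every image the sampler can produce."""
    return list(itertools.product(*(sorted(c) for c in supports)))


def count_images(supports):
    """Number of distinct outputs: product of the support sizes."""
    total = 1
    for c in supports:
        total *= len(c)
    return total


# Three toy "training images", each with four patch positions.
train = [(0, 0, 0, 0), (1, 1, 1, 1), (2, 2, 2, 2)]
supports = fit_supports(train)
all_images = enumerate_images(supports)
generated = sample_image(supports, random.Random(0))

print(count_images(supports))  # 3 values at each of 4 positions: 3**4 = 81
```

Under these assumptions, three training images yield 81 possible outputs, every one of which can be listed and inspected, and each output is by construction a recombination of patches seen in the training set.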