Generation by Search: Scaling Test-Time Compute for Autoregressive Image Generation

Zhitong Gao; Parham Rezaei; Ali Cy; Nataša Jovanović; Mingqiao Ye; Jesse Allardice; Afshin Dehghan; Roman Bachmann; Oğuzhan Fatih Kar; Amir Zamir

18 Sept 2025 (modified: 14 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0

Keywords: image generation, test-time scaling, search algorithm, autoregressive model

TL;DR: We reframe image generation as a search problem, where searching over tokens enables direct generation, improves controllability, augments autoregressive models, and offers practical guidance for test-time scaling.

Abstract: Image generation has made significant progress in recent years, but it still struggles to follow challenging prompts and scales poorly with inference-time computation. In this paper, we propose framing autoregressive image generation as a search problem, where the objective is to identify token sequences that maximize a chosen utility function at test time. This framework enables fine-grained control over the generation process through flexible choices of utility functions and yields better scaling behavior as more test-time compute is used. Moreover, it is fully compatible with existing autoregressive generative models, which can be viewed as providing token-level priors during the search. To systematically investigate this framework, we organize the design space into four key axes and conduct studies across the choices of token structure (2D grid, 2D multi-scale, and 1D ordered), search algorithm (best-of-N, beam search, and lookahead search), verifier (optimizing for image-text alignment, image-image alignment, and quality), and prior model (conditional and unconditional autoregressive models, and prior-free). Together, the findings from these studies establish search as a performant, controllable, and scalable approach to advancing image generation and provide practical guidance for future work in this direction.
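
To make the framing in the abstract concrete, the following is a minimal toy sketch of two of the named search algorithms (best-of-N and beam search) over a token sequence. It is not the authors' implementation. The vocabulary size, sequence length, `toy_prior`, and `toy_utility` are hypothetical stand-ins for an autoregressive model and a verifier. The beam search here also assumes the utility can score partial sequences, which the abstract does not state.

```python
import random
from typing import Callable, List, Tuple

Tokens = Tuple[int, ...]

VOCAB_SIZE = 4  # hypothetical toy vocabulary size
SEQ_LEN = 6     # hypothetical toy sequence length


def toy_prior(prefix: Tokens) -> List[float]:
    """Hypothetical stand-in for an autoregressive model's next-token distribution."""
    if not prefix:
        return [1.0 / VOCAB_SIZE] * VOCAB_SIZE
    # Mildly prefer repeating the previous token.
    weights = [2.0 if t == prefix[-1] else 1.0 for t in range(VOCAB_SIZE)]
    total = sum(weights)
    return [w / total for w in weights]


def toy_utility(tokens: Tokens) -> float:
    """Hypothetical verifier: counts how many tokens equal the target token 3."""
    return float(sum(1 for t in tokens if t == 3))


def best_of_n(prior: Callable[[Tokens], List[float]],
              utility: Callable[[Tokens], float],
              n: int, rng: random.Random) -> Tokens:
    """Sample n full sequences from the prior and keep the highest-utility one."""
    best: Tokens = ()
    best_score = float("-inf")
    for _ in range(n):
        seq: Tokens = ()
        while len(seq) < SEQ_LEN:
            tok = rng.choices(range(VOCAB_SIZE), weights=prior(seq), k=1)[0]
            seq = seq + (tok,)
        score = utility(seq)
        if score > best_score:
            best, best_score = seq, score
    return best


def beam_search(prior: Callable[[Tokens], List[float]],
                utility: Callable[[Tokens], float],
                beam_width: int, expand_k: int) -> Tokens:
    """Extend each beam with the prior's top expand_k tokens, keep the best beam_width prefixes."""
    beams: List[Tokens] = [()]
    for _ in range(SEQ_LEN):
        candidates: List[Tokens] = []
        for prefix in beams:
            probs = prior(prefix)
            top = sorted(range(VOCAB_SIZE), key=lambda t: probs[t], reverse=True)[:expand_k]
            candidates.extend(prefix + (tok,) for tok in top)
        # Assumption: the utility is applied to partial sequences to rank prefixes.
        candidates.sort(key=lambda s: (utility(s), s), reverse=True)
        beams = candidates[:beam_width]
    return beams[0]
```

In this sketch, increasing `n` in `best_of_n` or `beam_width` and `expand_k` in `beam_search` spends more test-time compute, and swapping `toy_utility` for a different scoring function changes what the search optimizes for without touching the prior.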

Primary Area: generative models

Submission Number: 10076
