Single-pass Adaptive Image Tokenization for Minimum Program Search

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: Adaptive Tokenization; Representation Learning; Intelligent Compression
TL;DR: One-Shot Adaptive Visual Tokenizer
Abstract: According to Algorithmic Information Theory (AIT), intelligent representations compress data into the shortest possible program while remaining predictive of its content—exhibiting low Kolmogorov Complexity (KC). In contrast, most visual representation learning systems assign fixed-length representations to all inputs, ignoring variations in complexity or familiarity. Recent adaptive tokenization methods address this by allocating variable-length representations but typically require test-time search over multiple hypotheses to identify the most predictive one. Inspired by KC principles, we propose a one-shot adaptive tokenizer, KARL, that predicts the appropriate number of tokens for an image in a single forward pass, halting once its approximate KC is reached. The token count serves as a proxy for the minimum description length. KARL performs comparably to recent adaptive tokenizers while operating in a one-pass manner. Additionally, we present a conceptual study showing a correlation between adaptive tokenization and core ideas from AIT. We demonstrate that adaptive tokenization not only aligns with KC but also reveals empirical signals approximating AIT concepts such as sophistication and logical depth. Finally, we analyze predicted image complexity and interestingness across axes such as structure vs. noise and in-distribution vs. out-of-distribution familiarity, highlighting alignment with human annotations.
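The one-shot mechanism described in the abstract can be illustrated with a toy sketch: a small halting head maps pooled image features to a token budget in a single forward pass, and the tokenizer keeps only that many tokens. This is a hypothetical, simplified illustration of the idea, not the paper's actual KARL architecture; the names (`predict_token_count`, `w_halt`) and the linear-plus-sigmoid halting head are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for a toy "halting head": maps pooled image
# features to a score in (0, 1), which we scale to a token budget.
FEAT_DIM, MAX_TOKENS = 16, 256
w_halt = rng.normal(size=FEAT_DIM)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_token_count(features, max_tokens=MAX_TOKENS):
    """One forward pass: pooled patch features -> predicted token budget.

    The predicted count serves as a proxy for the image's minimum
    description length; simpler images would receive fewer tokens.
    """
    pooled = features.mean(axis=0)       # global average pool over patches
    frac = sigmoid(pooled @ w_halt)      # halting score in (0, 1)
    return max(1, int(round(frac * max_tokens)))

def tokenize(features, token_bank):
    """Keep only the first k tokens, where k is predicted in a single pass
    (no test-time search over multiple token-count hypotheses)."""
    k = predict_token_count(features)
    return token_bank[:k]

# Toy usage: 64 patch features of dim 16, a bank of 256 candidate tokens.
feats = rng.normal(size=(64, FEAT_DIM))
tokens = tokenize(feats, np.arange(MAX_TOKENS))
print(len(tokens))  # predicted token count, between 1 and 256
```

The key contrast with search-based adaptive tokenizers is that the budget here is emitted directly by one forward computation rather than chosen by evaluating several candidate lengths at test time.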
Supplementary Material: zip
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 8417