Keywords: visual attributes, diffusion, attribute subspace
TL;DR: We propose Attribute Subspace Inference Tasks and develop a training setup that enables generative models to infer shared semantic attributes from just a few example images without labels, and to generate attribute-consistent images.
Abstract: Recent advances in generative vision-language models have demonstrated remarkable capabilities in image synthesis, captioning, and multi-modal reasoning. Among their most intriguing behaviors is in-context learning: the ability to adapt to new tasks from just a few examples. While well studied in language models, this capability remains underexplored in the visual domain. Motivated by this, we explore how generative vision models can infer and apply visual concepts directly from image sets, without relying on text or labels. We frame this as an attribute subspace inference task: given a small set of related images, the model identifies their shared axis of variation and uses it to guide generation from a query image. During training, we use auxiliary groupings to provide weak structural supervision. At inference time, the model receives only unlabeled inputs and must generalize the visual concept from the example images alone. Our approach enables attribute-consistent image generation and opens a new direction for nonverbal concept learning in vision.
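To make the task setup concrete, the sketch below illustrates one possible reading of attribute subspace inference at inference time: encode a few unlabeled context images, estimate their shared direction of variation, and use it to steer a query embedding. All names here (ImageEncoder, infer_attribute_direction, the PCA-based direction estimate, the scalar step size) are hypothetical stand-ins for illustration only, not the paper's actual architecture or mechanism.

```python
# Illustrative sketch only: hypothetical interfaces standing in for whatever
# backbone and conditioning scheme the paper actually uses.
import torch
import torch.nn as nn


class ImageEncoder(nn.Module):
    """Toy encoder mapping images to embeddings (placeholder for a real backbone)."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def infer_attribute_direction(context_embeddings: torch.Tensor) -> torch.Tensor:
    """Estimate the shared axis of variation across context embeddings as the
    top principal component -- one simple instantiation of an 'attribute
    subspace'; the paper's actual inference mechanism may differ."""
    centered = context_embeddings - context_embeddings.mean(dim=0, keepdim=True)
    # torch.pca_lowrank returns (U, S, V); columns of V are principal directions.
    _, _, v = torch.pca_lowrank(centered, q=1)
    return v[:, 0]


if __name__ == "__main__":
    encoder = ImageEncoder()
    context_images = torch.randn(4, 3, 64, 64)  # few unlabeled examples sharing an attribute
    query_image = torch.randn(1, 3, 64, 64)     # query image to guide generation from

    with torch.no_grad():
        ctx_emb = encoder(context_images)
        direction = infer_attribute_direction(ctx_emb)
        query_emb = encoder(query_image)
        # Shift the query embedding along the inferred attribute direction;
        # a conditional generator (e.g. a diffusion decoder) would consume this
        # to produce an attribute-consistent image.
        edited_emb = query_emb + 1.0 * direction
    print(edited_emb.shape)  # torch.Size([1, 128])
```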
Supplementary Material: zip
Primary Area: generative models
Submission Number: 17019