Low Rank Weight Bases for Visual Analogies

17 Sept 2025 (modified: 14 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Visual analogy, image analogy, visual relations, image manipulation, low-rank adaptation, LoRA, flow-based generative models, diffusion models, style transfer, image editing
TL;DR: We propose a novel modular framework that learns to dynamically mix low-rank adapters (LoRAs) to improve visual analogy learning, enabling flexible and generalizable image edits based on example transformations.
Abstract: Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations that are difficult to articulate in words. Given a triplet $\{\mathbf{a}, \mathbf{a}', \mathbf{b}\}$, the goal is to generate $\mathbf{b}'$ such that $\mathbf{a} : \mathbf{a}' :: \mathbf{b} : \mathbf{b}'$. Recent methods adapt text-to-image models to the analogy task using a single Low-Rank Adaptation (LoRA) module, but they face a fundamental limitation: attempting to capture the diverse space of visual transformations within a fixed adaptation module constrains generalization. Inspired by recent work showing that LoRAs in constrained domains span meaningful semantic spaces that can be interpolated, we propose LoRBa, a novel approach that specializes the model to each analogy task at inference time through dynamic composition of learned transformation primitives (informally, choosing a point in a "*space of LoRAs*"). We introduce two key components: (1) a learnable basis of LoRA modules that spans the space of different types of visual transformations, and (2) a lightweight encoder that dynamically selects and weighs these basis LoRAs based on the specific analogy pair. Through comprehensive evaluations, we demonstrate that our approach achieves state-of-the-art performance and significantly improves generalization to unseen visual transformations. Our findings suggest that LoRA basis decompositions are a promising direction for flexible visual manipulation tasks.
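The abstract's core mechanism (a learned basis of LoRA modules mixed by per-task weights into a single low-rank weight update) can be sketched in a few lines. This is an illustrative assumption of what such a composition might look like, not the paper's implementation: the shapes, the `compose` helper, and the mixing weights `alpha` (which in the paper would come from the lightweight encoder applied to the analogy pair) are all hypothetical.

```python
import numpy as np

# Hypothetical sketch of basis-LoRA composition: a frozen base weight W,
# K basis LoRA factor pairs (A_k, B_k) of rank r, and task-specific mixing
# weights alpha combined as W_eff = W + sum_k alpha_k * (B_k @ A_k).
# In the paper, alpha would be produced by an encoder from the (a, a') pair;
# here it is just a fixed example vector.

rng = np.random.default_rng(0)
d_out, d_in, rank, K = 8, 8, 2, 4

W = rng.standard_normal((d_out, d_in))    # frozen base weight
A = rng.standard_normal((K, rank, d_in))  # basis LoRA down-projections
B = rng.standard_normal((K, d_out, rank)) # basis LoRA up-projections

def compose(alpha):
    """Mix the K basis LoRAs with weights alpha into one effective weight."""
    delta = sum(a * (B[k] @ A[k]) for k, a in enumerate(alpha))
    return W + delta

alpha = np.array([0.7, 0.1, 0.0, 0.2])    # example per-task mixing weights
W_eff = compose(alpha)
```

One property worth noting: each basis update `B[k] @ A[k]` has rank at most `rank`, so the mixed update stays low rank (at most `K * rank`), which is what makes composing several basis LoRAs cheap relative to full fine-tuning.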
Primary Area: generative models
Submission Number: 8807