When We Don’t See the Same Picture: Aligning Agents with Divergent Visual Spaces

ICLR 2026 Conference Submission 410 Authors

01 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: Agentic MLLMs, MLLMs, Post-Training, RLHF, Image Reference Game, SFT, Supervised Finetuning, Direct Preference Optimization, Group Relative Policy Optimization, Kahneman-Tversky Optimization, GRPO, DPO, KTO
TL;DR: A study of aligning agents with divergent visual spaces: one agent sees a distorted view of the world, and the other has to adapt to it.
Abstract: With the rapid rise of agentic systems, interaction and collaboration between agents have become a central challenge in multimodal large language model (MLLM) research. In this work, we study a unique form of interaction in which two agents hold \textit{divergent visual spaces}. To investigate this, we adopt image reference games as a testbed and design experiments with five distinct perceptual distortions inspired by real human visual impairments (e.g., cataract, color blindness). We evaluate two settings, offline and online, with four post-training algorithms: SFT and DPO (offline), and KTO and GRPO (online), providing the first systematic study of alignment under divergent visual perceptions. Our results reveal that (i) offline adaptation can provide strong improvements, with DPO consistently outperforming other methods when supported by high-quality preference data; (ii) among online adaptation methods, KTO yields more stable and consistent gains across all distortion types; and (iii) qualitative analysis shows that adapted agents shift their descriptions toward perceptual features accessible to their conversation partners. Taken together, these findings highlight that offline methods are the preferable solution when supervision is available, while online approaches serve as a complementary strategy for dynamic settings where distortions or partner characteristics are unknown in advance. We release code and preference datasets to support future research.
Supplementary Material: pdf
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 410