Keywords: agentic, RLHF, alignment, MLLM, KTO, GRPO, DPO, SFT
Abstract: An estimated 2.2 billion people worldwide live with some form of vision impairment. However, while modern models perform well for the general public, they often struggle to communicate effectively with agents whose visual perception diverges from their own. In this work, we study how agents can be aligned under divergent visual perceptions using reinforcement learning. Because simulating such interactions in the real world is costly, we adopt image reference games as a controlled testbed and design experiments with five distinct perceptual distortions inspired by real human visual impairments (e.g., cataract, color blindness). We evaluate in two settings, offline and online, with four post-training algorithms: SFT and DPO (offline), and KTO and GRPO (online), providing the first systematic study of alignment under divergent visual perceptions. Our results reveal that (i) offline adaptation can provide strong improvements, with DPO consistently outperforming other methods when supported by high-quality preference data; (ii) online methods can provide a robust alternative in the absence of a preference dataset, and among online adaptation methods, GRPO shows more consistent gains; and (iii) qualitative analysis shows that adapted agents align their descriptions toward perceptual features accessible to their conversation partners. We release our code and dataset.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 129