AMPS: Adaptive Modality Preference Steering via Functional Entropy

20 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Modality Preference, Model Steering, Multimodal Large Language Models
Abstract: Multimodal Large Language Models (MLLMs) often exhibit significant modality preference, a tendency to favor one modality over another. Prior work has applied steering methods to adjust the modality preference of MLLMs. However, these conventional approaches apply a uniform steering intensity to all samples. This lack of adaptation is problematic because strong steering can disrupt a model's standard inference capabilities, leading to high error rates, while weak steering may be ineffective. Addressing this limitation requires a sample-wise diagnostic tool that measures MLLMs' susceptibility to steering across different multimodal samples. To reduce the disruption that strong steering causes to MLLMs' inference capabilities, we first introduce a diagnostic metric that quantifies the information contribution ratio of each modality in MLLMs. This metric reveals varying susceptibility to steering across samples. Building on these diagnostic insights, we further propose a steering scaling strategy that applies lower steering intensity to samples highly sensitive to steering, and design a learnable steering module that automatically learns appropriate scaling patterns, enabling context-aware adjustment of modality preference. Experimental results show that our context-aware scaling method outperforms conventional steering strategies in modulating modality preference, achieving effective adjustment while significantly reducing generation errors.
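The core idea in the abstract — diagnose each sample's per-modality information contribution, then scale the steering intensity down for samples highly sensitive to steering — can be sketched as below. This is a minimal illustrative sketch, not the paper's actual formulation: the function names, the attention-mass proxy for modality contribution, the imbalance-based sensitivity score, and the `1/(1 + sensitivity)` scaling rule are all assumptions (the paper instead learns the scaling via a steering module).

```python
def modality_contribution_ratio(attn_text, attn_image):
    """Hypothetical diagnostic: share of total attention mass carried by
    each modality, used as a stand-in for its information contribution."""
    t, v = sum(attn_text), sum(attn_image)
    total = t + v
    return t / total, v / total

def steering_sensitivity(t_ratio, v_ratio):
    """Hypothetical sensitivity score: the further the contribution ratio
    is from a balanced 50/50 split, the more sensitive we treat the sample."""
    return abs(t_ratio - v_ratio)

def adaptive_steering_scale(sensitivity, base_scale=1.0):
    """Assumed monotone-decreasing rule: highly sensitive samples receive
    lower steering intensity (the paper learns this mapping instead)."""
    return base_scale / (1.0 + sensitivity)

# Example: a balanced sample keeps full steering; a skewed one is damped.
t_ratio, v_ratio = modality_contribution_ratio([0.2, 0.3], [0.5])
scale = adaptive_steering_scale(steering_sensitivity(t_ratio, v_ratio))
```

Here a perfectly balanced sample yields sensitivity 0 and thus the full `base_scale`, while a text-dominated sample is steered more gently, matching the abstract's claim that weaker steering on sensitive samples reduces generation errors.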
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 22853