Track: regular paper (up to 6 pages)
Keywords: Knowledge conflict, multilingual language models, multimodal language models
Abstract: Language models (LMs) deployed in real-world tasks -- such as medical report synthesis, web navigation, and summarization -- must process diverse inputs and handle conflicting information. Users expect them to detect *in-context knowledge conflicts* -- direct contradictions about objective facts -- and issue alerts. Yet, we find a critical failure: when faced with conflicting evidence across **heterogeneous contexts**, such as multiple languages or modalities, LMs fail to detect conflicts, leaving them vulnerable to attacks and misinformation. While they achieve near-perfect accuracy in homogeneous contexts, accuracy **drops by up to 65%** in heterogeneous settings. We identify *context imbalance* as the root cause: LMs exhibit extreme attention asymmetry across domains, disproportionately prioritizing certain domains in mixed inputs. Current instruction tuning, which trains on separate examples from multiple domains, fails to correct this. Addressing it requires *instance-level diverse data points* that demand reasoning over multiple domains within a single context. We introduce **Heterogeneous Instruction-Tuning (HeteroIT)**, a scalable dataset-mixing procedure that generates instance-level diversity by combining datasets from different domains. Applying HeteroIT to Bactrian-X, a standard multilingual instruction-tuning dataset, improves conflict detection by 37%.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Format: Yes, the presenting author will definitely attend in person because they are attending ICLR for other complementary reasons.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or they have sufficient alternate funding.
Presenter: ~Chen_Henry_Wu1
Submission Number: 25