Bilinear relational structure fixes reversal curse and enables consistent model editing

ICLR 2026 Conference Submission 21044 Authors

Published: 26 Jan 2026, Last Modified: 26 Jan 2026 · ICLR 2026 · CC BY 4.0
Keywords: model editing, reversal curse, language model, relational knowledge, knowledge editing
TL;DR: Language models can learn to encode relational knowledge in a bilinear relational structure, a mechanism that can directly mitigate the reversal curse and enable consistent model editing.
Abstract: The reversal curse, a language model's (LM) inability to infer an unseen fact ``B is A'' from a learned fact ``A is B'', is widely considered a fundamental limitation. We show that it is not an inherent failure but an artifact of how models encode knowledge. By training LMs from scratch on a synthetic dataset of relational knowledge graphs, we demonstrate that a bilinear relational structure emerges in their hidden representations. This structure is associated with alleviating the reversal curse, enabling the inference of unseen reverse facts. Crucially, we also find that this bilinear structure plays a key role in consistent model editing. When a fact is updated in an LM with this structure, the edit correctly propagates to its reverse and to other logically dependent facts. In contrast, models lacking this representation not only suffer from the reversal curse but also fail to generalize edits, introducing further logical inconsistencies. Our results establish that training on a relational knowledge dataset induces the emergence of bilinear internal representations, which in turn allow LMs to behave in a logically consistent manner after editing. This implies that the success of model editing may be tied not only to editing algorithms but also to the underlying representational geometry of the knowledge being modified.
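To make the intuition concrete, the sketch below is our own minimal illustration (not the paper's parameterization or training setup): if a relation is realized as an invertible linear map on entity representations, then the reverse fact is available for free through the inverse map, and an edit to the forward fact automatically moves the answer to the reverse query, keeping the two logically consistent. All names (`e_A`, `W_r`, the orthogonal construction) are hypothetical choices for the example.

```python
# Illustrative sketch of a bilinear/linear relational structure, assuming a
# relation r is encoded as an invertible matrix W_r acting on entity embeddings.
import numpy as np

rng = np.random.default_rng(0)
d = 16

# Hypothetical entity embedding and relation matrix (orthogonal => invertible,
# so the inverse is simply the transpose).
e_A = rng.normal(size=d)
W_r = np.linalg.qr(rng.normal(size=(d, d)))[0]

# Forward fact "A r B": the relation maps A's embedding to B's embedding.
e_B = W_r @ e_A

# Reverse fact "B r^{-1} A" needs no extra training data: apply the inverse map.
print(np.allclose(W_r.T @ e_B, e_A))  # True

# Editing: replacing the forward target (a new e_B) automatically changes the
# answer to the reverse query as well, so forward and reverse stay consistent.
e_B_new = rng.normal(size=d)
print(np.allclose(W_r.T @ e_B_new, e_A))  # False: the reverse answer moved too
```

The same reasoning suggests why a model without such structure can fail: if forward and reverse facts are stored as unrelated lookups, an edit to one leaves the other stale.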
Primary Area: interpretability and explainable AI
Submission Number: 21044