Research Area: Science of LMs, Compute efficient LMs, Learning algorithms for LMs
Keywords: LLMs, Large Language Models, Question Answering, Generalization, Knowledge Representation, Logical Inference, Relations
TL;DR: When trained on “A has a feature B”, LLMs do not generalize to “B is a feature of A”, which is termed the Reversal Curse. This work proposes an alternative training scheme, called reverse training, that resolves the Reversal Curse.
Abstract: Large language models (LLMs) have a surprising failure: when trained on “A has a feature B”, they do not generalize to “B is a feature of A”, which is termed the Reversal Curse. Because of Zipf's law, this issue persists even when training on trillions of tokens, and hence even if we train on the entire internet. This work proposes an alternative training scheme, called reverse training, whereby all words are used twice, doubling the amount of available tokens. The LLM is trained in both forward and reverse directions by reversing training strings while preserving (i.e., not reversing) chosen substrings, such as entities. We show that data-matched reverse-trained models provide superior performance to standard models on standard tasks, and compute-matched reverse-trained models provide far superior performance on reversal tasks, helping resolve the Reversal Curse.
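The reversal step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes whitespace-separated words as the unit of reversal and assumes the preserved substrings (for example, entity names) are supplied by the caller. The names `reverse_preserving` and `make_training_examples` are hypothetical and chosen only for illustration.

```python
def reverse_preserving(text, preserved=()):
    """Reverse the word order of `text`, keeping each phrase in `preserved`
    in its original internal order (assumption: words are split on whitespace)."""
    words = text.split()
    # Try longer phrases first so a long entity is not broken by a shorter one.
    phrases = sorted((p.split() for p in preserved), key=len, reverse=True)
    chunks = []
    i = 0
    while i < len(words):
        for phrase in phrases:
            n = len(phrase)
            if n and words[i:i + n] == phrase:
                # Keep the whole preserved phrase as one unreversed chunk.
                chunks.append(" ".join(phrase))
                i += n
                break
        else:
            # No preserved phrase starts here: the word is its own chunk.
            chunks.append(words[i])
            i += 1
    return " ".join(reversed(chunks))


def make_training_examples(text, preserved=()):
    """Return the forward string and its reversed counterpart, so that each
    training string is used twice (once per direction)."""
    return [text, reverse_preserving(text, preserved)]


if __name__ == "__main__":
    sentence = "Abraham Lincoln was born in Kentucky"
    for example in make_training_examples(sentence, ["Abraham Lincoln"]):
        print(example)
```

With the entity preserved, the reversed string still contains “Abraham Lincoln” in readable order, so the model sees the entity name following the fact rather than only preceding it.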
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 42