It's my Job to be Repetitive! My Job! My Job! -- Linking Repetitions to In-Context Learning in Language Models

Anonymous

16 Oct 2021 (modified: 05 May 2023) · ACL ARR 2021 October Blind Submission
Abstract: Recent studies have shown that large language models can display surprising accuracy when learning tasks from a few examples presented in the input context, a phenomenon known as in-context learning. Other studies have shown that language models can sometimes display the undesirable behavior of falling into loops in which an utterance is repeated indefinitely. Here, we observe that the model's capacity to produce repetitions goes well beyond frequent or well-formed utterances and generalizes to repeating completely arbitrary sequences of tokens. Construing this as a simple form of in-context learning, we hypothesize that the two phenomena are linked through shared processing steps. With controlled experiments, we show that impairing the network's ability to produce repetitions severely degrades in-context learning without reducing its overall predictive performance, thus supporting the proposed hypothesis.
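
The abstract's observation that models continue arbitrary repeated token sequences can be probed with an off-the-shelf causal language model. The sketch below is not from the paper; the model name, prompt string, and decoding settings are illustrative assumptions, and it only demonstrates the repetition behavior being described, not the authors' controlled experiments.

```python
# Minimal sketch: check whether a causal LM continues an arbitrary,
# non-linguistic pattern that appears repeated in its context.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2"  # assumption: any small causal LM suffices for this probe
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# An arbitrary sequence of tokens, shown twice in the prompt.
pattern = "xq zr blim"
prompt = f"{pattern} {pattern} "

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=10,
        do_sample=False,  # greedy decoding exposes the repetition attractor
        pad_token_id=tokenizer.eos_token_id,
    )

# Print only the newly generated continuation; if the model has latched onto
# the in-context pattern, it repeats the arbitrary sequence again.
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens))
```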