Language Models are Bounded Pragmatic Speakers

Published: 20 Jun 2023, Last Modified: 29 Jun 2023 · ToM 2023
Keywords: cognitive modeling, reinforcement learning, large language models
Abstract: How do language models “think”? This paper formulates a probabilistic cognitive model, the bounded pragmatic speaker, which can characterize the operation of different variants of language models. Specifically, we demonstrate that large language models fine-tuned with reinforcement learning from human feedback (Ouyang et al., 2022) embody a model of thought that conceptually resembles the fast-and-slow model (Kahneman, 2011) that psychologists have attributed to humans. We discuss the limitations of reinforcement learning from human feedback as a fast-and-slow model of thought and propose avenues for extending this framework. In essence, our research highlights the value of adopting a probabilistic cognitive modeling approach to gain insight into the comprehension, evaluation, and advancement of language models.
Supplementary Material: pdf
Submission Number: 32
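The fast-and-slow reading of RLHF that the abstract describes can be sketched as a best-of-n style pipeline: a "fast" base language model cheaply proposes utterances, and a "slow" evaluator (e.g., a reward model trained from human feedback) deliberately re-scores them. This is a minimal toy illustration, not the paper's implementation; all function names, candidate utterances, probabilities, and the reward function are hypothetical.

```python
def fast_propose(base_probs, k):
    """System-1 step: keep the k utterances the base LM finds most probable."""
    return sorted(base_probs, key=base_probs.get, reverse=True)[:k]

def slow_select(candidates, reward):
    """System-2 step: deliberately re-rank candidates with the reward model."""
    return max(candidates, key=reward)

# Toy base-LM distribution over candidate utterances (hypothetical values).
base_probs = {"hello": 0.5, "hi there": 0.3, "greetings, human": 0.2}

# Toy reward model: here it simply prefers longer utterances.
reward = len

candidates = fast_propose(base_probs, k=2)  # fast proposal: top-2 by base probability
choice = slow_select(candidates, reward)    # slow evaluation picks among them
```

Under this decomposition, the proposal step is cheap and amortized while the evaluation step spends extra computation per candidate, mirroring the fast/slow division of labor the abstract attributes to RLHF-tuned models.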