Masked language models directly encode linguistic uncertainty

12 Apr 2022 · OpenReview Archive Direct Upload
Abstract: Recent advances in human language processing research suggest that large language models (LLMs), by virtue of their predictive power, can serve as cognitive models of human language processing. Evidence for this comes from LLMs’ close fit to human psychophysical data, such as reaction times or brain responses in language comprehension experiments. Work adopting LLM architectures as models of human language processing frames language comprehension as prediction of the next linguistic event (Goodkind and Bicknell, 2018; Eisape et al., 2020), focusing in particular on lexical or syntactic surprisal. However, this approach fails to consider that comprehenders make predictions using some representation of the content of an utterance. That is, unlike surprisal-based accounts, readers rely on a mental model that builds an evolving understanding of who is doing what to whom, and how. Surprisal measures, in contrast, make no predictions about content: surprisal is simply the negative log conditional probability of a linguistic event given its surrounding context.
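For concreteness, surprisal here means the negative log conditional probability of a word given its context. The sketch below is not taken from the paper; it only illustrates how surprisal and distributional uncertainty (entropy) are commonly read off a masked language model with the HuggingFace Transformers library. The model checkpoint, example sentence, and candidate word are illustrative assumptions.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative model choice; any masked LM checkpoint would work here.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Mask the position whose uncertainty we want to inspect.
sentence = "The chef cut the bread with a [MASK] ."
inputs = tokenizer(sentence, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits

# Distribution over the vocabulary at the masked position.
probs = torch.softmax(logits[0, mask_pos], dim=-1).squeeze(0)

# Surprisal of one candidate continuation: -log2 P(word | context).
word_id = tokenizer.convert_tokens_to_ids("knife")
surprisal = -torch.log2(probs[word_id])

# Entropy of the whole distribution: the model's uncertainty at this position.
# (clamp avoids log(0) if any probability underflows to zero)
probs = probs.clamp_min(1e-12)
entropy = -(probs * torch.log2(probs)).sum()

print(f"surprisal(knife) = {surprisal.item():.2f} bits")
print(f"entropy          = {entropy.item():.2f} bits")

In this sketch, surprisal is a property of one candidate word, while the entropy of the predicted distribution reflects the model's overall uncertainty at that position, independent of which word actually appears.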