Abstract: Abstract reasoning is a key ability for an intelligent system. Large language models achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human
abstract reasoning is also imperfect, and depends on our knowledge and beliefs about the content of
the reasoning problem. For example, humans reason much more reliably about logical rules that are
grounded in everyday situations than arbitrary rules about abstract attributes. The training experiences
of language models similarly endow them with prior expectations that reflect human knowledge and
beliefs. We therefore hypothesized that language models would show human-like content effects on
abstract reasoning problems. We explored this hypothesis across three logical reasoning tasks: natural language inference, judging the logical validity of syllogisms, and the Wason selection task (Wason,
1968). We find that state-of-the-art large language models (with 7 or 70 billion parameters; Hoffmann
et al., 2022) reflect many of the same patterns observed in humans across these tasks — like humans,
models reason more effectively about believable situations than unrealistic or abstract ones. Our findings have implications for understanding both these cognitive effects, and the factors that contribute
to language model performance.