What is an "Abstract Reasoner"? Revisiting Experiments and Arguments about Large Language Models

Published: 24 May 2025 · Last Modified: 24 May 2025 · CoNLL 2025 · CC BY 4.0
Keywords: Abstract Reasoning, Large Language Models, Multimodal Large Language Models
TL;DR: We show that tuning only the token embeddings of an LLM improves its abstract reasoning capabilities, and we (re-)open the discussion of what it means to be an abstract reasoner.
Abstract: Recent work has argued that large language models (LLMs) are not "abstract reasoners", citing their poor zero-shot performance on a variety of challenging tasks as evidence. We revisit these experiments to add nuance to the claim. First, we show that while LLMs indeed perform poorly in a zero-shot setting, tuning even a small subset of parameters for input encoding can enable near-perfect performance. However, we also show that this finetuning does not necessarily transfer across datasets. We take this collection of empirical results as an invitation to (re-)open the discussion of what it means to be an "abstract reasoner", and why it matters whether LLMs fit the bill.
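As a concrete illustration of the tuning regime the abstract describes, the sketch below freezes every parameter of a pretrained causal LM except the input token embeddings and optimizes only those. It is a minimal sketch, not the authors' code: the model name, learning rate, and training example are placeholder assumptions.

```python
# Minimal sketch (assumptions, not the paper's implementation):
# finetune only the token-embedding matrix of a frozen LLM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM illustrates the idea
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Freeze every parameter, then re-enable gradients for the input embeddings.
for param in model.parameters():
    param.requires_grad = False
embeddings = model.get_input_embeddings()  # the token-embedding matrix
embeddings.weight.requires_grad = True
# Note: many models tie input and output embeddings, in which case
# updating this matrix also changes the output head.

# Optimize only the embedding weights.
optimizer = torch.optim.AdamW([embeddings.weight], lr=1e-4)

# One illustrative training step on a toy example.
batch = tokenizer("example task input", return_tensors="pt")
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Because the transformer body stays frozen, this setup changes only how inputs are encoded, which is what makes the near-perfect in-distribution performance reported in the abstract an interesting data point about where the reasoning capability resides.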
Supplementary Material: pdf
Submission Number: 100