Language models can learn implicit multi-hop reasoning, but only if they have lots of training data

ACL ARR 2025 May Submission2212 Authors

18 May 2025 (modified: 03 Jul 2025) · License: CC BY 4.0
Abstract: Implicit reasoning is the ability of a language model to solve multi-hop reasoning tasks in a single forward pass, without chain of thought. We investigate this capability using GPT2-style language models trained from scratch on controlled $k$-hop reasoning datasets ($k = 2, 3, 4$). We show that while such models can indeed learn implicit $k$-hop reasoning, the required training data grows exponentially in $k$, and the required number of transformer layers grows linearly in $k$. We offer a theoretical explanation for why this depth growth is necessary. We further find that the data requirement can be mitigated, but not eliminated, through curriculum learning.
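To make the task format concrete, below is a minimal, hypothetical sketch of what a controlled $k$-hop sample could look like: each relation is a random mapping over a fixed entity vocabulary, and the query asks for the composition of $k$ relations, to be answered in a single forward pass. The function name, entity naming, and serialization are illustrative assumptions, not the authors' actual data pipeline.

```python
# Hypothetical sketch of a controlled k-hop sample (illustration of the task
# format only; not the authors' generator). Each relation r_i is a random
# mapping over entities; the query asks for r_k(...r_1(e)...).
import random


def make_khop_sample(k: int, n_entities: int = 100, rng: random.Random | None = None) -> dict:
    rng = rng or random.Random()
    entities = [f"e{i}" for i in range(n_entities)]
    # One random permutation of the entity set per hop serves as that hop's relation table.
    relations = [rng.sample(entities, n_entities) for _ in range(k)]

    start = rng.randrange(n_entities)
    facts, cur = [], start
    for hop, table in enumerate(relations, start=1):
        nxt = entities.index(table[cur])
        facts.append(f"r{hop}({entities[cur]}) = {entities[nxt]}")
        cur = nxt

    rng.shuffle(facts)  # shuffle supporting facts so the answer cannot be read off by position
    # Build the nested query string, e.g. "r2(r1(e7))" for k = 2.
    query = "".join(f"r{h}(" for h in range(k, 0, -1)) + entities[start] + ")" * k
    return {"facts": facts, "query": query, "answer": entities[cur]}


# Example: a 2-hop sample with shuffled supporting facts and a single-token answer.
print(make_khop_sample(2, n_entities=10, rng=random.Random(0)))
```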
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: reasoning, multi-hop QA, generalization, interpretability
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 2212