Recurrence-Completeness in Transformers and the Computational Role of Chain-of-Thought in Imitating Recurrence

ACL ARR 2026 January Submission 9993 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: LLM, Reasoning
Abstract: Transformers achieve strong performance in language modeling by removing recurrence, enabling parallel training and stable optimization. However, this architectural choice limits their computational power, leaving them unable to reliably solve tasks such as counting, reversal, and arithmetic. At the same time, Chain-of-Thought (CoT) prompting dramatically improves reasoning performance in Transformer-based language models. In this work, we analyze the computational roles of recurrence and autoregression in neural models and show that recurrence is essential for increasing reasoning depth. We argue that CoT approximates recurrence by repeatedly encoding and decoding intermediate computational states through natural language, effectively bridging autoregression and recurrent computation. We further revisit recurrent Transformer variants through the lens of recurrence-completeness, identifying fundamental limitations in popular architectures such as Linear Transformers and RWKV. Our results clarify why CoT enhances reasoning and offer principled guidance for designing models with stronger computational capabilities. Experiments are detailed in Appendix.
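The mechanism the abstract describes, CoT imitating recurrence by passing intermediate states through generated text, can be illustrated with a toy sketch. This is our own illustration, not the paper's formalism: `one_step` stands in for a single fixed-depth forward pass, and the autoregressive loop carries state as text rather than as a hidden vector.

```python
# A toy illustration (not the paper's formalism): a "model" that performs
# only one fixed-depth state update per call, plus a CoT-style loop that
# re-feeds its textual output, imitating recurrence autoregressively.

def one_step(state_text: str) -> str:
    """One forward pass: fixed computation depth, no internal recurrence.
    Here it increments a counter encoded in text."""
    count = int(state_text.split("=")[1])
    return f"count={count + 1}"

def chain_of_thought(initial: str, steps: int) -> str:
    """Autoregressive loop: each output becomes the next input, so the
    intermediate state is carried through text rather than hidden state."""
    state = initial
    for _ in range(steps):
        state = one_step(state)  # one decode -> re-encode cycle per step
    return state

# A single call reaches depth 1; the loop makes effective depth grow
# linearly with the number of generated steps.
print(chain_of_thought("count=0", 5))  # -> count=5
```

Without the loop, the fixed-depth model cannot count beyond its single-pass capacity; with it, depth scales with the length of the generated chain, which is the sense in which CoT bridges autoregression and recurrent computation.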
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: LLM, Reasoning
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 9993