Recurrence-Completeness in Transformers and the Computational Role of Chain-of-Thought in Imitating Recurrence

ACL ARR 2026 January Submission 9993 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: LLM, Reasoning
Abstract: Transformers achieve strong performance in language modeling by removing recurrence, enabling parallel training and stable optimization. However, this architectural choice limits their computational power, leaving them unable to reliably solve tasks such as counting, reversal, and arithmetic. At the same time, Chain-of-Thought (CoT) prompting dramatically improves reasoning performance in Transformer-based language models. In this work, we analyze the computational roles of recurrence and autoregression in neural models and show that recurrence is essential for increasing reasoning depth. We argue that CoT approximates recurrence by repeatedly encoding and decoding intermediate computational states through natural language, effectively bridging autoregression and recurrent computation. We further revisit recurrent Transformer variants through the lens of recurrence-completeness, identifying fundamental limitations in popular architectures such as Linear Transformers and RWKV. Our results clarify why CoT enhances reasoning and offer principled guidance for designing models with stronger computational capabilities. Experiments are detailed in Appendix.
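The mechanism the abstract describes, CoT imitating recurrence by passing intermediate states through generated text, can be illustrated with a toy sketch. This is our own illustration, not the paper's formalism: `one_step` stands in for a single fixed-depth forward pass, and the autoregressive loop carries state as text rather than as a hidden vector.

```python
# A toy illustration (not the paper's formalism): a "model" that performs
# only one fixed-depth state update per call, plus a CoT-style loop that
# re-feeds its textual output, imitating recurrence autoregressively.

def one_step(state_text: str) -> str:
    """One forward pass: fixed computation depth, no internal recurrence.
    Here it increments a counter encoded in text."""
    count = int(state_text.split("=")[1])
    return f"count={count + 1}"

def chain_of_thought(initial: str, steps: int) -> str:
    """Autoregressive loop: each output becomes the next input, so the
    intermediate state is carried through text rather than hidden state."""
    state = initial
    for _ in range(steps):
        state = one_step(state)  # one decode -> re-encode cycle per step
    return state

# A single call reaches depth 1; the loop makes effective depth grow
# linearly with the number of generated steps.
print(chain_of_thought("count=0", 5))  # -> count=5
```

Without the loop, the fixed-depth model cannot count beyond its single-pass capacity; with it, depth scales with the length of the generated chain, which is the sense in which CoT bridges autoregression and recurrent computation.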
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: LLM, Reasoning
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 9993