Abstract: While state-of-the-art LLMs have demonstrated great promise in using long Chains-of-Thought (CoT) to boost reasoning, scaling them up to more challenging problems is fundamentally limited by suboptimal memory usage: intermediate computations accumulate indefinitely in context even when they are no longer needed for future thoughts. We introduce PENCIL, which incorporates a novel reduction mechanism into the autoregressive generation process that recursively cleans up intermediate thoughts based on patterns learned from training. By alternately generating and erasing, PENCIL can think deeper to solve harder problems using shorter context and less compute. Empirically, for example, we demonstrate that PENCIL with a small 25M-parameter transformer and a 2048-token context solves Einstein's puzzle, a task that challenges much larger models like GPT-4. Theoretically, we prove PENCIL can perform universal efficient computation by simulating any Turing machine with optimal time and space complexity, and thus can solve arbitrary computable tasks that are otherwise intractable for vanilla CoT.
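To make the reduction idea concrete, below is a minimal sketch, not the authors' released implementation (see the linked repository), of how a single erasure step could operate on a token sequence. It assumes hypothetical special markers [SEP] and [RETURN] that delimit intermediate thoughts; the paper's actual special tokens and reduction rule may differ.

```python
def reduce_once(tokens: list[str]) -> list[str]:
    """Apply one erasure step: drop the intermediate thoughts between the
    innermost [SEP] and the first [RETURN], keeping the surrounding context
    and the returned answer. (Illustrative only; markers are hypothetical.)"""
    if "[RETURN]" not in tokens:
        return tokens                      # no subcomputation has finished yet
    r = tokens.index("[RETURN]")           # first completed subcomputation
    seps = [i for i in range(r) if tokens[i] == "[SEP]"]
    if not seps:
        return tokens                      # malformed trace; leave unchanged
    s = seps[-1]                           # innermost matching [SEP]
    # Keep everything before [SEP] (the context) and everything after
    # [RETURN] (the returned answer); erase the thoughts and both markers.
    return tokens[:s] + tokens[r + 1:]


# Toy trace: context, marker, scratch work, marker, answer.
trace = ["Solve", "2+2*3", "[SEP]", "2*3=6", "2+6=8", "[RETURN]", "8"]
print(reduce_once(trace))  # ['Solve', '2+2*3', '8']
```

In PENCIL, the model itself learns from training data when to emit such erasure patterns, so generation alternates between producing intermediate thoughts and reducing them away, keeping the context short.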
Lay Summary: We propose PENCIL, a new LLM reasoning approach that generates and erases thoughts, enabling longer and deeper thinking with shorter context. Theoretically, PENCIL is Turing-complete with optimal space and time complexity, and thus can solve arbitrary computable problems efficiently.
Link To Code: https://github.com/chr26195/PENCIL
Primary Area: Deep Learning->Large Language Models
Keywords: Large language models, chain-of-thought
Submission Number: 13637