Track: long paper (up to 10 pages)
Keywords: Chain-of-Thought, Parity Learning, Implicit CoT
Abstract: Chain-of-Thought (CoT) prompting can substantially improve the sample efficiency of Large Language Models (LLMs), reducing the complexity of tasks like parity learning from exponential to polynomial. However, this benefit comes at the cost of generating explicit reasoning steps, which is computationally expensive during inference. Implicit CoT (ICoT) has been proposed to mitigate this cost by training models to internalize these intermediate steps. In this work, we provide a theoretical analysis showing that ICoT retains the sample efficiency gains of explicit CoT, enabling models to solve complex tasks efficiently without sacrificing inference speed. Moreover, we propose Log-ICoT, a provably efficient curriculum that reduces training stages from linear dependency on the problem complexity to a logarithmic one. Our theoretical results are verified with numerical experiments, confirming that ICoT offers a path to robust reasoning without the high computational overhead.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 89
Loading