Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models
Abstract: The Chain-of-Thought (CoT) technique has proven effective in improving the performance of large language models (LLMs) on complex reasoning tasks. However, the performance gains are inconsistent across tasks, and the underlying mechanism remains a long-standing research question. In this work, we make a preliminary observation that the monotonicity of token probability distributions may be correlated with the gains achieved through CoT reasoning. Leveraging this insight, we propose two indicators based on the token probability distribution to assess CoT effectiveness across different tasks. By combining instance-level indicators with a logistic regression model, we introduce Dynamic CoT, a method that dynamically selects between CoT and direct answering. Furthermore, we extend Dynamic CoT to closed-source models by transferring decision strategies learned from open-source models. Our indicators for assessing CoT effectiveness achieve an accuracy of 89.2%, and Dynamic CoT reduces token consumption by more than 35% while maintaining high accuracy. Overall, our work offers a novel perspective on the underlying mechanisms of CoT reasoning and provides a framework for its more efficient deployment.
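The authors' released code is linked below. As a rough illustration of the pipeline the abstract describes, the following sketch shows how instance-level token-probability indicators could feed a logistic regression that routes each query to CoT or direct answering. The feature definitions, toy data, and names such as `token_signature_features` and `choose_prompting` are our own assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch (not the authors' released code): instance-level features
# derived from the token probability trajectory of a direct-answer decoding are
# fed to a logistic regression classifier that predicts whether CoT prompting
# is likely to help, mirroring the Dynamic CoT idea at a high level.
import numpy as np
from scipy.stats import kendalltau
from sklearn.linear_model import LogisticRegression


def token_signature_features(token_probs):
    """Summarize the per-step probabilities of the greedily decoded tokens.

    Returns two illustrative indicators: the mean probability and a
    monotonicity score (Kendall's tau between decoding step and probability).
    """
    token_probs = np.asarray(token_probs, dtype=float)
    tau, _ = kendalltau(np.arange(len(token_probs)), token_probs)
    return np.array([token_probs.mean(), 0.0 if np.isnan(tau) else tau])


# Toy training data standing in for real model decodings: each entry is a
# sequence of top-token probabilities, labeled 1 if CoT improved the outcome.
train_trajectories = [
    [0.95, 0.93, 0.96, 0.94],        # flat and high -> direct answer suffices
    [0.90, 0.80, 0.70, 0.55, 0.40],  # decreasing -> CoT tends to help
    [0.97, 0.96, 0.95, 0.97],
    [0.85, 0.70, 0.60, 0.45],
]
cot_helped = np.array([0, 1, 0, 1])

X_train = np.stack([token_signature_features(t) for t in train_trajectories])
clf = LogisticRegression().fit(X_train, cot_helped)


def choose_prompting(token_probs):
    """Dynamic-CoT-style routing: pick CoT or direct answering per instance."""
    feats = token_signature_features(token_probs).reshape(1, -1)
    return "cot" if clf.predict(feats)[0] == 1 else "direct"


print(choose_prompting([0.92, 0.75, 0.58, 0.41]))  # likely routed to "cot"
```

In this toy setup, the transfer to closed-source models mentioned in the abstract would correspond to training the classifier on probability trajectories from an open-source model and reusing the learned decision rule elsewhere.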
Lay Summary: Large language models (LLMs) can solve complex problems, but their ability to reason through difficult tasks isn’t always consistent. One popular technique, called Chain-of-Thought (CoT), helps improve reasoning by encouraging the model to break down its answers step by step. However, the success of this method varies across different tasks, and it's not entirely clear why it works well in some cases but not in others.
In our work, we make an interesting observation: how the model's token probabilities change as it decodes may explain when CoT is more effective. Based on this, we develop two indicators to assess how well CoT works for different tasks. To make things more efficient, we introduce a method called Dynamic CoT, which automatically chooses between using CoT or answering directly based on the task at hand.
This method maintains the model's accuracy while reducing its computational costs by more than 35%. Our work provides new insights into why CoT reasoning works and offers a more efficient way to apply it, helping to improve the performance of LLMs across various tasks.
Link To Code: https://github.com/tsinghua-fib-lab/Token_Signature
Primary Area: Deep Learning->Large Language Models
Keywords: Large Language Model, Chain-of-Thought, Reasoning, Token Probability
Submission Number: 9439