Implicit Chain-of-Thought for Abstractive Text Summarization: A Fast, Accurate Alternative to Explicit CoT
Keywords: Summarization, Chain-of-Thought, Implicit Reasoning, Dialogue Summarization, Curriculum Learning, GPT-Neo, Abstractive Summarization
Abstract: Explicit Chain-of-Thought (CoT) improves LLM reasoning but is slow because models must generate long scratchpad text; No-CoT is fast but less accurate. Recent work shows that Implicit CoT, which internalizes reasoning in hidden states and emits only the final answer, can be nearly as accurate as Explicit CoT while running at No-CoT speeds. Prior studies focus on arithmetic and math word problems; to our knowledge, no work has evaluated Implicit CoT for text summarization. We present the first study of Implicit CoT for abstractive dialogue summarization on the SAMSum dataset using stepwise internalization, comparing No-CoT, Explicit CoT, and Implicit CoT, all trained on GPT-Neo 1.3B. Our results demonstrate that Implicit CoT achieves 98.4% of Explicit CoT's ROUGE-1 performance (0.1929 vs. 0.1960) while training 3.6 times faster (3.1 hours vs. 11.4 hours) and matching No-CoT inference efficiency, thus bridging the gap between quality and computational efficiency.
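The stepwise internalization curriculum named in the abstract can be sketched as follows: at each training stage, a growing prefix of the chain-of-thought tokens is removed from the supervision target, so the model gradually learns to reach the answer without emitting the chain. This is a minimal illustrative sketch; the function name, the linear removal schedule, and `tokens_per_stage` are assumptions for exposition, not the paper's actual implementation.

```python
def internalize_target(cot_tokens, answer_tokens, stage, tokens_per_stage=2):
    """Build the training target for a given curriculum stage.

    Stage 0 supervises on the full explicit chain plus the answer;
    each later stage drops `tokens_per_stage` more leading CoT tokens,
    until only the answer remains (matching No-CoT supervision).
    """
    n_removed = min(stage * tokens_per_stage, len(cot_tokens))
    return cot_tokens[n_removed:] + answer_tokens


cot = ["step1", "step2", "step3", "step4"]
answer = ["<ans>", "summary"]

print(internalize_target(cot, answer, stage=0))  # full explicit CoT target
print(internalize_target(cot, answer, stage=1))  # partial chain
print(internalize_target(cot, answer, stage=2))  # answer only, as in No-CoT
```

Because removal proceeds from the front of the chain, the model always sees a contiguous suffix of the reasoning during training, which is what lets the earlier steps be absorbed into hidden states rather than generated text.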
Paper Type: Long
Research Area: Summarization
Research Area Keywords: Summarization, Chain-of-Thought, Curriculum Learning, Abstractive Summarization, Dialogue Summarization
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches low compute settings-efficiency, Data resources
Languages Studied: English
Submission Number: 3899