Enhancing Large Language Model Reasoning via Latent Space Modeling for Pre-thinking and Pre-answering
Keywords: Latent space modeling, chain-of-thought, reasoning
Abstract: Chain-of-Thought (CoT) has proven effective for eliciting reasoning abilities in large language models, but standard autoregressive decoders treat reasoning and answer tokens uniformly as next-token prediction targets, despite their fundamentally different roles. This homogeneous modeling introduces conflicting inductive biases and significant inference overhead. Latent CoT methods attempt to improve efficiency by removing explicit reasoning tokens, yet often degrade performance, indicating that explicit reasoning supervision remains critical for preserving pretrained reasoning capabilities.
To address these challenges, we propose Pre$^2$, a phase-aware reasoning framework that decouples thinking and answering while retaining explicit CoT supervision. Our approach introduces independent pre-thinking and pre-answering latent states and integrates them with explicit token representations through a hybrid latent–token decoding paradigm. This design increases computation along both the width and depth dimensions with minimal parameter overhead, enabling stronger reasoning without sacrificing inference efficiency. Experiments on one in-domain dataset and four out-of-domain benchmarks demonstrate consistent improvements over strong baselines and robust generalization. Code will be released at github.com
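The abstract does not specify how the latent states are combined with token representations; as one plausible reading, the phase-specific latent states could be prepended to the explicit token embeddings before decoding. The sketch below illustrates that interpretation with toy numpy arrays — the names `pre_think`, `pre_answer`, and `hybrid_input` are our own assumptions, not identifiers from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 8   # toy hidden size
seq_len = 5   # number of explicit (token) positions

# Hypothetical learnable latent states, one per phase (our naming):
# a "pre-thinking" state and a "pre-answering" state.
pre_think = rng.normal(size=(1, d_model))
pre_answer = rng.normal(size=(1, d_model))

def hybrid_input(token_embeds, think_state, answer_state):
    """Prepend phase-specific latent states to explicit token embeddings,
    forming one possible hybrid latent-token input sequence."""
    return np.concatenate([think_state, answer_state, token_embeds], axis=0)

tokens = rng.normal(size=(seq_len, d_model))
hybrid = hybrid_input(tokens, pre_think, pre_answer)
print(hybrid.shape)  # (7, 8): 2 latent states + 5 token embeddings
```

Under this reading, the extra latent positions add computation in width (longer effective sequence) while the phase separation structures depth-wise processing, consistent with the abstract's claim of minimal parameter overhead.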
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: Question Answering
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Theory
Languages Studied: English
Submission Number: 9111