Enhancing Large Language Model Reasoning via Latent Space Modeling for Pre-thinking and Pre-answering
Keywords: Latent space modeling, chain-of-thought, reasoning
Abstract: Chain-of-Thought (CoT) has proven effective for eliciting reasoning abilities in large language models, but standard autoregressive decoders treat reasoning and answer tokens uniformly as next-token prediction targets, despite their fundamentally different roles. This homogeneous modeling introduces conflicting inductive biases and significant inference overhead. Latent CoT methods attempt to improve efficiency by removing explicit reasoning tokens, yet often degrade performance, indicating that explicit reasoning supervision remains critical for preserving pretrained reasoning capabilities.
To address these challenges, we propose Pre$^2$, a phase-aware reasoning framework that decouples thinking and answering while retaining explicit CoT supervision. Our approach introduces independent pre-thinking and pre-answering latent states and integrates them with explicit token representations through a hybrid latent–token decoding paradigm. This design increases computation along both the width and depth dimensions with minimal parameter overhead, enabling stronger reasoning without sacrificing inference efficiency. Experiments on one in-domain dataset and four out-of-domain benchmarks demonstrate consistent improvements over strong baselines and robust generalization. Code will be released at github.com
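The abstract does not specify how the latent states are combined with token representations; as one plausible reading, the phase-specific latent states could be prepended to the explicit token embeddings before decoding. The sketch below illustrates that interpretation with toy numpy arrays — the names `pre_think`, `pre_answer`, and `hybrid_input` are our own assumptions, not identifiers from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 8   # toy hidden size
seq_len = 5   # number of explicit (token) positions

# Hypothetical learnable latent states, one per phase (our naming):
# a "pre-thinking" state and a "pre-answering" state.
pre_think = rng.normal(size=(1, d_model))
pre_answer = rng.normal(size=(1, d_model))

def hybrid_input(token_embeds, think_state, answer_state):
    """Prepend phase-specific latent states to explicit token embeddings,
    forming one possible hybrid latent-token input sequence."""
    return np.concatenate([think_state, answer_state, token_embeds], axis=0)

tokens = rng.normal(size=(seq_len, d_model))
hybrid = hybrid_input(tokens, pre_think, pre_answer)
print(hybrid.shape)  # (7, 8): 2 latent states + 5 token embeddings
```

Under this reading, the extra latent positions add computation in width (longer effective sequence) while the phase separation structures depth-wise processing, consistent with the abstract's claim of minimal parameter overhead.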
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: Question Answering
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Theory
Languages Studied: English
Submission Number: 9111