FoCus: Improving Faithfulness in Chain-of-Thoughts by Training on Structured Reasoning Data

Published: 17 Oct 2025, Last Modified: 21 Nov 2025 · MATH-AI 2025 Poster · CC BY 4.0
Keywords: faithfulness, chain-of-thought, large language models, reasoning, interpretability, mathematical reasoning
TL;DR: FoCus grounds reasoning in explicit problem conditions, boosting Chain-of-Thought faithfulness by up to 31% across reasoning benchmarks.
Abstract: Chain-of-Thought (CoT) prompting improves the interpretability of large language models (LLMs) but often lacks faithfulness, yielding post-hoc rationalizations that can be unreliable. To address this issue, we propose FoCus, a condition-grounding framework that enumerates a problem's explicit conditions and grounds reasoning in them. Using a two-stage pipeline, FoCus generates faithful reasoning traces that are used to fine-tune LLMs. On four reasoning benchmarks, FoCus improves average faithfulness by up to 22.95% for DeepSeek-Qwen3-8B, 31.05% for Nemotron-7B, and 29.4% for Qwen3-8B over both the original models and prompt-engineered baselines. These findings demonstrate that explicit condition grounding is an effective strategy for enhancing faithful reasoning in LLMs.
Submission Number: 141
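Since the abstract describes the approach only at a high level (enumerate conditions, then ground each reasoning step in them), here is a minimal, hypothetical Python sketch of what condition-grounded prompting could look like. The prompt templates, function names, and example problem are all assumptions made for illustration; they are not the actual FoCus pipeline, prompts, or fine-tuning data generator.

    """Hypothetical illustration of condition-grounded CoT prompting.
    The paper does not publish its pipeline; every template and name
    below is an assumption, not FoCus's actual code."""

    def enumerate_conditions_prompt(problem: str) -> str:
        """Stage 1 (assumed): ask the model to list the explicit
        conditions stated in the problem before any reasoning."""
        return (
            "List every explicit condition given in the problem, one per "
            "line, numbered C1, C2, ... Do not solve the problem yet.\n\n"
            f"Problem: {problem}"
        )

    def grounded_reasoning_prompt(problem: str, conditions: list[str]) -> str:
        """Stage 2 (assumed): require each reasoning step to cite the
        condition labels it uses, keeping the trace tied to the problem."""
        listed = "\n".join(f"C{i + 1}: {c}" for i, c in enumerate(conditions))
        return (
            f"Problem: {problem}\n\nConditions:\n{listed}\n\n"
            "Solve step by step. Begin each step with the condition "
            "labels (e.g. [C1, C3]) that the step relies on."
        )

    if __name__ == "__main__":
        problem = "A train travels 120 km in 2 hours. What is its average speed?"
        conditions = ["The train travels 120 km.", "The trip takes 2 hours."]
        print(enumerate_conditions_prompt(problem))
        print()
        print(grounded_reasoning_prompt(problem, conditions))

Under this reading, traces produced by the second prompt would serve as supervised fine-tuning data, so that the model learns to cite conditions rather than rationalize post hoc; the exact trace format and filtering used by FoCus are not specified here.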