Keywords: representation steering; large reasoning models; LRMs; large language models; LLMs; efficient reasoning; flow matching
TL;DR: This paper introduces a nonlinear steering method using Flow Matching to transform verbose reasoning paths into concise ones, achieving superior accuracy and token efficiency in LLMs.
Abstract: Large Reasoning Models (LRMs) excel at complex reasoning tasks, but their efficiency is often hampered by overly verbose outputs. Prior steering methods attempt to address this issue by applying a single, global vector to hidden representations—a rigid approach grounded in the restrictive *linear representation hypothesis*. In this work, we introduce *FlowSteer*, a nonlinear steering method that goes beyond uniform linear shifts by learning a complete *transformation between the distributions* associated with verbose and concise reasoning. This transformation is learned via *Flow Matching* as a velocity field, enabling precise, input-dependent control over the model's reasoning process. Across diverse reasoning benchmarks, *FlowSteer* simultaneously achieves superior accuracy and token efficiency over leading inference-time baselines. Our work demonstrates that modeling the full distributional transport with powerful generative techniques offers a more effective and principled foundation for controlling LRMs.
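The core idea in the abstract — learning a velocity field via Flow Matching that transports "verbose" hidden representations toward "concise" ones — can be illustrated with a toy sketch. Everything below is a hypothetical simplification (a linear velocity model fit by least squares on a synthetic constant shift), not the paper's actual architecture or training setup:

```python
# Toy sketch of conditional Flow Matching for representation steering.
# Names (steer, W) and the constant-shift "concise" target are illustrative
# assumptions, not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden-state dimension

# Stand-ins for hidden states: "verbose" (source) and "concise" (target).
x_verbose = rng.normal(0.0, 1.0, size=(512, d))
x_concise = x_verbose + 2.0  # assume a simple shift for this toy example

# Flow Matching regresses a velocity field onto the path
# x_t = (1 - t) * x0 + t * x1, whose target velocity is u = x1 - x0.
t = rng.uniform(0.0, 1.0, size=(512, 1))
x_t = (1.0 - t) * x_verbose + t * x_concise
u = x_concise - x_verbose

# Fit a linear velocity model v(x, t) = [x, t, 1] @ W by least squares.
feats = np.concatenate([x_t, t, np.ones_like(t)], axis=1)
W, *_ = np.linalg.lstsq(feats, u, rcond=None)

def steer(x, steps=16):
    """Euler-integrate dx/dt = v(x, t) from t=0 to t=1 to transport x."""
    dt = 1.0 / steps
    for i in range(steps):
        ti = np.full((x.shape[0], 1), i * dt)
        f = np.concatenate([x, ti, np.ones_like(ti)], axis=1)
        x = x + dt * (f @ W)
    return x

steered = steer(x_verbose[:4].copy())
err = np.abs(steered - x_concise[:4]).mean()
print(err)
```

In the actual method the velocity model would be a learned nonlinear network conditioned on the input, applied to LRM hidden states at inference time; the point of the sketch is only the training target (path velocity) and the ODE-integration step that replaces a single global steering vector.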
Primary Area: foundation or frontier models, including LLMs
Submission Number: 36