Keywords: Large Language Models, Human-Computer Interaction, Instruction Following, Controllable Text Generation, Alignment, Inference-time Intervention
Abstract: Aligning Large Language Model (LLM) outputs with users' constraints often requires iterative refinement, yet standard conversational interfaces treat models as black boxes, leading to error propagation and feedback ambiguity. Current interaction paradigms typically force users to regenerate entire responses to correct local errors, resulting in inefficient trial-and-error loops. To address these limitations, we introduce Prompt-with-Steer, an interactive framework that surfaces generation dynamics to enable precise user intervention during inference. Our approach supports prefix-rollback regeneration and token-level selection to repair local violations without re-prompting for the entire response. We evaluate our framework on an enhanced Instruction Following Evaluation (IFEval) benchmark with verifiable hard constraints. Prompt-with-Steer achieves higher correction accuracy than dialogue-based baselines across different models, attaining high modification success rates on targeted constraints while reducing generation token consumption by up to 70\%. These findings demonstrate that fine-grained, inspectable steering is a more interaction-efficient mechanism for aligning LLM outputs with verifiable generation requirements.
Paper Type: Long
Research Area: Human-AI Interaction/Cooperation and Human-Centric NLP
Research Area Keywords: Generation: interactive and collaborative generation, Human-Centered NLP: human-AI interaction/cooperation, Language Modeling: safety and alignment, Dialogue and Interactive Systems: human-in-the-loop
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 5330