Robust Instruction Following via Reinforcement Learning

ACL ARR 2026 January Submission 2362 Authors

02 Jan 2026 (modified: 20 Mar 2026)
License: CC BY 4.0
Keywords: Large Language Models, Instruction Following, Reinforcement Learning
Abstract: Instruction following has become a cornerstone capability of Large Language Models (LLMs) since the advent of ChatGPT. Despite its significance, we observe that existing training paradigms often cause a catastrophic degradation in multi-step reasoning performance. This phenomenon is counter-intuitive, as robust instruction adherence should theoretically augment, rather than hinder, general task performance. To bridge this gap, we propose **Ro**bust **I**nstruction **F**ollowing via **R**einforcement **L**earning (**RoIFRL**), a novel RL training framework. RoIFRL reframes instruction following as a composite task consisting of _constraint adherence_ and _sequential execution_. Furthermore, it introduces a two-dimensional curriculum that dynamically orchestrates data sampling across the two subtasks, aligning both the data mixture and the instruction complexity with the policy model's evolving capabilities. We synthesize a comprehensive set of verifiable RL training data covering both subtasks. Extensive experiments and ablation studies demonstrate that RoIFRL not only preserves but further enhances multi-step reasoning, while significantly improving performance on standard instruction-following benchmarks such as IFEval.
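To make the two-dimensional curriculum concrete, below is a minimal Python sketch of how such a sampler might work. It is an illustrative assumption, not the paper's specification: the cell weighting `curriculum_weights` (which favors cells near a 50% pass rate), the helper `sample_batch`, and the `SUBTASKS`/`LEVELS` names are all hypothetical, standing in for whatever mixture and difficulty schedule the authors actually use.

```python
import random

# Hypothetical sketch of a two-dimensional curriculum sampler.
# Axis 1: mixture over the two subtasks (constraint adherence vs.
# sequential execution); axis 2: instruction complexity. Both are
# re-weighted from the policy's recent pass rates, so sampling tracks
# the model's evolving capabilities. All names and update rules here
# are illustrative assumptions, not the paper's method.

SUBTASKS = ["constraint_adherence", "sequential_execution"]
LEVELS = [1, 2, 3]  # instruction-complexity buckets

def curriculum_weights(pass_rates):
    """Weight each (subtask, level) cell by how 'learnable' it looks:
    cells near a 50% pass rate receive the most probability mass."""
    weights = {}
    for cell, p in pass_rates.items():
        weights[cell] = max(p * (1.0 - p), 1e-3)  # p*(1-p) peaks at 0.5
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}

def sample_batch(pool, pass_rates, batch_size=8):
    """Draw a training batch according to the current curriculum."""
    weights = curriculum_weights(pass_rates)
    cells = list(weights)
    probs = [weights[c] for c in cells]
    batch = []
    for _ in range(batch_size):
        cell = random.choices(cells, weights=probs)[0]
        batch.append(random.choice(pool[cell]))
    return batch

if __name__ == "__main__":
    # Toy data pool and measured pass rates per (subtask, level) cell.
    pool = {(t, l): [f"{t}-L{l}-ex{i}" for i in range(4)]
            for t in SUBTASKS for l in LEVELS}
    pass_rates = {(t, l): random.random()
                  for t in SUBTASKS for l in LEVELS}
    print(sample_batch(pool, pass_rates))
```

The key design idea this sketch captures is that the curriculum is adaptive rather than fixed: as the policy's pass rate on a cell rises past the informative regime, that cell's sampling probability decays, shifting training effort toward harder constraints and longer execution chains.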
Paper Type: Short
Research Area: Language Models
Research Area Keywords: fine-tuning, safety and alignment, robustness
Contribution Types: Model analysis & interpretability, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 2362