Step Rejection Fine-Tuning: A Practical Distillation Recipe

Published: 16 Jun 2026, Last Modified: 24 Jun 2026, Venue: ICML 2026 Workshop DL4C Poster, License: CC BY-NC 4.0
Keywords: Post-training, Supervised Fine-Tuning, Knowledge Distillation, LLM Agents, Rejection Sampling, Data Efficiency, Step-level Supervision, Trajectory Filtering, Critic Models, Software Engineering
TL;DR: We propose Step Rejection Fine-Tuning (SRFT), a method that improves LLM agent training by selectively masking harmful steps in failed trajectories instead of discarding the entire attempt, allowing the model to learn from partial successes.
Abstract: Rejection Sampling Fine-Tuning (RFT) is a standard method for training LLM agents in which unsuccessful trajectories are discarded from the training set. In the context of SWE-bench tasks, this corresponds to filtering out runs where the submitted patch does not pass the tests. However, this approach discards unresolved trajectories even though they make up a large portion of all trajectories for hard tasks and may still be partially correct. In this work, we propose Step Rejection Fine-Tuning (SRFT), a practical way to leverage these unresolved trajectories. We employ a critic LLM to assess the correctness of each step in a trajectory. During training, we then mask the loss for erroneous steps while retaining them in the context window. This way, the model learns to recover from errors without reproducing them. Evaluation on SWE-bench Verified shows that while RFT improves the resolution rate by 2.4% by excluding unresolved trajectories, SRFT improves it by 3.7% by filtering them at the step level instead of discarding them completely, reaching a total resolution rate of 32.2%.
Submission Number: 19
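The step-level masking described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Step` structure, the `critic_ok` flag (standing in for the critic LLM's per-step verdict), and the `build_example` helper are hypothetical names introduced here. The sketch assumes the common convention of marking tokens excluded from the loss with the label `-100`, which is the default `ignore_index` of PyTorch's `CrossEntropyLoss`. It also assumes that non-agent tokens (such as environment observations) are masked, as is typical in supervised fine-tuning of agents.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Label value that PyTorch's CrossEntropyLoss skips by default (ignore_index).
IGNORE_INDEX = -100


@dataclass
class Step:
    """One step of a trajectory (hypothetical structure for illustration)."""
    token_ids: List[int]    # tokenized content of the step
    from_agent: bool        # False for environment observations or prompts
    critic_ok: bool = True  # assumed per-step verdict from a critic model


def build_example(steps: List[Step]) -> Tuple[List[int], List[int]]:
    """Build (input_ids, labels) for one trajectory.

    Every step stays in input_ids, so erroneous steps remain visible in the
    context window. Only agent steps that the critic accepted contribute to
    the loss; all other positions get IGNORE_INDEX.
    """
    input_ids: List[int] = []
    labels: List[int] = []
    for step in steps:
        input_ids.extend(step.token_ids)
        if step.from_agent and step.critic_ok:
            labels.extend(step.token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(step.token_ids))
    return input_ids, labels


# Toy trajectory: observation, good action, erroneous action, recovery action.
trajectory = [
    Step(token_ids=[1, 2], from_agent=False),
    Step(token_ids=[3, 4], from_agent=True, critic_ok=True),
    Step(token_ids=[5], from_agent=True, critic_ok=False),
    Step(token_ids=[6, 7], from_agent=True, critic_ok=True),
]
input_ids, labels = build_example(trajectory)
```

In this toy trajectory the erroneous step (token `5`) is kept in `input_ids` but masked in `labels`, so the recovery step that follows is still trained on with the error in its context. Under plain RFT, the whole trajectory would instead be dropped if its final patch failed the tests.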