Keywords: SWE Agents, agentic bug generation, training agents for SWE tasks
TL;DR: We create synthetic bugs by asking an agent to implement features, and find that training on these bugs is more data-efficient.
Abstract: High-quality bugs are key to training the next generation of LLM-based software engineering (SWE) agents.
We introduce a novel method for synthetic generation of difficult and diverse bugs.
Our method instructs SWE agents to implement a feature; in doing so, they may unintentionally break tests, resulting in bugs.
Prior approaches often induce an out-of-distribution effect by generating bugs intentionally (e.g., by introducing local perturbations to existing code), which does not reflect realistic development processes.
We present a qualitative analysis demonstrating that our generated bugs more closely reflect the patterns found in human-authored edits.
Through extensive experiments, we demonstrate that our bugs provide more efficient training data for supervised fine-tuning, outperforming other bug datasets by 2% with half the training data.
Finally, we apply reinforcement learning on our high-quality generated bugs, starting from a strong base model trained on a mixture of previously available bugs.
We thereby obtain a state-of-the-art 32B-parameter model on SWE-Bench Verified, achieving 52.4% pass@1 averaged over three seeds.
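
The abstract's description suggests a simple harvesting loop: ask an agent to implement a feature, then keep the edit only if it breaks previously passing tests. Below is a minimal sketch of that idea in Python, assuming a hypothetical `agent.run` interface and a pytest-based test harness; the paper does not specify an implementation, so all names here are illustrative.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class SyntheticBug:
    repo_dir: str
    feature_prompt: str
    diff: str
    failing_tests: list[str]


def run_tests(repo_dir: str) -> list[str]:
    """Run the repo's test suite and return names of failing tests.

    Hypothetical helper: assumes pytest and relies on its default
    'short test summary info' FAILED lines; real harnesses vary per repo.
    """
    result = subprocess.run(
        ["python", "-m", "pytest", "--tb=no", "-q"],
        cwd=repo_dir, capture_output=True, text=True,
    )
    return [
        line.split()[1]
        for line in result.stdout.splitlines()
        if line.startswith("FAILED")
    ]


def generate_bug(agent, repo_dir: str, feature_prompt: str) -> SyntheticBug | None:
    """Ask the agent to implement a feature; harvest the edit as a
    synthetic bug only if it unintentionally breaks passing tests."""
    baseline_failures = set(run_tests(repo_dir))

    # Hypothetical agent API: the agent edits files in repo_dir.
    agent.run(task=feature_prompt, workdir=repo_dir)

    new_failures = [t for t in run_tests(repo_dir) if t not in baseline_failures]
    if not new_failures:
        return None  # feature landed cleanly; nothing to harvest

    diff = subprocess.run(
        ["git", "diff"], cwd=repo_dir, capture_output=True, text=True,
    ).stdout
    return SyntheticBug(repo_dir, feature_prompt, diff, new_failures)
```

Note the key contrast with perturbation-based approaches: the bug is a byproduct of a realistic development task rather than an intentional, localized corruption.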
Primary Area: foundation or frontier models, including LLMs
Submission Number: 13880