Abstract: Addressing code vulnerabilities is crucial for software security and reliability. We present SynthFix, a framework for automated code repair that combines Supervised Fine-Tuning (SFT) with Proximal Policy Optimization (PPO) in an iterative training regime. Inspired by optimization strategies from statistical algorithms, SynthFix balances the rapid pattern recognition of SFT with the adaptive learning of PPO. By incorporating compiler- and tool-derived insights, such as Abstract Syntax Trees (ASTs), Control Flow Graphs (CFGs), and ESLint diagnostics, SynthFix enhances training dynamics, improving scalability and adaptability. Evaluation on the FixJS dataset of over 30k JavaScript code pairs demonstrates that SynthFix outperforms existing methods, achieving up to a 7.78% improvement in CodeBLEU over SFT and 7.33% over PPO on the CodeT5 and CodeGen models. SynthFix further shows substantial gains in Exact Match, with up to a 2.16x improvement. This iterative training architecture outperforms traditional models and shows potential for advancing other software engineering tasks through feedback-driven adjustments. The code for SynthFix has been anonymized and is available at https://github.com/iiiiiiii979/SynthFix.
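The alternating SFT/PPO regime described above can be sketched at a very high level as follows. This is a minimal illustrative toy, not the SynthFix implementation: all function names (`sft_step`, `ppo_step`, `lint_reward`, `train`) and the scalar "parameters" are hypothetical stand-ins, and the reward is a trivial proxy for the ESLint/AST/CFG feedback the paper describes.

```python
# Toy sketch of an iterative SFT + PPO training regime (hypothetical names).
# Each round first imitates reference fixes (SFT), then refines the policy
# with a clipped reward signal derived from linter-style feedback (PPO-like).

def sft_step(params: float, pair: tuple[str, str]) -> float:
    """Toy SFT update: nudge scalar 'params' toward the reference fix."""
    _buggy, fixed = pair
    return params + 0.1 * (len(fixed) - params)

def lint_reward(candidate: str) -> float:
    """Stand-in for ESLint/AST/CFG feedback: fewer flagged issues -> higher reward."""
    return -float(candidate.count("var "))  # e.g. penalise legacy 'var' declarations

def ppo_step(params: float, candidate: str) -> float:
    """Toy PPO-style update: apply a clipped reward signal to the parameters."""
    clipped = max(-1.0, min(1.0, lint_reward(candidate)))  # PPO-style clipping
    return params + 0.1 * clipped

def train(pairs: list[tuple[str, str]], rounds: int = 3) -> float:
    """Iterative regime: an SFT pass followed by a PPO pass, each round."""
    params = 0.0
    for _ in range(rounds):
        for pair in pairs:                 # phase 1: supervised imitation
            params = sft_step(params, pair)
        for _buggy, fixed in pairs:        # phase 2: feedback-driven refinement
            params = ppo_step(params, fixed)
    return params

if __name__ == "__main__":
    data = [("var x = 1;", "let x = 1;")]
    print(train(data))
```

The point of the alternation is that the SFT phase rapidly absorbs repair patterns from the paired data, while the PPO phase lets tool feedback (here caricatured by `lint_reward`) adjust behavior beyond what the references alone teach.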
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: distillation, parameter-efficient-training, reinforcement learning, optimization methods, generative models, robustness, adversarial training, pre-training, continual learning, human-in-the-loop
Contribution Types: Model analysis & interpretability, Approaches to low-resource settings, Approaches to low-compute settings & efficiency, Publicly available software and/or pre-trained models
Languages Studied: JavaScript
Submission Number: 333