Extending Fill-In-the-Middle with Instructions: Another "Free Lunch" for Code Completion Models

ACL ARR 2026 January Submission7814 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: code completion, instruction tuning
Abstract: Code completion models often fail when the developer's intent is under-specified in the code context. To mitigate this, developers frequently use natural language comments to clarify their objectives. However, current code completion models fail to prioritize these directives effectively, since they are pre-trained only with the Fill-In-the-Middle (FIM) objective. On the one hand, natural language instructions, mixed in with noisy code comments, are treated as mere background context within the prefix. On the other hand, the pre-training datasets for the FIM objective are mostly sourced from open-source repositories, which results in a scarcity of high-intent instruction-to-code pairings that reflect developers' real code-completion workflows. To bridge this gap, we propose Instruction-aware Fill-In-the-Middle (IFIM), a training method that extends the FIM structure with a dedicated instruction section. Our evaluation results indicate that IFIM significantly enhances model adherence to user intent, increasing the Pass@1 score from 84.6% to 93.6% without compromising underlying infilling performance. These findings suggest that IFIM offers a cost-effective "free lunch" for improving the steerability and practical utility of code completion systems. The artifacts are released at https://anonymous.4open.science/r/ifim-artifacts.
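To make the contrast concrete, the sketch below shows how a standard FIM prompt might be extended with a dedicated instruction section in the spirit of IFIM. The sentinel token names (`<|fim_prefix|>` etc.) follow StarCoder-style conventions and vary by model, and the `<|instruction|>` sentinel and its placement are illustrative assumptions, not the paper's actual format.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    # Standard PSM-style Fill-In-the-Middle prompt.
    # Sentinel names here are StarCoder-style; real models differ.
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"


def build_ifim_prompt(instruction: str, prefix: str, suffix: str) -> str:
    # Hypothetical IFIM-style prompt: the instruction gets its own
    # dedicated section instead of being buried in prefix comments.
    # The <|instruction|> sentinel is an assumption for illustration.
    return (
        f"<|instruction|>{instruction}"
        f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
    )


if __name__ == "__main__":
    prefix = "def mean(xs):\n    "
    suffix = "\n    return total / len(xs)"
    print(build_fim_prompt(prefix, suffix))
    print(build_ifim_prompt("Sum the list with a for loop, not sum().",
                            prefix, suffix))
```

The intuition is that a separate instruction section lets training teach the model to weight the directive above ordinary comment noise in the prefix.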
Paper Type: Long
Research Area: Code Models
Research Area Keywords: code completion
Contribution Types: Model analysis & interpretability
Languages Studied: Python
Submission Number: 7814