VersiCode: Towards Version-controllable Code Generation

Tongtong Wu; Weigang Wu; Xingyu Wang; Kang Xu; Suyu Ma; Bo Jiang; ping yang; Zhenchang Xing; Yuan-Fang Li; Gholamreza Haffari

VersiCode: Towards Version-controllable Code Generation

Tongtong Wu, Weigang Wu, Xingyu Wang, Kang Xu, Suyu Ma, Bo Jiang, ping yang, Zhenchang Xing, Yuan-Fang Li, Gholamreza Haffari

26 Sept 2024 (modified: 05 Feb 2025)Submitted to ICLR 2025EveryoneRevisionsBibTeXCC BY 4.0

Keywords: Large Language Model, Code Generation, API Evolution

TL;DR: We study SOTA LLMs on two code generation tasks incorporating version information into code generation constraints.

Abstract: Large Language Models (LLMs) have made tremendous strides in code generation, but existing research fails to account for the dynamic nature of software development, marked by frequent library updates. This gap significantly limits LLMs' deployment in realistic settings. In this paper, we propose two novel tasks aimed at bridging this gap: version-specific code completion (VSCC) and version-aware code migration (VACM). In conjunction, we introduce VersiCode, a comprehensive Python dataset specifically designed to evaluate LLMs on these two tasks, together with a novel evaluation metric, Critical Diff Check (CDC@1), which assesses code generation against evolving API requirements. We conduct an extensive evaluation of VersiCode, which reveals that version-controllable code generation is indeed a significant challenge, even for GPT-4o and other strong frontier models. We believe the novel tasks, dataset, and metric open up a new, important research direction that will further enhance LLMs' real-world applicability. The code and resources can be found in \url{https://anonymous.4open.science/status/VersiCode-B0F6}.

Primary Area: applications to computer vision, audio, language, and other modalities

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 6742

Loading