CodePlan: Repository-level Coding using LLMs and Planning

Published: 07 Nov 2023, Last Modified: 08 Nov 2023, Venue: FMDM@NeurIPS2023
Keywords: LLM, Planning, Coding-tasks
TL;DR: Automating repository-level coding tasks using LLMs by leveraging static-analysis-powered planning
Abstract: Software engineering activities such as package migration, fixing error reports from static analysis or testing, and adding type annotations or other specifications to a codebase involve pervasively editing the entire repository of code. While Large Language Models (LLMs) have shown impressive abilities in localized coding tasks, performing interdependent edits across a repository requires multi-step reasoning and planning abilities. We frame repository-level coding as a planning problem and present a task-agnostic, neuro-symbolic framework called CodePlan. Our framework leverages static analysis techniques to discover dependencies throughout the repository, which are used both to provide sufficient context to the LLM and to determine the sequence of edits required to solve the repository-level task. We evaluate the effectiveness of CodePlan on two repository-level tasks, package migration (C#) and temporal code edits (Python), across multiple repositories. Our results demonstrate that CodePlan consistently outperforms baselines across tasks. Further qualitative analysis highlights how different components of the approach contribute to guiding the LLM towards the correct edits and to maintaining the consistency of the repository.
Submission Number: 97
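
Illustrative sketch (not from the paper): the abstract describes using statically discovered dependencies to decide which edits follow from earlier ones. The toy Python below shows one way such change propagation could be ordered with a worklist. It is a minimal sketch under stated assumptions, not CodePlan's actual algorithm: the function `plan_edits`, the `dependents` map, and the `edit_changes_interface` callback (standing in for an LLM edit plus an impact check) are all hypothetical names introduced here.

```python
from collections import deque
from typing import Callable, Dict, List, Set


def plan_edits(
    dependents: Dict[str, List[str]],
    seeds: List[str],
    edit_changes_interface: Callable[[str], bool],
) -> List[str]:
    """Return the order in which code blocks are visited for editing.

    dependents[b] lists the blocks that use block b (assumed to come from
    a static analysis pass). edit_changes_interface is a hypothetical
    stand-in for "edit the block with an LLM, then report whether the
    edit is visible to the blocks that depend on it".
    """
    order: List[str] = []
    seen: Set[str] = set(seeds)
    queue = deque(seeds)
    while queue:
        block = queue.popleft()
        order.append(block)
        # Propagate only when the edit can affect other blocks.
        if edit_changes_interface(block):
            for dep in dependents.get(block, []):
                if dep not in seen:
                    seen.add(dep)
                    queue.append(dep)
    return order


# Toy repository: parse_config is used by load_app and run_tests,
# and load_app is used by main.
toy_dependents = {
    "parse_config": ["load_app", "run_tests"],
    "load_app": ["main"],
}

# Assume only the edit to parse_config changes a signature.
plan = plan_edits(toy_dependents, ["parse_config"], lambda b: b == "parse_config")
print(plan)  # ['parse_config', 'load_app', 'run_tests']
```

In this toy run, `main` is never scheduled because the edit to `load_app` is assumed not to change anything its callers rely on. That is the kind of pruning a dependency-aware plan allows compared with editing every file.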