Refactoring Codebases through Library Design

Published: 22 Sept 2025, Last Modified: 25 Nov 2025, DL4C @ NeurIPS 2025 Poster, CC BY 4.0
Keywords: Code, agents, refactoring, compression, library learning
TL;DR: We introduce a benchmark and a method for refactoring multiple programs into reusable libraries, and through a user study find that Minimum Description Length (MDL) best captures what makes a good refactoring.
Abstract: Maintainable and general software allows developers to build robust applications efficiently, yet achieving these qualities often requires refactoring specialized solutions into reusable components. This challenge becomes particularly relevant as code agents become increasingly accurate at solving isolated programming problems. We investigate code agents' capacity to refactor code in ways supporting growth and reusability. We first investigate what makes a good refactoring, finding via a human study that an MDL (Minimum Description Length) objective best aligns with developer preferences for code refactoring quality. We then present both a method and a benchmark for refactoring: LIBRARIAN, a sample-and-rerank method for generating reusable libraries, built on this objective, and MINICODE, a benchmark where code agents must minimize and refactor multiple independent solutions into a joint library. Compared to state-of-the-art code agents, LIBRARIAN achieves strong results on both compression and correctness on MINICODE, obtaining compression rates 1.6-2x better than coding agents while also improving correctness.
Submission Number: 29
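
The sample-and-rerank idea described in the abstract can be illustrated with a minimal sketch. This is not the LIBRARIAN implementation: the `Candidate` class, the `rerank` function, and the `clamp` toy programs are hypothetical, and description length is approximated by a whitespace token count purely for illustration. The abstract does not specify how the MDL objective is computed. The sketch only shows the general shape: sample several candidate refactorings, discard those that fail their tests, and keep the one with the shortest total description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One sampled refactoring: a shared library plus the rewritten programs."""
    library: str
    programs: List[str]
    passes_tests: bool  # assumed to come from running each program's tests


def description_length(candidate: Candidate) -> int:
    """Crude MDL proxy (an assumption): whitespace tokens in library + programs."""
    sources = [candidate.library, *candidate.programs]
    return sum(len(src.split()) for src in sources)


def rerank(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the shortest candidate that still passes its tests, or None."""
    valid = [c for c in candidates if c.passes_tests]
    return min(valid, key=description_length, default=None)


# Toy data: three programs that each duplicate the same helper.
HELPER = "def clamp(x, lo, hi): return max(lo, min(x, hi))"
CALLS = [
    "print(clamp(5, 0, 3))",
    "print(clamp(-1, 0, 3))",
    "print(clamp(2, 0, 3))",
]

# Candidate 1: leave the programs as they are (helper repeated three times).
original = Candidate(
    library="",
    programs=[f"{HELPER}\n{call}" for call in CALLS],
    passes_tests=True,
)

# Candidate 2: move the helper into a shared library and import it.
refactored = Candidate(
    library=HELPER,
    programs=[f"from lib import clamp\n{call}" for call in CALLS],
    passes_tests=True,
)

# Candidate 3: very short, but incorrect, so it must be filtered out.
broken = Candidate(library="", programs=["pass"], passes_tests=False)

best = rerank([original, refactored, broken])
```

In this toy case the refactored candidate is selected because the shared helper is stored once rather than three times, while the shortest candidate overall is rejected for failing its tests. Filtering on correctness before ranking by length is what keeps a length-based objective from rewarding degenerate, empty programs.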