Towards Self-Improving Language Models for Code Generation

Published: 11 Mar 2024, Last Modified: 15 Mar 2024
Venue: LLMAgents @ ICLR 2024 Poster
License: CC BY 4.0
Keywords: language model, self-improvement, expert iteration, code generation
Abstract: Language models for code generation improve predictably with the size of their training dataset, but corpora of high-quality human-written code are inevitably a finite, or at least slow-growing, resource. It is therefore desirable to find ways to train language models to improve autonomously, ideally starting from scratch. In this paper, we present a method that combines search and learning in an expert iteration scheme to improve code generation models without requiring any human-written code. Solutions to programming problems found by neurally-guided search provide training data for the language model. As training progresses, the language model increasingly internalizes knowledge about programming, enabling more efficient search, thereby solving harder problems. Using small, randomly initialized language models, we study how different design choices, such as the nature of the search procedure, the difficulty of the programming problems, and the tradeoff between training and search compute, influence the rate of learning progress.
Submission Number: 38
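
The search-then-learn loop described in the abstract can be illustrated with a minimal sketch. Everything below is a hypothetical toy and not the paper's implementation: the three-token instruction set, the `ToyPolicy` class (a unigram counter standing in for a randomly initialized language model), the sampling-based `search` (standing in for the neurally guided search), and all parameter values are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter

# Hypothetical toy instruction set; programs are token lists applied to x = 1.
TOKENS = ["inc", "dbl", "neg"]

def run(program, x=1):
    """Execute a toy program and return the final value."""
    for tok in program:
        if tok == "inc":
            x += 1
        elif tok == "dbl":
            x *= 2
        elif tok == "neg":
            x = -x
    return x

class ToyPolicy:
    """Stand-in for the language model: a unigram distribution over tokens."""

    def __init__(self):
        # Uniform counts play the role of a random initialization.
        self.counts = Counter({t: 1 for t in TOKENS})

    def sample(self, rng, length):
        weights = [self.counts[t] for t in TOKENS]
        return rng.choices(TOKENS, weights=weights, k=length)

    def train(self, programs):
        # "Training" here is just counting tokens in verified solutions.
        for program in programs:
            self.counts.update(program)

def search(policy, target, rng, budget, length=4):
    """Policy-guided search: sample programs, return the first verified one."""
    for _ in range(budget):
        program = policy.sample(rng, length)
        if run(program) == target:
            return program
    return None

def expert_iteration(problems, rounds=5, budget=50, seed=0):
    """Alternate search (data generation) and learning (policy update)."""
    rng = random.Random(seed)
    policy = ToyPolicy()
    solved_per_round = []
    for _ in range(rounds):
        solutions = []
        for target in problems:
            program = search(policy, target, rng, budget)
            if program is not None:
                solutions.append(program)
        policy.train(solutions)  # no human-written code is used
        solved_per_round.append(len(solutions))
    return policy, solved_per_round

if __name__ == "__main__":
    policy, solved = expert_iteration(problems=[4, 8, 16, 5])
    print("problems solved per round:", solved)
    print("token counts after training:", dict(policy.counts))
```

In this sketch, `budget` controls search compute per problem and `rounds` controls how often the policy is updated, which loosely mirrors the tradeoff between training and search compute that the abstract mentions; the toy does not reproduce any result from the paper.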