ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration
Keywords: Large Language Models, Code Generation, Code Library, Retrieval Augmented Generation
TL;DR: We introduce ExploraCoder, a novel framework that empowers code generation LLMs to handle multiple unseen APIs by planning invocation tasks and exploring API usage.
Abstract: Through training on publicly available source code libraries, large language models (LLMs) can invoke multiple encapsulated APIs to solve complex programming problems.
However, existing models inherently cannot generalize to APIs unseen in their training corpora. As libraries continuously evolve, it becomes impractical to exhaustively retrain LLMs on new API knowledge. This limitation prevents LLMs from solving problems that require newly introduced or privately maintained libraries.
Human programmers often explore unfamiliar APIs by writing experimental code before invoking them for a more complex problem.
Inspired by this behavior, we propose $\textbf{ExploraCoder}$, a training-free framework that empowers LLMs to invoke multiple unseen APIs in a code solution by (1) decomposing a complex problem into several API invocation subtasks, and (2) exploring correct API usage through a novel chain-of-API-exploration.
Concretely, ExploraCoder guides the LLM to iteratively generate several experimental API invocations for each simple subtask, where promising execution experiences are exploited by subsequent subtasks. This forms a chained exploration trace that ultimately guides the LLM in generating the final solution.
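To make the chained exploration concrete, here is a minimal, hypothetical Python sketch of the loop described above. The helper names (`plan_subtasks`, `propose_invocation`, `run_in_sandbox`) and the one-experiment-per-subtask simplification are illustrative assumptions of ours, not the authors' actual implementation.

```python
# Hypothetical sketch of a chain-of-API-exploration loop (not the authors' code).
from typing import Callable, List, Tuple

def chain_of_api_exploration(
    problem: str,
    plan_subtasks: Callable[[str], List[str]],          # LLM: decompose the problem
    propose_invocation: Callable[[str, str], str],      # LLM: draft experimental code
    run_in_sandbox: Callable[[str], Tuple[bool, str]],  # execute code, return (ok, output)
) -> str:
    trace = ""  # accumulated experience from promising (executable) experiments
    for subtask in plan_subtasks(problem):
        # Draft an experimental API invocation, conditioned on prior experience.
        snippet = propose_invocation(subtask, trace)
        ok, output = run_in_sandbox(snippet)
        if ok:
            # Keep only experiments that execute, so later subtasks can
            # reuse the observed API behavior.
            trace += f"\n# Subtask: {subtask}\n{snippet}\n# Observed: {output}\n"
    # Generate the final solution conditioned on the full exploration trace.
    return propose_invocation(problem, trace)

if __name__ == "__main__":
    # Toy stubs so the sketch runs end to end.
    subtasks = lambda p: [f"step {i} of {p}" for i in (1, 2)]
    propose = lambda task, trace: f"print('exploring: {task}')"
    sandbox = lambda code: (True, "ok")
    print(chain_of_api_exploration("demo problem", subtasks, propose, sandbox))
```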
We evaluate ExploraCoder on the Torchdata-Github benchmark as well as a newly constructed benchmark that involves more complex API interactions.
Experimental results demonstrate that ExploraCoder significantly improves performance for models lacking prior API knowledge, achieving an absolute improvement of 11.24\% over naive RAG approaches and 14.07\% over pretraining-based methods in pass@10. Moreover, integrating a self-debug mechanism further boosts ExploraCoder's performance on more challenging tasks. Comprehensive ablation and case studies provide further insights into the effectiveness of ExploraCoder.
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 14053