Neural Rankers for Code Generation via Inter-Cluster Modeling

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: visualization or interpretation of learned representations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: code generation, ai4code, reranking, pass@k, ml4code, codellm, code large language model
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A novel reranking method for code generation that models the relationships between clusters of solutions, achieving state-of-the-art performance on HumanEval and MBPP.
Abstract: Code Large Language Models (CodeLLMs) have ushered in a new era of code generation advancements. However, selecting the best solution from among all candidate CodeLLM outputs remains a challenge. Previous methods frequently overlooked the intricate functional similarities and interactions between solution clusters, yielding suboptimal results. In this work, we introduce SRank, a novel reranking strategy for selecting the best solution from code generation that focuses on modeling inter-cluster relationships. By quantifying the functional overlap between clusters, our approach provides a better ranking strategy for code solutions. Empirical results show that our method achieves remarkable pass@1 scores. For instance, on the HumanEval benchmark, we achieve 69.66% pass@1 with Codex002, 75.31% with WizardCoder, 53.99% with StarCoder, and 60.55% with CodeGen, surpassing state-of-the-art solution-ranking methods such as CodeT and Coder-Reviewer on the same CodeLLMs by a significant margin (≈6.1% improvement on average). Compared to random sampling, we achieve an average improvement of ≈23.07% on HumanEval. Even in scenarios with limited test inputs, our approach demonstrates robustness and superiority, marking a new benchmark in code generation reranking.
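To make the abstract's inter-cluster idea concrete, here is a minimal, hypothetical sketch: candidate solutions are grouped into clusters by their outputs on shared test inputs, and each cluster is scored by its size-weighted functional overlap with every other cluster. The `run` executor, the agreement-fraction overlap measure, and the size-weighted scoring rule are illustrative assumptions for this sketch, not necessarily SRank's exact formulation.

```python
from collections import defaultdict

def cluster_by_outputs(solutions, test_inputs, run):
    """Group candidate programs whose outputs agree on all test inputs.

    `run(solution, x)` is an assumed (e.g., sandboxed) executor returning
    the program's output on input x.
    """
    clusters = defaultdict(list)
    for sol in solutions:
        # The tuple of outputs on the shared test inputs is the cluster key:
        # solutions with identical behavior land in the same cluster.
        signature = tuple(run(sol, x) for x in test_inputs)
        clusters[signature].append(sol)
    return list(clusters.items())

def functional_overlap(sig_a, sig_b):
    """Fraction of test inputs on which two clusters produce the same output."""
    agree = sum(a == b for a, b in zip(sig_a, sig_b))
    return agree / len(sig_a)

def rank_clusters(clusters):
    """Score each cluster by its size-weighted overlap with all other clusters,
    so clusters that partially agree with many/large clusters rank higher."""
    scored = []
    for i, (sig_i, members_i) in enumerate(clusters):
        score = sum(
            functional_overlap(sig_i, sig_j) * len(members_j)
            for j, (sig_j, members_j) in enumerate(clusters)
            if j != i
        )
        scored.append((score, members_i))
    # Highest-scoring cluster first; any member of it is the selected solution.
    return sorted(scored, key=lambda s: s[0], reverse=True)
```

Under this sketch, a solution from the top-ranked cluster would be returned as the final answer, which is how a cluster-level score translates into the pass@1 selection the abstract reports.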
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8325