Rethinking Code Similarity for Automated Algorithm Design with LLMs

ICLR 2026 Conference Submission24547 Authors

20 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | Readers: Everyone | License: CC BY 4.0
Keywords: Algorithm Similarity, Automated Algorithm Design, Large Language Model
Abstract: Recent advances in Large Language Models (LLMs) have changed how algorithms are designed. A new paradigm, LLM-based Automated Algorithm Design (LLM-AAD), has emerged in which LLMs generate code implementations of high-quality algorithms. Unlike traditional expert-driven algorithm development, in LLM-AAD the ideas behind an algorithm are often only implicit in the generated code. Measuring the similarity between algorithms can therefore help identify whether a generated algorithm is innovative or merely a syntactic refinement of an existing implementation. However, existing code similarity metrics have a critical limitation when applied to algorithms: they do not necessarily reflect the similarity between the algorithms themselves. To address this, we introduce a novel perspective that defines algorithm similarity in terms of problem-solving behavior. We represent the problem-solving trajectory of an algorithm as the sequence of intermediate solutions it progressively generates, and compute behavioral similarity as the resemblance between two such trajectories. Our approach focuses on how an algorithm solves a problem, not just on its code implementation or final output. We demonstrate the utility of our similarity measure in two use cases. (i) Improving LLM-AAD: integrating our similarity measure into a search method yields promising results on two AAD tasks, indicating that maintaining behavioral diversity during algorithm search is effective. (ii) Algorithm analysis: our similarity measure provides a new perspective for analyzing algorithms, revealing distinctions in their problem-solving behaviors.
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 24547
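
Illustrative sketch (not from the submission): the abstract describes comparing algorithms by their problem-solving trajectories, that is, the sequences of intermediate solutions they produce. The abstract does not say how the resemblance between two trajectories is computed, so the Python sketch below makes its own assumptions: solutions are bit strings, the per-step distance is a normalized Hamming distance, trajectories are aligned step by step (the shorter one padded with its final solution), and similarity is one minus the mean distance. All function names and the toy bit-filling task are hypothetical.

```python
from typing import List, Sequence, Tuple

Solution = Tuple[int, ...]


def hamming(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of positions at which two equal-length solutions differ."""
    if len(a) != len(b):
        raise ValueError("solutions must have the same length")
    if len(a) == 0:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def trajectory_similarity(t1: List[Solution], t2: List[Solution]) -> float:
    """Assumed measure: 1 minus the mean per-step Hamming distance.

    The shorter trajectory is padded by repeating its final solution,
    so both trajectories are compared over the same number of steps.
    """
    if not t1 or not t2:
        raise ValueError("trajectories must be non-empty")
    steps = max(len(t1), len(t2))
    p1 = list(t1) + [t1[-1]] * (steps - len(t1))
    p2 = list(t2) + [t2[-1]] * (steps - len(t2))
    mean_dist = sum(hamming(a, b) for a, b in zip(p1, p2)) / steps
    return 1.0 - mean_dist


# Three toy "algorithms" that turn an all-zeros bit string into all ones
# and record every intermediate solution along the way.
def fill_left_to_right(n: int) -> List[Solution]:
    sol = [0] * n
    traj = [tuple(sol)]
    for i in range(n):
        sol[i] = 1
        traj.append(tuple(sol))
    return traj


def fill_right_to_left(n: int) -> List[Solution]:
    sol = [0] * n
    traj = [tuple(sol)]
    for i in reversed(range(n)):
        sol[i] = 1
        traj.append(tuple(sol))
    return traj


def fill_left_to_right_v2(n: int) -> List[Solution]:
    # Written differently from fill_left_to_right, but behaves identically.
    return [tuple([1] * k + [0] * (n - k)) for k in range(n + 1)]


if __name__ == "__main__":
    a = fill_left_to_right(4)
    b = fill_right_to_left(4)
    c = fill_left_to_right_v2(4)
    print("same behavior, different code:", trajectory_similarity(a, c))
    print("same final output, different behavior:", trajectory_similarity(a, b))
```

Under these assumptions, the two left-to-right variants score 1.0 despite their different source code, while the left-to-right and right-to-left variants score about 0.6 even though both end at the same final solution. This mirrors the abstract's distinction between code, final output, and problem-solving behavior; the measure actually used in the paper may differ.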