Improving Code Translation Correctness and Efficiency with Multi-Perspective Exploration and Difference-Aware Selection
Keywords: Code Translation, Large Language Models
TL;DR: We present SwiftTrans, a framework that improves both the correctness and the execution efficiency of LLM-translated code, and show that it outperforms GPT-5 on our newly constructed efficiency-oriented benchmarks.
Abstract: While large language models (LLMs) have greatly advanced the functional correctness of automated code translation systems, the runtime efficiency of translated programs has received comparatively little attention.
With the waning of Moore’s law, runtime efficiency has become as critical as functional correctness in evaluating program quality.
Our preliminary study reveals that LLM-translated programs often run slower than human-written ones, and this issue cannot be remedied through prompt engineering alone.
Therefore, we propose SwiftTrans, a code translation framework comprising two key stages:
(1) Multi-Perspective Exploration, where MpTranslator leverages parallel in-context learning (ICL) to generate diverse translation candidates;
and (2) Difference-Aware Selection, where DiffSelector identifies the optimal candidate by explicitly comparing differences between translations.
We further introduce Hierarchical Guidance for MpTranslator and Ordinal Guidance for DiffSelector, enabling LLMs to better adapt to these two core components.
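To make the two stages concrete, below is a minimal Python sketch of the pipeline described above; it is not the authors' implementation. It assumes a generic `llm(prompt) -> str` callable, and the prompt templates, exemplar sets, and function names (`multi_perspective_explore`, `difference_aware_select`) are illustrative placeholders rather than the paper's actual API.

```python
# Illustrative sketch of the two-stage SwiftTrans-style pipeline (assumed names).
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

def multi_perspective_explore(
    llm: Callable[[str], str],
    source_code: str,
    exemplar_sets: Sequence[str],
) -> list[str]:
    """Stage 1: issue parallel ICL requests, each conditioned on a different
    set of few-shot exemplars, to obtain diverse translation candidates."""
    prompts = [
        f"{exemplars}\n\nTranslate the following program:\n{source_code}"
        for exemplars in exemplar_sets
    ]
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        return list(pool.map(llm, prompts))

def difference_aware_select(
    llm: Callable[[str], str],
    candidates: list[str],
) -> str:
    """Stage 2: compare candidates pairwise, asking the model to judge which
    side of the explicit comparison is more likely correct and efficient."""
    best = candidates[0]
    for challenger in candidates[1:]:
        verdict = llm(
            "Compare the two translations below and answer 'A' or 'B' for the "
            "one that is more likely correct and runs faster.\n"
            f"A:\n{best}\n\nB:\n{challenger}"
        )
        if verdict.strip().upper().startswith("B"):
            best = challenger
    return best
```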
To evaluate the runtime efficiency of translated programs, we extend the existing benchmarks CodeNet and F2SBench with efficiency-critical test cases and maximum-runtime constraints.
We also introduce SwiftBench, a new benchmark designed to evaluate whether translation models can improve the efficiency of programs when the source code exhibits inefficiencies.
Experimental results across all three benchmarks show that SwiftTrans achieves consistent improvements in both correctness and efficiency.
Notably, SwiftTrans built on Qwen2.5-7B surpasses current state-of-the-art approaches such as GPT-5 and the training-based F2STrans.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 19426