Deterministic vs. LLM-Controlled Orchestration for COBOL-to-Python Modernization

Naing Oo Lwin; Rajesh Kumar

Published: 28 Mar 2026, Last Modified: 08 May 2026 | AIware 2026 | Everyone | CC BY 4.0

Keywords: Legacy Code Modernization, COBOL-to-Python Translation, LLM-based Software Engineering, Agentic AI

Abstract: Modernizing legacy COBOL systems remains difficult due to scarce expertise, large and long-lived codebases, and strict correctness requirements. Recent large language model (LLM)-based modernization systems increasingly rely on agentic workflows in which the model controls multi-step tool execution. However, it remains unclear whether delegating execution control to the LLM improves correctness, robustness, or efficiency in structured software engineering workflows. We present a controlled empirical study of deterministic and LLM-controlled orchestration for COBOL-to-Python modernization. Using a unified experimental framework, we hold the language models, prompts, tools, configurations, and source programs constant while varying only the execution control strategy. This isolates orchestration as the sole experimental variable. We evaluate both approaches using functional correctness, robustness across repeated stochastic runs, and computational efficiency. Across multiple models, deterministic orchestration achieves comparable computational accuracy to LLM-controlled orchestration while improving worst-case robustness and reducing performance variability across runs. Deterministic execution also reduces token consumption by up to 3.5x, leading to substantially lower operational cost. These results suggest that, in structured modernization workflows with explicit validation stages, fixed execution policies provide more stable and cost-efficient behavior than fully agentic orchestration without reducing translation quality.
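The contrast between the two control strategies can be illustrated with a minimal sketch. It is not the authors' framework: the tool names (`translate`, `validate`, `repair`), the `State` record, and the `scripted_policy` stand-in for an LLM decision are all hypothetical, and the tools are stubs rather than real model or test-harness calls. The sketch only shows the experimental idea described in the abstract, in which both orchestrators share the same tools and differ solely in who decides the next step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class State:
    cobol: str
    python: str = ""
    valid: bool = False
    trace: List[str] = field(default_factory=list)  # order of tool calls


# Hypothetical tool stubs; a real system would call an LLM and a test harness.
def translate(s: State) -> None:
    s.python = "print('HELLO')" if "DISPLAY" in s.cobol else ""
    s.trace.append("translate")


def validate(s: State) -> None:
    s.valid = bool(s.python)
    s.trace.append("validate")


def repair(s: State) -> None:
    s.python = s.python or "pass"
    s.trace.append("repair")


# The same tool set is shared by both orchestrators.
TOOLS: Dict[str, Callable[[State], None]] = {
    "translate": translate,
    "validate": validate,
    "repair": repair,
}


def run_deterministic(cobol: str, max_repairs: int = 2) -> State:
    """Fixed policy: translate, validate, then a bounded repair loop."""
    s = State(cobol)
    translate(s)
    validate(s)
    for _ in range(max_repairs):
        if s.valid:
            break
        repair(s)
        validate(s)
    return s


def run_llm_controlled(
    cobol: str, choose_next: Callable[[State], str], max_steps: int = 8
) -> State:
    """The policy (an LLM in practice) picks the next tool at every step."""
    s = State(cobol)
    for _ in range(max_steps):
        action = choose_next(s)
        if action == "stop":
            break
        TOOLS[action](s)
    return s


def scripted_policy(s: State) -> str:
    """Stand-in for an LLM decision; a real model may choose differently."""
    if not s.trace:
        return "translate"
    if s.trace[-1] in ("translate", "repair"):
        return "validate"
    return "stop" if s.valid else "repair"


det = run_deterministic("STOP RUN.")
agent = run_llm_controlled("STOP RUN.", scripted_policy)
print(det.trace)
print(agent.trace)
```

With this scripted policy the two traces coincide. In an actual LLM-controlled run the sequence of tool calls could vary between repetitions, which is the kind of variability the study measures.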

Revision Summary: Revised and proofread the manuscript for clarity, grammar, readability, and conciseness. Reduced verbosity and repetitive phrasing, improved sentence structure and organization, refined technical explanations, and strengthened discussion of robustness, execution control, and computational efficiency. Updated related work, evaluation methodology, discussion, limitations, and conclusion sections for improved clarity and consistency.

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public.

Paper Type: Full-length papers (i.e. case studies, theoretical, applied research papers). 8 pages

Reroute: true

Submission Number: 53
