Fusion is the New Mutation: Bandit-Guided Evolution on Workflow Graphs
Keywords: LLM Agent; Agentic Workflows; Multi-Armed Bandit
Abstract: Agentic workflows, which orchestrate multiple large language model (LLM) calls, tools, and control flow, have become a powerful paradigm for complex reasoning tasks. Recent work formulates automated workflow discovery as a search problem in code space, but most existing methods rely on tree-based search with single-parent mutation, leading to structural isolation and inefficient exploration in highly discrete and compositional spaces. We propose DAGO (Directed Acyclic Graph Optimization), a principled framework that models workflow search as evolution over a directed acyclic graph with multi-parent fusion. DAGO maintains a shared DAG memory that enables lineage merging across search branches, and formulates parent selection as a contextual linear bandit problem. Using pretrained code embeddings and a Linear UCB policy, DAGO efficiently navigates the combinatorial fusion space. An LLM is then employed as a semantic fusion operator to synthesize new workflows by integrating complementary strengths from multiple parents. We evaluate DAGO on six benchmarks covering mathematical reasoning, code generation, and question answering. DAGO achieves state-of-the-art average performance while reducing token consumption. Ablation studies further confirm that multi-parent fusion and bandit-guided selection are key to its effectiveness.
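To make the parent-selection step concrete, the following is a minimal sketch of a shared-parameter Linear UCB policy that scores candidate parents by their embeddings and picks the top k for fusion. It is an illustration under stated assumptions, not DAGO's implementation: the class `LinUCB`, the helper `select_parents`, the parameters `alpha` and `ridge`, and the toy two-dimensional embeddings are all hypothetical, and the abstract does not specify how rewards are attributed to parents.

```python
import math


class LinUCB:
    """Shared-parameter Linear UCB in pure Python (illustrative sketch only)."""

    def __init__(self, dim, alpha=1.0, ridge=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration weight (assumed hyperparameter)
        # Track A^-1 directly; A starts as ridge * I, and b starts at zero.
        self.A_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _matvec(self, v):
        # Multiply A^-1 by the vector v.
        return [sum(row[j] * v[j] for j in range(self.dim)) for row in self.A_inv]

    def score(self, x):
        # Upper confidence bound: theta . x + alpha * sqrt(x^T A^-1 x).
        theta = self._matvec(self.b)
        a_inv_x = self._matvec(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(0.0, sum(xi * v for xi, v in zip(x, a_inv_x))))
        return mean + self.alpha * width

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^-1 after A += x x^T.
        a_inv_x = self._matvec(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(self.dim):
            for j in range(self.dim):
                self.A_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        # Accumulate the reward-weighted context: b += reward * x.
        for i in range(self.dim):
            self.b[i] += reward * x[i]


def select_parents(bandit, embeddings, k=2):
    """Return the names of the k workflows with the highest UCB score."""
    ranked = sorted(embeddings, key=lambda name: bandit.score(embeddings[name]),
                    reverse=True)
    return ranked[:k]


# Toy usage: hypothetical 2-d embeddings standing in for pretrained code embeddings.
embeddings = {"good": [1.0, 0.0], "bad": [0.0, 1.0]}
bandit = LinUCB(dim=2, alpha=0.1)
for _ in range(5):
    bandit.update(embeddings["good"], reward=1.0)  # child of "good" scored well
    bandit.update(embeddings["bad"], reward=0.0)   # child of "bad" scored poorly
chosen = select_parents(bandit, embeddings, k=1)
```

In a full system the selected parents would then be handed to the LLM fusion operator, and the evaluated score of the resulting child workflow would feed back through `update`. Tracking the inverse matrix with a rank-one update avoids a matrix inversion at every step, which matters when embeddings are high-dimensional.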
Submission Number: 54