MULTI-COMPONENT OUTCOME PREDICTION FOR ENTERPRISE ROUTING VIA HIERARCHICAL CREDIT ASSIGNMENT

Published: 02 Mar 2026, Last Modified: 15 Apr 2026
Venue: ICLR 2026 Workshop World Models
License: CC BY 4.0
Keywords: counterfactual outcome prediction, monte carlo tree search, reinforcement learning, hierarchical credit assignment, enterprise routing, case-based reasoning, intelligent routing
TL;DR: GIF-MCTS combines four-component outcome prediction with MCTS planning and hierarchical credit assignment to achieve 72–98% routing accuracy across five enterprise domains without real-world feedback.
Abstract: We achieve 72–98% routing accuracy across five enterprise domains (93.5K items), improving over the best baseline by 4.0–5.1 percentage points (p<0.01, Cohen’s d≥4.8), by combining learned outcome predictors with Monte Carlo Tree Search (MCTS) planning. Our framework, GIF-MCTS, addresses a key gap: enterprise routing decisions (assigning bugs to teams, tickets to agents, complaints to departments) involve delayed feedback (days to weeks), making standard RL impractical. GIF combines four complementary predictors (Case-Based Reasoning, gradient-boosted outcome estimation, human behavior modeling, and edge-case detection) into a unified reward predictor that enables MCTS planning without real-world interaction. We further propose Hierarchical Credit Assignment (HICRA), which amplifies learning signals for high-impact routing decisions by α_s ≈ E[τ_s]/E[τ_t], yielding 28–40% faster convergence.
Submission Number: 26
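
Illustrative note on the HICRA amplification factor: the abstract gives the ratio α_s ≈ E[τ_s]/E[τ_t] but does not define τ_s and τ_t or say how the expectations are estimated. The sketch below is only a worked arithmetic example under stated assumptions: both quantities are treated as positive samples, the expectations are approximated by sample means, and the factor multiplies a scalar learning signal for decisions flagged as high-impact. The function names, sample values, and the `is_high_impact` flag are hypothetical and are not taken from the paper.

```python
from statistics import mean


def amplification_factor(tau_s, tau_t):
    """Estimate alpha_s ~ E[tau_s] / E[tau_t] with sample means (an assumption)."""
    if not tau_s or not tau_t:
        raise ValueError("both samples must be non-empty")
    denom = mean(tau_t)
    if denom <= 0:
        raise ValueError("mean of tau_t must be positive")
    return mean(tau_s) / denom


def amplify(signal, alpha, is_high_impact):
    """Scale a learning signal by alpha only for decisions flagged as high-impact."""
    return alpha * signal if is_high_impact else signal


# Hypothetical samples, chosen only to make the arithmetic easy to follow.
tau_s_samples = [12.0, 8.0, 10.0]  # sample mean 10.0
tau_t_samples = [2.0, 3.0, 1.0]    # sample mean 2.0

alpha = amplification_factor(tau_s_samples, tau_t_samples)  # 10.0 / 2.0
boosted = amplify(0.5, alpha, is_high_impact=True)
unchanged = amplify(0.5, alpha, is_high_impact=False)
```

With these made-up samples the ratio is 5, so a signal of 0.5 becomes 2.5 for a flagged decision and stays at 0.5 otherwise. How the paper actually applies α_s inside its learning update is not stated in the abstract.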