Keywords: counterfactual outcome prediction, monte carlo tree search, reinforcement learning, hierarchical credit assignment, enterprise routing, case-based reasoning, intelligent routing
TL;DR: GIF-MCTS combines four-component outcome prediction with MCTS planning and hierarchical credit assignment to achieve 72–98% routing accuracy across five enterprise domains without real-world feedback.
Abstract: We achieve 72–98% routing accuracy across five enterprise domains (93.5K
items), improving over the best baseline by 4.0–5.1 percentage points (p<0.01,
Cohen’s d≥4.8), by combining learned outcome predictors with Monte Carlo
Tree Search planning. Our framework, GIF-MCTS, addresses a key gap: enterprise routing decisions—assigning bugs to teams, tickets to agents, complaints
to departments—involve delayed feedback (days to weeks), making standard RL
impractical. GIF combines four complementary predictors (Case-Based Reasoning, gradient-boosted outcome estimation, human behavior modeling, and edge-case detection) into a unified reward predictor that enables MCTS planning without real-world interaction. We further propose Hierarchical Credit Assignment
(HICRA), which amplifies learning signals for high-impact routing decisions by
a factor $\alpha_s \approx E[\tau_s]/E[\tau_t]$, yielding 28–40% faster convergence.
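As a rough illustration of the amplification factor in the last sentence, the following Python sketch estimates α_s as a ratio of sample means and uses it to scale the learning signals of decisions flagged as high-impact. The abstract does not define τ_s and τ_t or say how the factor enters the update, so the sample values, the `high_impact` flags, and the function names `amplification_factor` and `amplify` are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import fmean


def amplification_factor(tau_s, tau_t):
    """Estimate alpha_s ~= E[tau_s] / E[tau_t] from two sets of samples.

    What tau_s and tau_t measure is not specified in the abstract; here they
    are simply two non-empty collections of numbers (assumption).
    """
    mean_t = fmean(tau_t)
    if mean_t == 0:
        raise ValueError("E[tau_t] must be non-zero")
    return fmean(tau_s) / mean_t


def amplify(signals, high_impact, alpha):
    """Scale the signals of high-impact decisions by alpha; leave others as-is.

    Multiplying the signal is one plausible reading of "amplifies" (assumption).
    """
    return [alpha * g if flag else g for g, flag in zip(signals, high_impact)]


# Made-up samples, chosen only so the arithmetic is easy to follow.
alpha = amplification_factor([12.0, 8.0, 10.0], [2.0, 3.0, 1.0])
scaled = amplify([0.5, -0.2, 0.1], [True, False, True], alpha)
print(alpha, scaled)
```

With these made-up samples the two means are 10 and 2, so the first and third signals are multiplied by 5 and the second is left unchanged.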
Submission Number: 26