Keywords: reinforcement learning, optimization, hierarchical policy
Abstract: Hierarchical decision-making frameworks are pivotal for addressing complex control tasks, enabling agents to decompose intricate problems into manageable subgoals. Despite their promise, existing hierarchical policies face critical limitations: (i) reinforcement learning (RL)-based methods struggle to guarantee strict constraint satisfaction, and (ii) optimization-based approaches often rely on myopic and computationally prohibitive formulations. In this work, we propose a bi-level reinforcement learning and optimization framework that systematically integrates high-level goal abstraction with structured lower-level decision making. We adopt an inverse optimization approach to infer the structure of the lower-level problem from expert demonstrations, ensuring that the objective of the lower-level policy remains aligned with the overall long-term task goal. To validate the approach, we evaluate our framework on three real-world scenarios, where it outperforms baseline methods in both efficiency and decision quality, demonstrating the benefits of learning structured optimization policies within a hierarchical RL architecture.
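As a rough illustration of the bi-level structure the abstract describes (not the paper's actual method), a minimal sketch is given below: a high-level policy proposes subgoals, and a constrained lower-level problem with a closed-form solution tracks them. The environment, function names, quadratic objective, and box constraint are all illustrative assumptions introduced here.

```python
# Minimal sketch of a bi-level hierarchy on a point-mass toy problem.
# The quadratic lower-level objective and box constraint are assumptions,
# not the formulation learned via inverse optimization in the paper.
import numpy as np

def high_level_policy(state, goal, horizon=5):
    """Stand-in for the learned high-level RL policy: propose a subgoal
    a few steps ahead along the direction to the final goal."""
    direction = goal - state
    return state + direction / max(horizon, 1)

def low_level_control(state, subgoal, u_max=0.2):
    """Structured lower-level problem: minimize ||state + u - subgoal||^2
    subject to a box constraint on u; the projection gives the exact solution."""
    u = subgoal - state                      # unconstrained minimizer
    return np.clip(u, -u_max, u_max)         # enforce the constraint exactly

state = np.zeros(2)
goal = np.array([1.0, -0.5])
for t in range(50):
    subgoal = high_level_policy(state, goal)
    u = low_level_control(state, subgoal)
    state = state + u                        # trivial single-integrator dynamics
print("final state:", state)
```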
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 18287