Keywords: reinforcement learning, optimization, hierarchical policy
Abstract: Hierarchical decision-making frameworks are pivotal for addressing complex control tasks, enabling agents to decompose intricate problems into manageable subgoals. Despite their promise, existing hierarchical policies face critical limitations: (i) reinforcement learning (RL)-based methods struggle to guarantee strict constraint satisfaction, and (ii) optimization-based approaches often rely on myopic and computationally prohibitive formulations. In this work, we propose a bi-level reinforcement learning and optimization framework that systematically integrates high-level goal abstraction with structured lower-level decision making. We adopt an inverse optimization approach to infer the structure of the lower-level problem from expert demonstrations, ensuring that the objective of the lower-level policy remains aligned with the overall long-term task goal. To validate the approach, we evaluate our framework on three real-world scenarios, where it outperforms baseline methods in both efficiency and decision quality, demonstrating the benefits of learning structured optimization policies within a hierarchical RL architecture.
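As a rough illustration of the bi-level structure the abstract describes (not the paper's actual method), a minimal sketch is given below: a high-level policy proposes subgoals, and a constrained lower-level problem with a closed-form solution tracks them. The environment, function names, quadratic objective, and box constraint are all illustrative assumptions introduced here.

```python
# Minimal sketch of a bi-level hierarchy on a point-mass toy problem.
# The quadratic lower-level objective and box constraint are assumptions,
# not the formulation learned via inverse optimization in the paper.
import numpy as np

def high_level_policy(state, goal, horizon=5):
    """Stand-in for the learned high-level RL policy: propose a subgoal
    a few steps ahead along the direction to the final goal."""
    direction = goal - state
    return state + direction / max(horizon, 1)

def low_level_control(state, subgoal, u_max=0.2):
    """Structured lower-level problem: minimize ||state + u - subgoal||^2
    subject to a box constraint on u; the projection gives the exact solution."""
    u = subgoal - state                      # unconstrained minimizer
    return np.clip(u, -u_max, u_max)         # enforce the constraint exactly

state = np.zeros(2)
goal = np.array([1.0, -0.5])
for t in range(50):
    subgoal = high_level_policy(state, goal)
    u = low_level_control(state, subgoal)
    state = state + u                        # trivial single-integrator dynamics
print("final state:", state)
```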
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 18287