Abstract: In hierarchical reinforcement learning, human expertise is often used to define sub-goals that decompose the final objective into relevant sub-tasks. However, existing approaches based on human-defined sub-goals often lack crucial information about the correlations among those sub-goals, limiting their applicability in environments with multiple parallel tasks or mutually conflicting solutions. To address this issue, we propose a mixed-initiative Bayesian sub-goal optimization algorithm that combines human expertise with automated AI reasoning to identify reasonable sub-goals. Our algorithm employs a probabilistic graphical model to capture the correlations among candidate sub-goals and refines the encoded knowledge to reduce the biases it introduces. We conduct experiments in high-dimensional environments with both discrete and continuous controls. Compared with relevant baselines, our algorithm achieves better performance on problems with multiple alternative solutions. We further demonstrate empirically that our approach is robust to varying levels of human knowledge and expertise, consistently converging to optimal hierarchical policies even under misleading or conflicting human guidance.