\section{Conclusion}\label{sec:conclusion}

We have proposed a framework for robot guidance of a human agent performing hierarchical tasks; the framework incorporates observations of task-related movements and gaze to estimate the agent's intention. We implemented and evaluated the framework in a VR posing task. Comparisons with a human wizard reveal that our framework provides guidance close to human-level performance in terms of overall usability and the timeliness and precision of guidance. Ablation experiments in which gaze was removed show that this high level of performance is a direct result of incorporating gaze information to resolve ambiguities inherent in the hierarchical structure of the task.