Keywords: Reinforcement Learning in Debugging
Abstract: We propose the Hierarchical Feedback Interface (HFI) for human-in-the-loop
reinforcement learning in debugging, which structures human feedback
into high-level objectives and low-level refinements to address the
subjectivity and inefficiency of ad-hoc corrections. HFI employs a
two-tiered policy architecture in which a high-level policy abstracts
debugging goals into interpretable meta-objectives, and a low-level
policy translates these into actionable feedback, thus grounding
human input in aligned, goal-directed reasoning. The framework integrates a
hierarchical actor-critic mechanism: the high-level policy
generates goal vectors over reduced state representations, while the
low-level policy conditions on both code-specific features and these
goals to generate context-aware feedback.
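The two-tiered architecture described in the abstract can be sketched as below. This is a minimal illustrative outline, not the authors' implementation: all class names, dimensions, and the linear policy parameterizations are assumptions introduced here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

class HighLevelPolicy:
    """Maps a reduced state representation to a goal vector (meta-objective).
    Illustrative linear policy; the paper's actual parameterization is unspecified."""
    def __init__(self, state_dim, goal_dim):
        self.W = rng.normal(scale=0.1, size=(goal_dim, state_dim))

    def goal(self, reduced_state):
        # tanh keeps the goal vector bounded and interpretable as a direction
        return np.tanh(self.W @ reduced_state)

class LowLevelPolicy:
    """Conditions on both code-specific features and the high-level goal vector
    to produce a distribution over feedback actions (context-aware feedback)."""
    def __init__(self, feat_dim, goal_dim, n_actions):
        self.W = rng.normal(scale=0.1, size=(n_actions, feat_dim + goal_dim))

    def action_probs(self, code_features, goal):
        logits = self.W @ np.concatenate([code_features, goal])
        exp = np.exp(logits - logits.max())  # numerically stable softmax
        return exp / exp.sum()

# Hypothetical dimensions: 8-dim reduced state, 4-dim goal, 6-dim code features,
# and 3 feedback actions (e.g. "add assertion", "narrow scope", "inspect variable").
high = HighLevelPolicy(state_dim=8, goal_dim=4)
low = LowLevelPolicy(feat_dim=6, goal_dim=4, n_actions=3)

g = high.goal(rng.normal(size=8))          # high-level goal from reduced state
p = low.action_probs(rng.normal(size=6), g)  # low-level feedback distribution
```

A hierarchical actor-critic would additionally maintain a critic per tier; this sketch shows only the actor decomposition, where the goal vector is the sole channel through which the high-level policy shapes low-level feedback.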
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 25378