Keywords: hierarchical reinforcement learning, reinforcement learning
TL;DR: Hypothesis-preserving ensembles enable generalization of subgoals learned from a single task to new tasks.
Abstract: A reinforcement learning agent trained on a single source subgoal has no way to determine during training which features will be relevant for future instances of that same subgoal. This creates ambiguity: multiple plausible models of a subgoal can fit the training data, but not all will transfer successfully. Humans address this ambiguity by maintaining alternative hypotheses until new information reveals the most effective one. Drawing inspiration from this, we introduce a hypothesis-preserving ensemble in which each member is a distinct, plausible subgoal classifier trained on the same source task. The agent then tests these alternative hypotheses in a new task, learning policies for the corresponding subtasks and using task reward to select the most effective classifier. Experiments on Montezuma's Revenge and MiniGrid DoorMultiKey show that our method recovers subgoals learned in the source task, successfully adapting them to visually different tasks.
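To make the selection mechanism concrete, here is a minimal sketch of hypothesis-preserving ensemble selection. It is not the authors' implementation: the diversity mechanism (random feature subsets), the prototype classifier, and all names (`train_hypothesis_ensemble`, `is_subgoal`, `select_classifier`, `learn_policy_for_subgoal`) are illustrative assumptions; any RL algorithm that trains a policy toward the states a classifier accepts could stand in for the stubbed reward estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_hypothesis_ensemble(states, labels, n_members=5):
    """Fit several plausible subgoal classifiers on the same source data.

    Each member attends to a different random feature subset (an assumed
    diversity mechanism), so all members can fit the source task while
    generalizing differently to new tasks.
    """
    ensemble = []
    n_features = states.shape[1]
    for _ in range(n_members):
        subset = rng.choice(n_features, size=max(1, n_features // 2),
                            replace=False)
        # Simple prototype classifier: mean of subgoal states over the subset.
        prototype = states[labels == 1][:, subset].mean(axis=0)
        ensemble.append((subset, prototype))
    return ensemble

def is_subgoal(classifier, state, threshold=0.5):
    """A state is a subgoal if it lies near the member's prototype."""
    subset, prototype = classifier
    return np.linalg.norm(state[subset] - prototype) < threshold

def select_classifier(ensemble, learn_policy_for_subgoal):
    """Test each hypothesis in the new task and keep the best one.

    `learn_policy_for_subgoal` is a stand-in for any RL routine that trains
    a policy to reach states the given classifier accepts and returns the
    resulting average task reward.
    """
    rewards = [learn_policy_for_subgoal(clf) for clf in ensemble]
    best = int(np.argmax(rewards))
    return ensemble[best], rewards[best]

# Toy usage: 2-D states where only feature 0 actually matters for transfer.
states = rng.normal(size=(200, 2))
labels = (states[:, 0] > 1.0).astype(int)
ensemble = train_hypothesis_ensemble(states, labels)
# Fake reward: 1.0 if the classifier accepts a known-good target-task state.
fake_reward = lambda clf: float(is_subgoal(clf, np.array([1.5, 0.0]), 1.0))
best_clf, reward = select_classifier(ensemble, fake_reward)
```

The key design point illustrated here is that no hypothesis is discarded during source training; selection is deferred until task reward in the new task can discriminate among them.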
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 13360