Learning Task Decomposition with Ordered Memory Policy Network

Yuchen Lu; Yikang Shen; Siyuan Zhou; Aaron Courville; Joshua B. Tenenbaum; Chuang Gan

Published: 12 Jan 2021; Last Modified: 22 Jun 2025; ICLR 2021 Poster; Readers: Everyone

Keywords: Task Segmentation, Hierarchical Imitation Learning, Network Inductive Bias

Abstract: Many complex real-world tasks are composed of several levels of subtasks. Humans leverage these hierarchical structures to accelerate the learning process and achieve better generalization. In this work, we study the inductive bias and propose Ordered Memory Policy Network (OMPN) to discover subtask hierarchy by learning from demonstration. The discovered subtask hierarchy could be used to perform task decomposition, recovering the subtask boundaries in an unstructured demonstration. Experiments on Craft and Dial demonstrate that our model can achieve higher task decomposition performance under both unsupervised and weakly supervised settings, compared with strong baselines. OMPN can also be directly applied to partially observable environments and still achieve higher task decomposition performance. Our visualization further confirms that the subtask hierarchy can emerge in our model.

One-sentence Summary: We introduce an Ordered Memory Policy Network (OMPN) to discover task decomposition by imitation learning from demonstration.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Supplementary Material: zip

Community Implementations: [4 code implementations](https://www.catalyzex.com/paper/learning-task-decomposition-with-ordered/code)
