Unsupervised Learning of Temporal Abstractions using Slot-based Transformers

Published: 12 Oct 2021, 19:37 (modified: 30 Nov 2021, 15:11)
Venue: Deep RL Workshop NeurIPS 2021
Readers: Everyone
Keywords: imitation learning, transformers, sub-routine discovery
TL;DR: We propose a novel fully parallel architecture for unsupervised discovery of modular sub-routines using offline RL data.
Abstract: The discovery of reusable sub-routines simplifies decision-making and planning in complex reinforcement learning problems. Previous approaches propose to learn such temporal abstractions in a purely unsupervised fashion through observing state-action trajectories gathered from executing a policy. However, a current limitation is that they process each trajectory in an entirely sequential manner, which prevents them from revising earlier decisions about sub-routine boundary points in light of new incoming information. In this work we propose SloTTAr, a fully parallel approach that integrates sequence processing Transformers with a Slot Attention module for learning about sub-routines in an unsupervised fashion. We demonstrate how SloTTAr is capable of outperforming strong baselines in terms of boundary point discovery, while being up to $30\mathrm{x}$ faster on existing benchmarks.
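The abstract names the key mechanism: a Slot Attention module in which a fixed set of slots compete over the encoded trajectory, here used to carve it into sub-routine segments. The paper's exact architecture is not given on this page; the sketch below is a simplified, illustrative NumPy version of generic Slot Attention (it omits the learned projections, GRU update, and MLP of the full module, and all function names are my own).

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def slot_attention(inputs, num_slots=3, iters=3, seed=0):
    """Simplified Slot Attention sketch (illustrative, not the paper's code).

    inputs: (n, d) array of encoded timesteps.
    Returns final slots (num_slots, d) and the attention map (n, num_slots),
    whose soft assignment of timesteps to slots is what a segmentation
    would be read off from.
    """
    rng = np.random.default_rng(seed)
    n, d = inputs.shape
    # Slots initialized randomly (learned Gaussian parameters in the real module).
    slots = rng.normal(size=(num_slots, d))
    for _ in range(iters):
        # Softmax over the *slot* axis makes slots compete for each input.
        attn = softmax(inputs @ slots.T / np.sqrt(d), axis=1)
        # Normalize per slot over inputs, then update each slot toward
        # the attention-weighted mean of its assigned inputs.
        w = attn / (attn.sum(axis=0, keepdims=True) + 1e-8)
        slots = w.T @ inputs
    return slots, attn
```

Because every timestep attends in parallel, the slot updates have no sequential dependence along the trajectory, which is the property the abstract credits for the speedup over sequential boundary-detection baselines.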