Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning

Harry Zhao; Safa Alver; Harm van Seijen; Romain Laroche; Doina Precup; Yoshua Bengio

Published: 10 Oct 2024, Last Modified: 10 Oct 2024

Venue: Sys2-Reasoning Poster

License: CC BY 4.0

Keywords: system-2, reasoning, planning, reinforcement learning, attention, consciousness, generalization

TL;DR: We propose Skipper, which automatically decomposes a given task into smaller, more manageable steps using spatio-temporal abstractions.

Abstract: Inspired by human conscious planning, we propose Skipper, a model-based reinforcement learning framework utilizing spatio-temporal abstractions to generalize better in novel situations. It automatically decomposes the given task into smaller, more manageable subtasks, and thus enables sparse decision-making and focused computation on the relevant parts of the environment. The decomposition relies on the extraction of an abstracted proxy problem represented as a directed graph, in which vertices and edges are learned end-to-end from hindsight. Our theoretical analyses provide performance guarantees under appropriate assumptions and establish where our approach is expected to be helpful. Generalization-focused experiments validate Skipper's significant advantage in zero-shot generalization, compared to some existing state-of-the-art hierarchical planning methods.

Submission Number: 4
