Playing Atari with Capsule Networks: A systematic comparison of CNN and CapsNets-based agents

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Withdrawn Submission
Abstract: In recent years, Capsule Networks (CapsNets) have achieved promising results in object recognition tasks thanks to their invariance to pose and lighting. They have been proposed as an alternative to Convolutional Neural Networks (CNNs), which are insensitive to spatial relationships and only translation invariant. It has been shown empirically that CapsNets can achieve competitive performance while requiring significantly fewer parameters, a desirable characteristic for deep reinforcement learning, which is known to be sample-inefficient during training. In this paper, we conduct a systematic analysis to explore the potential of CapsNets-based agents in the deep reinforcement learning setting. More specifically, we compare the performance of a CNN-based agent with that of a CapsNets-based agent in a deep Q-network, using the Atari suite as the testbed of our analysis. To the best of our knowledge, this work constitutes the first CapsNets-based deep reinforcement learning model to learn state-action value functions without the need for task-specific adaptation. Our results show that, in this setting, CapsNets-based architectures require 92% fewer parameters than their CNN-based counterparts. Moreover, despite their smaller size, the CapsNets-based agents provide significant boosts in performance (score), ranging from 10% to 77%: in a Double DQN with prioritized experience replay setting, they outperform the CNN-based agent in eight of the nine selected environments.
One-sentence Summary: We propose the first CapsNets-based architecture in deep reinforcement learning to learn state-action value functions without the need for task-specific adaptation.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=MaZmwU0PX
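To make the setup concrete, here is a minimal sketch of what a CapsNets-based Q-network for Atari could look like in PyTorch. This is not the paper's released code: the layer sizes, number of routing iterations, and the linear readout mapping action capsules to Q-values are illustrative assumptions.

```python
# A minimal, hypothetical sketch of a CapsNets-based Q-network for Atari
# in PyTorch. Layer sizes, routing iterations, and the capsule-to-Q-value
# readout are illustrative assumptions, not the paper's architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    # Capsule nonlinearity: shrinks short vectors toward zero length,
    # long vectors toward unit length.
    norm2 = (s * s).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)


class RoutingCaps(nn.Module):
    """Fully connected capsule layer with dynamic routing-by-agreement."""

    def __init__(self, n_in, d_in, n_out, d_out, iters=3):
        super().__init__()
        self.iters = iters
        # One prediction transform per (input capsule, output capsule) pair.
        self.W = nn.Parameter(0.01 * torch.randn(n_in, n_out, d_out, d_in))

    def forward(self, u):                                   # u: (B, n_in, d_in)
        u_hat = torch.einsum('ijad,bid->bija', self.W, u)   # (B, n_in, n_out, d_out)
        b = torch.zeros(u_hat.shape[:3], device=u.device)   # routing logits
        for _ in range(self.iters):
            c = F.softmax(b, dim=2)                         # coupling coefficients
            v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))  # (B, n_out, d_out)
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)    # agreement update
        return v


class CapsNetQNetwork(nn.Module):
    def __init__(self, n_actions, in_channels=4):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 256, kernel_size=9, stride=4)  # 84x84 -> 19x19
        self.primary = nn.Conv2d(256, 32 * 8, kernel_size=9, stride=2)    # 19x19 -> 6x6
        self.routing = RoutingCaps(n_in=32 * 6 * 6, d_in=8,
                                   n_out=n_actions, d_out=16)
        self.readout = nn.Linear(16, 1)        # one scalar Q-value per action capsule

    def forward(self, x):                      # x: (B, 4, 84, 84) frames in [0, 1]
        h = F.relu(self.conv(x))
        p = self.primary(h)                    # (B, 256, 6, 6)
        p = p.view(x.size(0), 32, 8, 6, 6).permute(0, 1, 3, 4, 2).reshape(x.size(0), -1, 8)
        p = squash(p)                          # (B, 1152, 8) primary capsules
        v = self.routing(p)                    # (B, n_actions, 16) action capsules
        return self.readout(v).squeeze(-1)     # (B, n_actions) Q-values
```

A quick sanity check: `CapsNetQNetwork(n_actions=6)(torch.rand(1, 4, 84, 84))` returns a `(1, 6)` tensor of Q-values. Note one assumed design choice: Q-values are read off a small shared linear head rather than taken as capsule lengths, since lengths are non-negative by construction while a state-action value function must be unbounded in sign.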