Keywords: GAN, information regularization, neuroscience-inspired AI, generalizable policy
Abstract: Learning compact state representations from high-dimensional, noisy observations is a cornerstone of reinforcement learning (RL). However, these representations are often biased toward the current goal context and overfit to goal-irrelevant features, making it hard to generalize to other tasks. Inspired by the human analogy-making process, we propose a novel representation learning framework, hypothetical analogy-making (HAM), for learning a robust goal space and a generalizable policy in RL. It consists of encoding goal-relevant and other task-related features, generating hypothetical observations from different feature combinations, and making analogies between the original and hypothetical observations using discriminators. Our model introduces an analogy-making objective that maximizes the mutual information between the generated hypothetical observation and the original observation to enhance disentangled representations. Experiments on various challenging RL environments show that our model helps the RL agent's learned policy generalize by revealing a robust goal space.
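The pipeline described in the abstract (feature encoding, hypothetical observation generation, and discriminator-based analogy-making with a mutual-information term) can be pictured with a minimal sketch. All module names, shapes, and loss terms below are illustrative assumptions rather than the authors' implementation; the mutual-information objective is approximated with an InfoGAN-style auxiliary head on the discriminator.

# Minimal sketch of the HAM idea (assumption: all names/shapes are illustrative).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, obs_dim=64, goal_dim=8, task_dim=8):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.goal_head = nn.Linear(128, goal_dim)   # goal-relevant features
        self.task_head = nn.Linear(128, task_dim)   # other task-related features

    def forward(self, obs):
        h = self.body(obs)
        return self.goal_head(h), self.task_head(h)

class Generator(nn.Module):
    def __init__(self, goal_dim=8, task_dim=8, obs_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(goal_dim + task_dim, 128),
                                 nn.ReLU(), nn.Linear(128, obs_dim))

    def forward(self, z_goal, z_task):
        return self.net(torch.cat([z_goal, z_task], dim=-1))

class Discriminator(nn.Module):
    # Real/fake head plus an auxiliary head that recovers the goal code.
    def __init__(self, obs_dim=64, goal_dim=8):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.adv_head = nn.Linear(128, 1)
        self.q_head = nn.Linear(128, goal_dim)

    def forward(self, obs):
        h = self.body(obs)
        return self.adv_head(h), self.q_head(h)

# One illustrative step on a batch of placeholder observations.
enc, gen, disc = Encoder(), Generator(), Discriminator()
obs = torch.randn(32, 64)
z_goal, z_task = enc(obs)

# Hypothetical observation: keep the goal code, shuffle the task code across
# the batch so the feature combination is counterfactual.
z_task_shuffled = z_task[torch.randperm(obs.size(0))]
hyp_obs = gen(z_goal, z_task_shuffled)

real_logit, _ = disc(obs)
fake_logit, q_goal = disc(hyp_obs)

bce = nn.BCEWithLogitsLoss()
d_loss = bce(real_logit, torch.ones_like(real_logit)) + \
         bce(fake_logit, torch.zeros_like(fake_logit))

# Analogy-making / MI term: the auxiliary head should recover the original
# goal code from the hypothetical observation (a variational lower bound on
# mutual information, as in InfoGAN).
mi_loss = ((q_goal - z_goal.detach()) ** 2).mean()
g_loss = bce(fake_logit, torch.ones_like(fake_logit)) + mi_loss
print(d_loss.item(), g_loss.item())

In practice the generator/encoder and discriminator losses would be optimized with separate optimizers in an adversarial loop; the sketch only shows how the hypothetical observation and the MI term fit together.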
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)
TL;DR: Generating hypothetical observations and maximizing the mutual information between them and the original observation with an analogy-making module helps the RL agent's learned policy generalize by revealing a robust goal-context space.