Granger Causal Interaction Skill Chains

Caleb Chuck; Kevin Black; Aditya Arjun; Yuke Zhu; Scott Niekum

Published: 16 Mar 2024, Last Modified: 17 Sept 2024. Accepted by TMLR. License: CC BY 4.0

Event Certifications: rl-conference.cc/RLC/2024/Journal_Track

Abstract: Reinforcement Learning (RL) has demonstrated promising results in learning policies for complex tasks, but it often suffers from low sample efficiency and limited transferability. Hierarchical RL (HRL) methods aim to address the difficulty of learning long-horizon tasks by decomposing policies into skills, abstracting states, and reusing skills in new tasks. However, many HRL methods require some initial task success to discover useful skills, which paradoxically may be very unlikely without access to useful skills. On the other hand, reward-free HRL methods often need to learn far too many skills to achieve proper coverage in high-dimensional domains. In contrast, we introduce the Chain of Interaction Skills (COInS) algorithm, which focuses on controllability in factored domains to identify a small number of task-agnostic skills that allow for a high degree of control of the factored state. COInS uses learned detectors to identify interactions between state factors and then trains a chain of skills to control each of these factors successively. We evaluate COInS on a robotic pushing task with obstacles, a challenging domain where other RL and HRL methods fall short. We also demonstrate the transferability of skills learned by COInS, using variants of Breakout, a common RL benchmark, and show 2-3x improvement in both sample efficiency and final performance compared to standard RL baselines.

Submission Length: Regular submission (no more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=rRiwLPHhcK

Changes Since Last Submission: Added Appendix I on hyperparameters; revised Sections 3 and 4 for better clarity and more precise language; removed instances of HIntS; restructured the paper to improve clarity and reduce over-claiming; significantly rewrote the formalism to improve precision and edited figures; checked for typos; de-anonymized and added acknowledgments; removed negative space.

Code: https://github.com/CalCharles/object-options

Supplementary Material: zip

Assigned Action Editor: ~Branislav_Kveton1

Submission Number: 1889
