Piecewise Linear Parametrization of Policies: Towards Interpretable Deep Reinforcement Learning

Published: 27 Oct 2023, Last Modified: 22 Nov 2023 · NeurIPS XAIA 2023
TL;DR: We propose a neural policy constrained to express a small number of linear behaviors, and show that it leads to improved interpretability while performing comparably to baselines in several control and navigation tasks.
Abstract: Learning inherently interpretable policies is a central challenge in the path to developing autonomous agents that humans can trust. We argue for the use of policies that are piecewise-linear. We carefully study to what extent they can retain the interpretable properties of linear policies while performing competitively with neural baselines. In particular, we propose the HyperCombinator (HC), a piecewise-linear neural architecture expressing a policy with a controllably small number of sub-policies. Each sub-policy is linear with respect to interpretable features, shedding light on the agent's decision process without needing an additional explanation model. We evaluate HC policies in control and navigation experiments, visualize the improved interpretability of the agent and highlight its trade-off with performance.
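The abstract's core idea, a policy restricted to a small number of linear sub-policies selected per state, can be sketched as follows. This is an illustrative toy in the spirit of the HyperCombinator, not the paper's exact architecture: the dimensions, the random weights, and the hard argmax gating rule are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical piecewise-linear policy: a gate picks one of K linear
# sub-policies, so every emitted action is a linear function of the
# observed features. All sizes below are illustrative assumptions.
K, obs_dim, act_dim = 4, 8, 2

W = rng.normal(size=(K, act_dim, obs_dim))  # one linear map per sub-policy
b = rng.normal(size=(K, act_dim))           # one bias per sub-policy
G = rng.normal(size=(K, obs_dim))           # gating weights (sub-policy selector)

def policy(obs):
    """Select a sub-policy via the gate, then apply its linear map."""
    k = int(np.argmax(G @ obs))             # hard selection -> piecewise-linear
    return k, W[k] @ obs + b[k]

obs = rng.normal(size=obs_dim)
k, action = policy(obs)
```

Because the gate makes a hard choice, the returned index `k` names which of the K interpretable linear behaviors produced the action, and inspecting the row weights of `W[k]` explains the action's dependence on each input feature, with no separate explanation model needed.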
Submission Track: Full Paper Track
Application Domain: Robotics
Survey Question 1: We impose simplicity constraints on deep reinforcement learning policies to make them as inherently interpretable as possible. The resulting agent can only use linear policies to compute its actions, making its interactions transparent. Our proposed architecture also leads to visualizations that directly reflect the cyclicity of the task within the agent's decision-making process, leading to a better understanding of its actions.
Survey Question 2: Stakeholders cannot obtain justifications for the actions taken by typical deep reinforcement learning policies due to their complex architectures. Given that these policies may be deployed in real-world settings with transparency and explainability obligations, we wanted to explore how to improve the inherent interpretability of deep RL policies while maintaining a high level of performance.
Survey Question 3: We did not use post-hoc techniques, but rather designed a neural architecture that lets us embed inherent interpretability properties into the policy.
Submission Number: 35