Keywords: Reinforcement Learning, Interpretable Machine Learning, Markov Decision Processes
TL;DR: We show that the cost of interpretability for region-wise constant policies is bounded by the error of summarizing the final policy with such a map. We run simulations to examine how the total error depends on the size and dimension of the state space.
Abstract: For machine learning algorithms to be applicable in human-centric fields such as healthcare, law-making, or industrial design, it is essential to develop interpretable techniques: the physician, lawmaker, or factory worker must understand why an algorithm gave the answer that it did. In this paper, we define interpretable policies as region-wise constant maps. Previous work has computed the optimal policy acting on a fixed partition of the state space; ours is the first that computes the regions themselves as well as the optimal action to take on each region of the partition. We derive upper bounds on the cost of interpretability, namely the error incurred when the final policy is summarized by a region-wise constant map, and show that this cost is given by the function summarization error of the final policy. We run experiments to examine how our approach scales with the dimension and size of the state space, and we compute the optimal interpretable policy for several final policies.
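To make the central notion concrete, below is a minimal sketch (not the authors' implementation) of summarizing a final policy by a region-wise constant map. It assumes a hypothetical `final_policy` over a sampled 2-D state space, uses a shallow decision tree as one possible choice of partitioning mechanism, and reports the empirical function summarization error, i.e., the fraction of states on which the interpretable policy disagrees with the final policy.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical final policy: maps 2-D states to one of a few discrete actions.
def final_policy(states):
    return (np.sign(states[:, 0]) + (states[:, 1] > 0.5)).astype(int)

rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(10_000, 2))
actions = final_policy(states)

# Region-wise constant summary: a shallow decision tree partitions the
# state space into axis-aligned regions, each assigned a single action.
tree = DecisionTreeClassifier(max_leaf_nodes=8, random_state=0)
tree.fit(states, actions)

# Empirical function summarization error: fraction of sampled states on
# which the region-wise constant policy disagrees with the final policy.
summarization_error = np.mean(tree.predict(states) != actions)
print(f"summarization error: {summarization_error:.4f}")
```

Under this framing, the number of leaves controls the granularity of the partition, so the trade-off between interpretability and summarization error can be traced by varying it.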
Submission Number: 11