Discovering Logic-Informed Intrinsic Rewards to Explain Human Policies

20 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Logic rule, policy planning, reward engineering
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: In high-stakes systems like healthcare, it is essential to distill high-level strategic knowledge from top clinicians’ demonstrations. This paper aims to extract knowledge-driven reward functions from experts’ demonstrations, representing the knowledge as a set of logic rules. Our learning framework is built upon classic inverse reinforcement learning (IRL), assuming that the experts, such as clinicians, are rational and that their executed treatments are the optimal plans obtained by maximizing their logic-informed utility function. Our algorithm automatically extracts these logic rules from demonstrations. Specifically, we formulate reward engineering as a backward reasoning procedure, where a rule generator is trained to sequentially generate predicates starting from the goal and then considering conditions and evidence. We interpret policy planning as a forward reasoning procedure, where the optimal policy is obtained by finding the best path through forward chaining of the given rules. This sequential optimization process alternately refines the policy function, Q-function, and reward function, ultimately leading to the discovery of the most effective strategic rules. In our experiments, we demonstrate the superior performance of our method in discovering meaningful logic rules within the context of a healthcare problem.
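The forward reasoning procedure described in the abstract relies on forward chaining of logic rules. The following is a minimal, self-contained sketch of that chaining step under an assumed rule format (each rule maps a set of premise predicates to a conclusion predicate); the rule names and the toy healthcare rules are illustrative only and are not taken from the paper.

```python
# Minimal forward-chaining sketch (hypothetical rule format, not the
# paper's implementation): repeatedly fire rules whose premises are all
# satisfied, adding their conclusions, until no new facts appear.
def forward_chain(rules, facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and premises <= facts:
                facts.add(conclusion)
                changed = True
    return facts

# Toy healthcare-style rules: evidence -> condition -> goal.
rules = [
    ({"fever", "low_bp"}, "sepsis_risk"),
    ({"sepsis_risk"}, "give_fluids"),
]
derived = forward_chain(rules, {"fever", "low_bp"})
print(derived)
```

In the paper's setting the chaining would additionally score paths so that the policy selects the best one; the fixpoint loop above only illustrates how conclusions propagate from evidence through conditions to the goal.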
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2185