Extracting Expert's Goals by What-if Interpretable Modeling

22 Sept 2022 (modified: 13 Feb 2023) · ICLR 2023 Conference Withdrawn Submission · Readers: Everyone
Keywords: counterfactuals, explaining decision-making, preference learning, inverse reinforcement learning, healthcare
TL;DR: We recover clinicians' treatment goals by integrating counterfactual reasoning into batch inverse reinforcement learning and interpretable GAM modeling
Abstract: Although reinforcement learning (RL) has achieved tremendous success in many fields, applying RL to real-world settings such as healthcare is challenging when the reward is hard to specify and no exploration is allowed. In this work, we focus on batch inverse RL (IRL) to recover clinicians' rewards from their past demonstrations of treating patients. We explain their treatments based on what-if future outcomes: "what future would have happened if a different treatment had been taken?", and provide interpretability with generalized additive models (GAMs), a class of accurate, interpretable models. In both simulation and a real-world hospital dataset, our model outperforms baselines and provides explanations consistent with clinical guidelines, whereas the commonly used linear model often contradicts them. We also uncover the unreliability of offline metrics, such as the matched action accuracy commonly used in the literature, for comparing IRL methods.
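To make the GAM component of the abstract concrete, below is a minimal sketch (not the authors' code) of representing a recovered reward as a sum of smooth one-dimensional shape functions over predicted "what-if" future outcomes. The feature names, data, and use of pygam's LinearGAM are all illustrative assumptions; the paper's actual reward features and GAM implementation may differ.

```python
# Minimal sketch of a GAM reward over hypothetical counterfactual outcome
# features. All names and data here are illustrative, not from the paper.
import numpy as np
from pygam import LinearGAM, s

rng = np.random.default_rng(0)

# Hypothetical what-if features: predicted next-step vitals under a
# candidate treatment (mean arterial pressure and lactate, both made up).
n = 1000
map_next = rng.uniform(50, 110, n)       # predicted mean arterial pressure
lactate_next = rng.uniform(0.5, 6.0, n)  # predicted lactate level

# Hypothetical reward signal: prefers normotension and low lactate.
reward = (-((map_next - 80.0) / 15.0) ** 2
          - 0.8 * lactate_next
          + rng.normal(0, 0.1, n))

X = np.column_stack([map_next, lactate_next])

# Fit a GAM: reward ~ f1(MAP_next) + f2(lactate_next), each f a smooth spline.
gam = LinearGAM(s(0) + s(1)).fit(X, reward)

# Each fitted shape function can be inspected directly; this is the kind of
# interpretability the abstract refers to, since one can read off how each
# what-if outcome contributes to the recovered reward.
for i, name in enumerate(["MAP_next", "lactate_next"]):
    XX = gam.generate_X_grid(term=i)
    pdep = gam.partial_dependence(term=i, X=XX)
    print(name, "shape-function range:",
          pdep.min().round(2), "to", pdep.max().round(2))
```

In the full method one would fit such a reward inside a batch IRL loop (e.g., matching observed clinician actions against counterfactual alternatives); the sketch above shows only the interpretable-reward piece.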
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (e.g., decision and control, planning, hierarchical RL, robotics)