Keywords: Explainable AI, Reinforcement learning, Dynamical systems
TL;DR: We use AI-generated simulations to explain AI classifications in dynamical-systems predictive analytics.
Abstract: In safety-critical domains, such as industrial systems, the lack of explainability in predictive "black-box" machine learning models can hinder trust and adoption. Standard explainability techniques, while powerful, often require deep expertise in data analytics and machine learning, and they fail to align with the sequential, dynamic nature of data in these environments. In this paper, we propose a novel explainability framework that leverages reinforcement learning (RL) to support model predictions with visual explanations based on dynamical-system simulation. We train RL agents to simulate the events that require prediction and use the agents' critics to make classifications. We then employ the agents' actors to simulate the potential future trajectories underlying these classifications, producing visual explanations that are intuitive and aligned with the expertise of industrial domain experts. We demonstrate the applicability of this method through a case study in which a small industrial system is monitored for cyberattacks, showing how our framework generates actionable predictions supported by visual explanations. This approach aims to bridge the gap between advanced machine learning models and their real-world deployment in safety-critical environments.
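To make the actor–critic split concrete, the following is a minimal sketch, not the authors' code: it shows how a pretrained critic could drive the classification and how a pretrained actor could roll out the explanatory trajectory. The network architectures, the toy `step` dynamics, the control matrix `B`, the decision threshold, and all dimensions and names are illustrative assumptions; the paper's actual simulator and training procedure are not reproduced here.

```python
# Illustrative sketch only: Actor/Critic architectures, the toy dynamics,
# and the classification threshold are assumptions, not the paper's method.
import numpy as np
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 4, 2  # placeholder dimensions

class Actor(nn.Module):
    """Policy network: maps a system state to a control action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 32), nn.Tanh(), nn.Linear(32, ACTION_DIM))
    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Value network: scores a state by its expected return under the policy."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 32), nn.Tanh(), nn.Linear(32, 1))
    def forward(self, s):
        return self.net(s)

# Fixed control matrix for the toy dynamics (a stand-in, not a real plant).
B = np.random.default_rng(0).normal(size=(STATE_DIM, ACTION_DIM)).astype(np.float32)

def step(state, action):
    # Hypothetical linear stand-in for the industrial-system simulator.
    return state + 0.05 * (B @ action)

def classify(critic, state, threshold=0.0):
    # Classification from the critic: a low value estimate is read as a
    # predicted unsafe outcome (e.g., a cyberattack in progress).
    with torch.no_grad():
        v = critic(torch.as_tensor(state, dtype=torch.float32)).item()
    return ("attack" if v < threshold else "normal"), v

def explain(actor, state, horizon=50):
    # Visual explanation from the actor: roll the policy forward to obtain
    # the simulated future trajectory underlying the classification.
    traj = [np.asarray(state, dtype=np.float32)]
    for _ in range(horizon):
        with torch.no_grad():
            a = actor(torch.as_tensor(traj[-1])).numpy()
        traj.append(step(traj[-1], a).astype(np.float32))
    return np.stack(traj)  # one row per time step; plot each state dimension

# Usage (weights would come from RL training, omitted here):
actor, critic = Actor(), Critic()
state = np.zeros(STATE_DIM, dtype=np.float32)
label, value = classify(critic, state)
trajectory = explain(actor, state)  # shape: (horizon + 1, STATE_DIM)
```

Plotting `trajectory` over time yields the kind of simulation-based visual explanation the abstract describes, paired with the critic's classification of the same state.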
Primary Area: interpretability and explainable AI
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13467