Keywords: XAI; model-agnostic method; EEG
Abstract: Complex machine learning models are increasingly used across many fields, but gaining insight into their decision-making processes remains a challenge. Numerous explanation methods have been developed in recent years, each aiming to clarify how these models work from a different perspective. Recent studies have shown that some of these methods can produce misleading results when features are correlated, for example in the presence of background noise or of correlated features that carry no information about the target. Among existing approaches, those based on the conditional expected prediction have proven more robust to such features. Applying them, however, requires knowledge of the conditional distribution, i.e., the distribution of the data conditioned on a specific feature, which is difficult to estimate; current approximation methods impose additional assumptions on the data and models. We propose a global, model-agnostic explanation method based on the conditional expected prediction. Our method approximates conditional expected predictions through data partitioning and kernel-based techniques, eliminating the need for such assumptions. We validate the method on synthetic data and open-source EEG data, and the results demonstrate that it is significantly less affected by correlated features.
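The abstract only sketches the approach, so as a rough illustration, the snippet below shows one common way a kernel-based estimate of the conditional expected prediction E[f(X) | X_j = v] can be computed: weighting the model's predictions on observed rows by a kernel on the conditioning feature, rather than perturbing features marginally. This is a minimal sketch under our own assumptions (a scikit-learn-style `model.predict`, a Gaussian kernel, and the hypothetical function name `conditional_expected_prediction`); it is not the paper's algorithm, which additionally uses data partitioning.

```python
import numpy as np

def conditional_expected_prediction(model, X, feature, grid, bandwidth=0.5):
    """Kernel-weighted estimate of E[f(X) | X_j = v] for each v in grid.

    Assumption-laden sketch: predictions are averaged over *observed* rows,
    weighted by a Gaussian kernel on |x_j - v|, so the estimate respects the
    joint data distribution instead of breaking feature correlations the way
    marginal (interventional) perturbations do.
    """
    preds = model.predict(X)          # f evaluated only on realistic, observed rows
    xj = X[:, feature]
    curve = []
    for v in grid:
        # Gaussian kernel weights centered at the conditioning value v
        w = np.exp(-0.5 * ((xj - v) / bandwidth) ** 2)
        if w.sum() == 0:              # guard against grid points far from the data
            curve.append(np.nan)
        else:
            curve.append(np.average(preds, weights=w))
    return np.array(curve)
```

Because only observed rows are evaluated, a correlated but uninformative feature tends to produce a flat curve here, which is the robustness property the abstract claims for conditional-expectation-based explanations.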
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 17719