Keywords: XAI, perturbation, top-k
Abstract: The adoption of machine learning for socially relevant tasks requires effective explainable artificial intelligence (XAI) methods to better understand model behavior.
Attribution methods are a popular XAI approach in which input-output relationships are characterized by heat maps that reflect the relative importance of input features for a particular prediction.
The quality of such maps is often assessed by measuring faithfulness based on the area under the insertion curve. We propose the first method that directly optimizes this metric to generate attribution heat maps. We establish the connection between insertion curves and top-$k$ feature selection, which leads to a loss function measuring the quality of attributions. Randomization of the loss allows us to efficiently approximate its gradient. We combine the loss function with the neural explanation mask framework to create a new approach for providing accurate attributions efficiently.
Experiments demonstrate superior faithfulness along with robust attributions and low inference time, suggesting a new path to generate useful explanations. Code is available at: https://anonymous.4open.science/r/Ra-nem_ICLR-2AD4
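To make the objective described in the abstract concrete, below is a minimal, forward-only sketch of the quantity being optimized: the area under the insertion curve, approximated by averaging the target-class score over a few randomly drawn top-$k$ masks. The function names, the frozen classifier `model`, the attribution map `attr`, and the zero baseline are illustrative assumptions; how gradients are propagated to the explanation network via the neural explanation mask framework is the paper's contribution and is not reproduced here.

```python
import torch

def randomized_insertion_score(model, x, attr, target, num_samples=8, baseline=0.0):
    """Monte-Carlo estimate of the insertion-curve area (illustrative sketch).

    For each randomly drawn insertion level k, keep only the k input features
    ranked highest by `attr`, replace the rest with a baseline, and record the
    target-class score. Averaging over random k approximates the area under
    the insertion curve while requiring only a few forward passes.
    """
    b, c, h, w = x.shape
    n = h * w
    flat_attr = attr.view(b, -1)
    scores = 0.0
    for _ in range(num_samples):
        k = torch.randint(1, n + 1, (1,)).item()       # random insertion level
        topk_idx = flat_attr.topk(k, dim=1).indices    # k most important pixels
        mask = torch.zeros(b, n, device=x.device)
        mask.scatter_(1, topk_idx, 1.0)
        mask = mask.view(b, 1, h, w)                   # broadcast over channels
        x_masked = mask * x + (1 - mask) * baseline    # insert only the top-k features
        logits = model(x_masked)
        scores = scores + logits.gather(1, target.view(-1, 1)).squeeze(1)
    return (scores / num_samples).mean()               # higher means a better insertion curve
```

A training loss would negate this score; note that the hard top-$k$ selection above is not differentiable with respect to `attr`, which is precisely why the paper's randomized loss and mask-network formulation are needed.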
Primary Area: interpretability and explainable AI
Submission Number: 16935