From Pixels to Perception: Interpretable Predictions via Instance-wise Grouped Feature Selection

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We propose an inherently interpretable method whose predictions are based on a learned, masked version of the input image, designed to align with human perception.
Abstract: Understanding the decision-making process of machine learning models provides valuable insights into the task, the data, and the reasons behind a model's failures. In this work, we propose a method that makes inherently interpretable predictions through instance-wise sparsification of input images. To align the sparsification with human perception, we learn the masking in the space of semantically meaningful pixel regions rather than at the pixel level. Additionally, we introduce an explicit way to dynamically determine the required level of sparsity for each instance. We show empirically on semi-synthetic and natural image datasets that our inherently interpretable classifier produces more meaningful, human-understandable predictions than state-of-the-art benchmarks.
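To make the abstract's idea concrete, below is a minimal PyTorch-style sketch of grouped masking: pixels are grouped into superpixels (here via scikit-image's SLIC), a selector network assigns each region a keep probability, and the classifier only sees the masked image. All class and variable names are hypothetical, the soft sigmoid mask and the omission of the dynamic per-instance sparsity mechanism are simplifications, and this is not the authors' implementation (see the linked repository for that).

```python
# Hypothetical sketch of instance-wise grouped feature selection (not the authors' code).
import numpy as np
import torch
import torch.nn as nn
from skimage.segmentation import slic

class GroupedMaskClassifier(nn.Module):
    """Predicts a keep probability per pixel region, masks the image, then classifies it."""
    def __init__(self, n_regions: int, n_classes: int):
        super().__init__()
        # Small CNN that scores each region (hypothetical architecture).
        self.selector = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, n_regions),
        )
        # Classifier that only sees the masked image.
        self.classifier = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, n_classes),
        )

    def forward(self, image: torch.Tensor, regions: torch.Tensor):
        # regions: (H, W) integer labels grouping pixels into semantically meaningful regions.
        region_logits = self.selector(image)          # (B, n_regions)
        keep_prob = torch.sigmoid(region_logits)      # per-region keep probability (soft relaxation)
        pixel_mask = keep_prob[:, regions]            # broadcast region scores to a (B, H, W) pixel mask
        masked = image * pixel_mask.unsqueeze(1)      # mask all channels
        return self.classifier(masked), pixel_mask

# Usage: group pixels into superpixels, then classify the masked image.
img_np = np.random.rand(64, 64, 3)                   # stand-in for a real image
segments = slic(img_np, n_segments=50, compactness=10, start_label=0)
n_regions = int(segments.max()) + 1

model = GroupedMaskClassifier(n_regions=n_regions, n_classes=10)
img = torch.from_numpy(img_np).permute(2, 0, 1).unsqueeze(0).float()
logits, mask = model(img, torch.from_numpy(segments).long())
```

In practice, a discrete or sparsity-regularized mask (rather than the soft sigmoid used here) would be needed so that only a small set of regions is actually kept; this sketch only illustrates the region-level masking structure.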
Lay Summary: (1) Often, a user does not know why a machine learning model makes a certain prediction. (2) We develop a method that uses only a small part of an image for its prediction. (3) With this method, a user knows which parts of the image were used to make the prediction.
Link To Code: https://github.com/mvandenhi/P2P
Primary Area: Social Aspects->Accountability, Transparency, and Interpretability
Keywords: instance-wise feature selection, feature selection, interpretability, perception-adhering masking
Submission Number: 11455