Assessing the Degree of Feature Interactions that Determine a Model Prediction

Published: 01 Jan 2024, Last Modified: 06 Feb 2025, ICSTW 2024, CC BY-SA 4.0
Abstract: Machine Learning (ML) models rely on capturing important feature interactions to generate predictions. This study aims to validate the hypothesis that model predictions often depend on interactions involving only a few features. This hypothesis is inspired by t-way combinatorial testing for software systems. We use SHapley Additive exPlanations (SHAP) values to quantify each feature’s contribution to a model prediction. We then use a greedy approach to identify a minimal subset of features, of size t, required to determine a model prediction. Our empirical evaluation is performed on three datasets (Adult Income, Mushroom, and Breast Cancer) and three classification models (Logistic Regression, XGBoost, and SVM). Through our experiments, we find that the majority of predictions are determined by interactions involving only a subset of features.
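The abstract does not spell out the greedy procedure, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a linear (logistic-regression-style) classifier, for which the SHAP value of feature i under feature independence is w_i * (x_i - E[x_i]). It also assumes that a feature outside the chosen subset is "masked" by replacing it with its background mean. The names `linear_shap`, `predict_class`, and `greedy_feature_subset` are hypothetical. Features are added in decreasing order of absolute SHAP value until the masked input yields the same class as the full input; the size of the resulting subset plays the role of t. A greedy search like this is not guaranteed to find the true minimum.

```python
def linear_shap(weights, x, background_mean):
    """SHAP values for a linear model, assuming independent features:
    phi_i = w_i * (x_i - E[x_i])."""
    return [w * (xi - mu) for w, xi, mu in zip(weights, x, background_mean)]


def predict_class(weights, bias, x):
    """Class of a logistic-regression-style model at the 0.5 threshold,
    which is equivalent to checking the sign of the linear score."""
    score = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if score >= 0 else 0


def greedy_feature_subset(weights, bias, x, background_mean):
    """Greedily pick features (largest |SHAP| first) until the prediction on
    the masked input matches the prediction on the full input.

    Assumption: unselected features are replaced by their background mean.
    Returns the list of selected feature indices; its length is t.
    """
    target = predict_class(weights, bias, x)
    phi = linear_shap(weights, x, background_mean)
    order = sorted(range(len(x)), key=lambda i: abs(phi[i]), reverse=True)

    masked = list(background_mean)  # start with every feature masked
    chosen = []
    if predict_class(weights, bias, masked) == target:
        return chosen  # the baseline alone already gives the prediction
    for i in order:
        masked[i] = x[i]  # reveal the next most influential feature
        chosen.append(i)
        if predict_class(weights, bias, masked) == target:
            break
    return chosen


if __name__ == "__main__":
    # Toy, made-up model and instance, for illustration only.
    weights = [2.0, -1.0, 0.5, 0.1]
    bias = -0.5
    background_mean = [0.0, 0.0, 0.0, 0.0]
    x = [1.0, 0.4, 1.0, 1.0]
    subset = greedy_feature_subset(weights, bias, x, background_mean)
    print("selected features:", subset, "t =", len(subset))
```

For non-linear models such as XGBoost or SVM, the SHAP values would have to come from an explainer rather than the closed form used here, and the masking strategy would remain a modelling choice.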