Multicollinearity Correction and Combined Feature Effect in Shapley Values

Indranil Basu, Subhadip Maji

Published: 01 Jan 2022, Last Modified: 20 May 2025, AI 2022, CC BY-SA 4.0

Abstract: Model interpretability is one of the most intriguing problems in machine learning, particularly for mathematically sophisticated models. Computing Shapley values is one of the best approaches so far for finding the importance of each feature in a model at the instance (data point) level. In other words, Shapley values represent the importance of a feature for a particular instance or observation, especially in classification or regression problems. A well-known limitation of Shapley values is that their estimation is neither accurate nor reliable in the presence of multicollinearity among the features. To address this problem, we present a unified framework for calculating accurate Shapley values with correlated features. More specifically, we apply an adjustment (a matrix formulation) to the features while calculating independent Shapley values for the instances, so that the features become independent of each other. Our implementation also shows that the method is computationally efficient compared to the existing Shapley method.
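
To make the notion of instance-level Shapley values concrete, the following is a minimal sketch of the standard computation under the feature-independence assumption that the abstract identifies as the limitation. It is not the authors' correction method. The function name `shapley_values`, the toy linear model, and the synthetic background data are all hypothetical and chosen only for illustration.

```python
import itertools
import math

import numpy as np


def shapley_values(predict, x, background):
    """Exact Shapley values for one instance x by enumerating all coalitions.

    The value of a coalition S is the average prediction when the features
    in S are fixed to their values in x and the remaining features are taken
    from the background rows. This treats features as independent.
    """
    n = x.shape[0]

    def value(subset):
        data = background.copy()
        idx = np.array(subset, dtype=int)
        data[:, idx] = x[idx]  # fix coalition features to the instance
        return predict(data).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n!
            weight = (
                math.factorial(size)
                * math.factorial(n - size - 1)
                / math.factorial(n)
            )
            for subset in itertools.combinations(others, size):
                gain = value(subset + (i,)) - value(subset)
                phi[i] += weight * gain
    return phi


# Hypothetical toy setup: a linear model on three independent features.
rng = np.random.default_rng(0)
w = np.array([2.0, -1.0, 0.5])
background = rng.normal(size=(200, 3))
x = np.array([1.0, 2.0, 3.0])


def predict(data):
    return data @ w


phi = shapley_values(predict, x, background)
```

For a linear model with this value function, each attribution reduces to `w[i] * (x[i] - background[:, i].mean())`, and the attributions sum to the gap between the prediction for `x` and the mean background prediction. When features are correlated, replacing only some of them with background values produces unrealistic inputs, which is the source of the inaccuracy the abstract describes.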