Why Is This an Outlier? Explaining Outliers by Submodular Optimization of Marginal Distributions

Published: 26 Jul 2022, Last Modified: 17 May 2023, TPM 2022, Readers: Everyone
Keywords: Outlier Detection, Submodular Optimization, Marginal Probabilities
Abstract: Detecting outliers is an important task in machine learning because, left unchecked, outliers can degrade model performance. We focus on explaining why an instance is an outlier, i.e., on finding the subset of features such that, once they are ignored, the rest of the input is no longer an outlier. Thanks to key properties of marginal distributions, we formulate the problem as a constrained monotonic submodular optimization task. Additionally, we leverage probabilistic circuits, which enable tractable marginal queries for arbitrary feature subsets, to further speed up the subset selection algorithm. We demonstrate that the method finds the outlier features under a variety of corruption scenarios, and show that finding and fixing these features can help in downstream tasks such as classification.
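
The subset-selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a fully factorized Gaussian model as a stand-in for a probabilistic circuit (so that marginalizing a feature simply drops its term), and it assumes an inlier test based on a low quantile of the training marginal log-density. The function names, the `max_removed` budget, and the `quantile` parameter are all hypothetical. Under this independence assumption the objective is additive over features, so each greedy step just removes the least likely remaining feature.

```python
import numpy as np

def fit_factorized_gaussian(X):
    """Fit an independent Gaussian per feature (assumed stand-in for a probabilistic circuit)."""
    return X.mean(axis=0), X.std(axis=0) + 1e-9

def per_feature_logpdf(X, mean, std):
    """Log-density of each feature under its own Gaussian; output has the shape of X."""
    z = (X - mean) / std
    return -0.5 * z**2 - np.log(std) - 0.5 * np.log(2 * np.pi)

def explain_outlier(x, X_train, max_removed=3, quantile=0.01):
    """Greedily choose features to ignore until the rest of x passes the inlier test."""
    mean, std = fit_factorized_gaussian(X_train)
    train_ll = per_feature_logpdf(X_train, mean, std)  # shape (n, d)
    x_ll = per_feature_logpdf(x, mean, std)            # shape (d,)
    kept = list(range(x.shape[0]))
    removed = []
    while True:
        # Marginal log-density over the kept features: for a factorized
        # model, marginalizing out a feature just drops its term.
        threshold = np.quantile(train_ll[:, kept].sum(axis=1), quantile)
        if x_ll[kept].sum() >= threshold or len(removed) >= max_removed:
            return removed
        # Removing the least likely feature gives the largest gain here.
        worst = min(kept, key=lambda j: x_ll[j])
        kept.remove(worst)
        removed.append(worst)
```

For example, with training data drawn from a standard normal in five dimensions, an input that is zero everywhere except for a value of 8.0 in one coordinate should have that coordinate returned as the explanation. With a richer model such as a probabilistic circuit, the marginal query would replace the simple sum over kept features.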