Keywords: Outlier Detection, Marginal Likelihood, Projected Gradient Methods
TL;DR: Optimizing the marginal likelihood for outlier detection with an efficient projected gradient descent method.
Abstract: Accurate outlier detection is not only a necessary preprocessing step, but can itself give important insights into the data. However, detecting outliers is non-trivial and often ambiguous, especially for non-linear regression. We propose a new method that identifies outliers by finding a subset of data points $T$ such that the marginal likelihood of all remaining data points $S$ is maximized. Though the idea is more general, it is particularly appealing for Gaussian process regression, where the marginal likelihood is available in closed form. While maximizing the marginal likelihood for hyper-parameter optimization is a well-established non-convex optimization problem, optimizing over the set of data points $S$ is not. Indeed, even a greedy approximation is computationally challenging due to the high cost of evaluating the marginal likelihood. As a remedy, we propose an efficient projected gradient descent method with provable convergence guarantees. We also establish the breakdown point when jointly optimizing the hyper-parameters and $S$. For various datasets and types of outliers, our experiments demonstrate that the proposed method can improve outlier detection and robustness compared with several popular alternatives such as the Student-t likelihood.
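To make the objective concrete, the following is a minimal sketch of the subset-selection idea for Gaussian process regression. It is not the projected gradient method proposed here: it uses the brute-force baseline of removing one point at a time and re-evaluating the marginal likelihood of the remaining points $S$, with hyper-parameters held fixed. The RBF kernel, the hyper-parameter values, the synthetic data, and the helper names `rbf_kernel`, `log_marginal_likelihood`, and `best_single_removal` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)


def log_marginal_likelihood(x, y, noise=0.1, lengthscale=1.0, variance=1.0):
    """Closed-form GP log marginal likelihood log p(y | x) for a zero-mean GP."""
    n = len(x)
    K = rbf_kernel(x, x, lengthscale, variance) + noise**2 * np.eye(n)
    L = np.linalg.cholesky(K)  # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    # -0.5 y^T K^{-1} y - 0.5 log|K| - 0.5 n log(2 pi)
    return -0.5 * y @ alpha - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2 * np.pi)


def best_single_removal(x, y, **hyper):
    """Score every subset S obtained by dropping one point; return the best drop."""
    scores = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i  # S = all points except i
        scores.append(log_marginal_likelihood(x[keep], y[keep], **hyper))
    scores = np.array(scores)
    return int(np.argmax(scores)), scores


# Synthetic example (assumed data): a noisy sine with one planted outlier.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
y = np.sin(x) + 0.05 * rng.standard_normal(30)
y[12] += 5.0  # planted outlier at index 12

idx, scores = best_single_removal(x, y)
print("index whose removal maximizes the marginal likelihood:", idx)
```

Each candidate subset needs its own Cholesky factorization, so even this single-removal step costs $O(n^4)$ in the number of data points $n$. This is the evaluation cost that makes greedy approaches expensive and motivates a gradient-based alternative.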
Supplementary Material: pdf
Other Supplementary Material: zip