Keywords: Natural Gradients, Density estimation, Sobolev norm regularization, Score-based methods, Fisher divergence for hyperparameter tuning, Anomaly detection, High-dimensional data, Kernel Density Estimation (KDE)
TL;DR: A new approach to non-parametric density estimation based on regularizing a Sobolev norm of the density.
Abstract: Density estimation is one of the most central problems in statistical learning.
In this paper we introduce a new approach to non-parametric density estimation that is statistically consistent, provably different from Kernel Density Estimation, makes the inductive bias of the model clear and interpretable, and performs well on a variety of relatively high-dimensional problems.
One of the key points of interest in terms of optimization is our use of natural gradients (in Hilbert spaces). The optimization problem we solve is non-convex, and standard gradient methods do not perform well. However, we show that the problem is convex on a certain positive cone and that natural gradient steps preserve this cone, whereas standard gradient steps tend to lose positivity. This is one of the few cases in the literature where the reasons for the practical preference for the natural gradient are clear.
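As a rough sketch of the mechanism (the paper's exact update may differ), a natural gradient step in a Hilbert space $\mathcal{H}$ with metric operator $G$ (so that $\langle u, v \rangle_{\mathcal{H}} = \langle u, G v \rangle_{L^2}$) preconditions the ordinary gradient of the loss $L$:
$$
f_{t+1} \;=\; f_t \;-\; \eta\, G^{-1} \nabla L(f_t),
$$
in contrast to the plain update $f_{t+1} = f_t - \eta\, \nabla L(f_t)$. The claim above is that the preconditioned step keeps the iterates inside the positive cone on which the problem is convex, while the plain step tends to leave it.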
In more detail, our approach is based on regularizing a version of a Sobolev norm of the density, and several core components enable the method. First, while there is no closed analytic form for the associated kernel, we show that one can approximate it by sampling. Second, appropriate initialization and natural gradients are used, as discussed above. Finally, while the approach produces unnormalized densities, which prevents the use of likelihood-based cross-validation, we show that one can instead adopt Fisher Divergence-based Score Matching methods for this task. We evaluate the resulting method on a comprehensive recent tabular anomaly detection benchmark suite containing more than 15 healthcare- and biology-oriented data sets (ADBench), and find that it ranks second best among more than 15 algorithms.
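For context on why score matching enables model selection here (a well-known identity due to Hyvärinen; the paper's exact criterion may differ in its details), the Fisher divergence between the data density $p$ and a model $q$,
$$
D_F(p \,\|\, q) \;=\; \frac{1}{2}\, \mathbb{E}_{x \sim p}\, \big\| \nabla_x \log p(x) - \nabla_x \log q(x) \big\|^2,
$$
equals, up to an additive constant independent of $q$ and under mild regularity conditions,
$$
\mathbb{E}_{x \sim p}\Big[ \tfrac{1}{2}\, \big\| \nabla_x \log q(x) \big\|^2 \;+\; \Delta_x \log q(x) \Big],
$$
which depends on $q$ only through its score $\nabla_x \log q$. The unknown normalizing constant therefore cancels, so the criterion can be estimated from held-out samples and used for hyperparameter tuning even with unnormalized densities.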
Submission Number: 36