Interpretable Low-Dimensional Regression via Data-Adaptive Smoothing
Wesley Tansey, Jesse Thomason, James G. Scott
Jun 16, 2017 (modified: Jun 19, 2017) | ICML 2017 WHI Submission | readers: everyone
Abstract: We consider the problem of estimating a regression function in the common situation where the number of features is small, interpretability of the model is a high priority, and simple linear or additive models fail to provide adequate performance. To address this problem, we present GapTV, an approach that is conceptually related both to CART and to the more recent CRISP algorithm, a state-of-the-art alternative method for interpretable nonlinear regression. GapTV divides the feature space into blocks of constant value and fits the values of all blocks jointly via a convex optimization routine. Our method is fully data-adaptive, in that it incorporates highly robust routines for tuning all hyperparameters automatically. We compare our approach against CART and CRISP via both a complexity-accuracy tradeoff metric and a human study, demonstrating that GapTV is a more powerful and interpretable method.
TL;DR: The gap statistic can be used in combination with total variation denoising to perform accurate and interpretable smoothing for low-dimensional regression problems.
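
For readers unfamiliar with total variation denoising, the following is a minimal one-dimensional sketch of the idea of fitting piecewise-constant values through a convex problem. It is not the GapTV method and not the authors' solver: the function name `tv_denoise_1d`, the dual projected-gradient scheme, and the hand-picked penalty `lam` are all assumptions made for illustration, and the gap-statistic tuning mentioned above is not shown. The sketch minimizes 0.5 * sum((y_i - x_i)^2) + lam * sum(|x_{i+1} - x_i|).

```python
def tv_denoise_1d(y, lam, n_iter=5000, step=0.25):
    """Illustrative 1D total variation denoising (hypothetical helper).

    Solves the dual problem by projected gradient descent: one dual
    variable u_j per neighbouring pair (j, j+1), constrained to
    [-lam, lam]. The primal estimate is x_i = y_i - u_{i-1} + u_i,
    with the out-of-range dual terms treated as zero.
    """
    y = [float(v) for v in y]
    n = len(y)

    def primal(u):
        # Recover x from the dual variables.
        return [
            y[i]
            - (u[i - 1] if i > 0 else 0.0)
            + (u[i] if i < n - 1 else 0.0)
            for i in range(n)
        ]

    u = [0.0] * (n - 1)
    for _ in range(n_iter):
        x = primal(u)
        # Gradient step on the dual, then clip to the box [-lam, lam].
        # step = 0.25 is safe because the differencing operator D
        # satisfies ||D D^T|| < 4.
        u = [
            min(lam, max(-lam, u[j] + step * (x[j + 1] - x[j])))
            for j in range(n - 1)
        ]
    return primal(u)


# Usage: a noisy step signal; lam is chosen by hand here (an assumption).
noisy = [0.1, -0.2, 0.05, 0.0, 1.1, 0.9, 1.05, 0.95]
smoothed = tv_denoise_1d(noisy, lam=0.5)
print([round(v, 3) for v in smoothed])
```

Larger values of `lam` merge more neighbouring points into a shared constant value, while `lam = 0` returns the data unchanged. In this sketch that single parameter controls the tradeoff between fit and number of blocks.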