Interpretable Low-Dimensional Regression via Data-Adaptive Smoothing

Wesley Tansey, Jesse Thomason, James G. Scott

Jun 16, 2017 (modified: Jun 19, 2017) ICML 2017 WHI Submission readers: everyone
  • Abstract: We consider the problem of estimating a regression function in the common situation where the number of features is small, where interpretability of the model is a high priority, and where simple linear or additive models fail to provide adequate performance. To address this problem, we present GapTV, an approach that is conceptually related both to CART and to the more recent CRISP algorithm, a state-of-the-art alternative method for interpretable nonlinear regression. GapTV divides the feature space into blocks of constant value and fits the value of all blocks jointly via a convex optimization routine. Our method is fully data-adaptive, in that it incorporates highly robust routines for tuning all hyperparameters automatically. We compare our approach against CART and CRISP via both a complexity-accuracy tradeoff metric and a human study, demonstrating that GapTV is a more powerful and interpretable method.
  • TL;DR: The gap statistic can be used in combination with total variation denoising to perform accurate and interpretable smoothing for low-dimensional regression problems.
  • Authorids: tansey@cs.utexas.edu, jesse@cs.utexas.edu, james.scott@mccombs.utexas.edu
  • Keywords: interpretability, smoothing, convex regression
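
To make the "blocks of constant value fit jointly via convex optimization" idea concrete, here is a minimal sketch of total variation denoising in one dimension. It is not the authors' implementation: it assumes the feature space has already been cut into an ordered row of blocks, takes one noisy value per block (for example a block mean), uses a fixed penalty `lam` rather than the automatic gap-statistic tuning the abstract describes, and solves the problem with a simple projected-gradient iteration on the dual chosen here for brevity. The function name `tv_denoise_1d` is hypothetical.

```python
def tv_denoise_1d(y, lam, n_iter=5000):
    """Approximately minimize 0.5 * sum((b[i] - y[i])**2)
    + lam * sum(|b[i+1] - b[i]|) over b (illustrative 1D sketch only).

    y   : one noisy value per block, in order along a single feature
    lam : fixed smoothing penalty (an assumption; GapTV tunes this itself)
    """
    n = len(y)
    if n < 2 or lam <= 0:
        return list(y)

    # One dual variable per pair of adjacent blocks, kept in [-lam, lam].
    p = [0.0] * (n - 1)
    step = 0.25  # safe step size for the chain difference operator

    def primal(p):
        # Recover block values from the dual: b = y - D^T p,
        # where (D b)[i] = b[i+1] - b[i].
        b = []
        for j in range(n):
            left = p[j - 1] if j > 0 else 0.0
            right = p[j] if j < n - 1 else 0.0
            b.append(y[j] - left + right)
        return b

    for _ in range(n_iter):
        b = primal(p)
        for i in range(n - 1):
            # Gradient ascent step on the dual, then clip to the box.
            updated = p[i] + step * (b[i + 1] - b[i])
            p[i] = max(-lam, min(lam, updated))

    return primal(p)


if __name__ == "__main__":
    noisy = [0.0, 0.0, 4.0, 4.0]
    print(tv_denoise_1d(noisy, lam=1.0))   # two plateaus, pulled toward each other
    print(tv_denoise_1d(noisy, lam=10.0))  # heavy penalty merges everything into one block
```

With a small penalty the fit keeps separate plateaus; with a large one neighbouring blocks fuse into a single constant region, which is the mechanism that trades accuracy for a simpler, more interpretable partition.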
