Bandwidth Selection for Gaussian Kernel Ridge Regression via Jacobian Control

Oskar Allerbo; Rebecka Jörnsten

20 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission

Primary Area: metric learning, kernel learning, and sparse coding

Keywords: Kernel Ridge Regression, Bandwidth Selection, Jacobian Regularization

TL;DR: Inspired by Jacobian regularization, we propose a closed-form bandwidth selector for kernel ridge regression with the Gaussian kernel, and demonstrate its superior speed and stability compared to cross-validation and marginal likelihood maximization.

Abstract: Most machine learning methods require tuning of hyper-parameters. For kernel ridge regression with the Gaussian kernel, the hyper-parameter is the bandwidth. The bandwidth specifies the length scale of the kernel and has to be carefully selected in order to obtain a model with good generalization. The default methods for bandwidth selection are cross-validation and marginal likelihood maximization, which often yield good results, albeit at high computational costs. Furthermore, the estimates provided by these methods tend to have very high variance, especially when training data are scarce. Inspired by Jacobian regularization, we formulate an approximate expression for how the derivatives of the functions inferred by kernel ridge regression with the Gaussian kernel depend on the kernel bandwidth. We then use this expression to propose a closed-form, computationally feather-light, bandwidth selection heuristic, based on controlling the Jacobian. In addition, the Jacobian expression illuminates how the bandwidth selection is a trade-off between the smoothness of the inferred function and the conditioning of the training data kernel matrix. We show on real and synthetic data that compared to cross-validation and marginal likelihood maximization, our method is considerably faster and considerably more stable in terms of bandwidth selection.
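The trade-off mentioned in the abstract, between the smoothness of the inferred function and the conditioning of the training kernel matrix, can be illustrated numerically. The following is a minimal sketch and not the authors' closed-form selector, which this page does not state. It assumes the kernel parameterization exp(-||x - x'||^2 / (2 sigma^2)) and a fixed ridge penalty `lam`; the helper names (`gaussian_kernel`, `krr_fit`, `krr_predict`, `diagnostics`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Kernel matrix with entries exp(-||a_i - b_j||^2 / (2 sigma^2)) (assumed form)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def krr_fit(x_train, y_train, sigma, lam):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)


def krr_predict(x_new, x_train, alpha, sigma):
    """Evaluate the fitted function at x_new."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha


def diagnostics(x_train, y_train, sigma, lam, x_grid):
    """Return (condition number of K + lam * I, largest finite-difference slope).

    x_grid must be one-dimensional inputs stored as a column, sorted ascending.
    """
    k = gaussian_kernel(x_train, x_train, sigma)
    cond = np.linalg.cond(k + lam * np.eye(len(x_train)))
    alpha = krr_fit(x_train, y_train, sigma, lam)
    f = krr_predict(x_grid, x_train, alpha, sigma)
    max_slope = np.max(np.abs(np.gradient(f, x_grid[:, 0])))
    return cond, max_slope


# Toy one-dimensional data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)[:, None]
y = np.sin(2.0 * np.pi * x[:, 0]) + 0.1 * rng.normal(size=20)
grid = np.linspace(0.0, 1.0, 400)[:, None]

for sigma in (0.01, 0.1, 1.0, 10.0):
    cond, slope = diagnostics(x, y, sigma, lam=1e-3, x_grid=grid)
    print(f"sigma={sigma:<5} cond={cond:.3e} max|df/dx|={slope:.3e}")
```

On this toy data, a very small bandwidth gives a kernel matrix close to the identity (well conditioned), while a very large bandwidth pushes it toward a rank-one matrix of ones (poorly conditioned). The printed slope column gives a rough finite-difference view of how the derivative of the fitted function varies with the bandwidth.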

Supplementary Material: zip

Submission Number: 2387
