Regularization Cocktails for Tabular Datasets

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission
Keywords: deep learning, regularization, hyperparameter optimization, benchmarks.
Abstract: The regularization of prediction models is arguably the most crucial ingredient that allows Machine Learning solutions to generalize well on unseen data. Several types of regularization are popular in the Deep Learning community (e.g., weight decay, dropout, early stopping, etc.), but so far these are selected on an ad-hoc basis, and there is no systematic study as to how different regularizers should be combined into the best “cocktail”. In this paper, we fill this gap by considering cocktails of 13 different regularization methods and framing the question of how to best combine them as a standard hyperparameter optimization problem. We perform a large-scale empirical study on 40 tabular datasets and conclude that, firstly, regularization cocktails substantially outperform individual regularization methods, even when the hyperparameters of the latter are carefully tuned; secondly, the optimal regularization cocktail depends on the dataset; and thirdly, regularization cocktails yield state-of-the-art performance in classifying tabular datasets, outperforming Gradient-Boosted Decision Trees.
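To make the framing concrete, below is a minimal sketch (not the authors' implementation) of treating "which regularizers, at which strengths" as a single hyperparameter optimization problem. It uses Optuna and scikit-learn's MLPClassifier, and it covers only two of the 13 regularizers from the paper (weight decay and early stopping); each regularizer gets an on/off switch, and its strength is sampled only when it is switched on (a conditional search space). The dataset, model, and search ranges here are illustrative assumptions.

```python
# Sketch: regularization-cocktail search as standard HPO (illustrative only).
import optuna
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Toy tabular classification data; the paper uses 40 real tabular datasets.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

def objective(trial: optuna.Trial) -> float:
    # Each regularizer has a boolean switch; its hyperparameter is only
    # sampled when the switch is on (conditional search space).
    use_wd = trial.suggest_categorical("use_weight_decay", [True, False])
    alpha = trial.suggest_float("alpha", 1e-6, 1e-1, log=True) if use_wd else 0.0
    early = trial.suggest_categorical("use_early_stopping", [True, False])

    model = MLPClassifier(
        hidden_layer_sizes=(64,),
        alpha=alpha,           # L2 weight decay strength (0.0 = off)
        early_stopping=early,  # holds out part of the training data if on
        max_iter=200,
        random_state=0,
    )
    # Validation accuracy is the HPO objective.
    return cross_val_score(model, X, y, cv=3).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print("Best cocktail:", study.best_params)
```

In the full problem, the search space is the joint space of all 13 on/off switches plus each method's conditional hyperparameters, so the optimizer simultaneously selects the cocktail's ingredients and tunes their dosages per dataset.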
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: An empirical study on the optimal combination of regularization methods.
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=OZDyJjhcdc