Mind the Data: Measuring the Performance Gap Between Tree Ensembles and Deep Learning on Tabular Data
Abstract: Recent machine learning studies on tabular data show that ensembles of decision trees are more efficient and performant than deep learning models such as Tabular Transformers. However, as we demonstrate, these studies are limited in scope and do not paint the full picture. In this work, we focus on how two dataset properties, namely dataset size and feature complexity, affect the empirical performance comparison between tree ensembles and Tabular Transformer models. Specifically, we employ a hypothesis-driven approach and identify situations where Tabular Transformer models are expected to outperform tree ensembles. Through empirical evaluation, we demonstrate that, given large enough datasets, deep learning models perform better than tree models. This advantage becomes more pronounced when complex feature interactions exist in the given task and dataset, suggesting that one must pay careful attention to dataset properties when selecting a model for tabular data, especially in industrial settings, where increasingly large datasets with less carefully engineered features are routinely becoming available.