TabPalooza: A Benchmark Odyssey for Tabular Model Evaluation

15 Sept 2025 (modified: 21 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: tabular data, deep learning, deep tabular prediction, machine learning
Abstract: Tabular data is fundamental to machine learning, yet the lack of a widely accepted, comprehensive benchmark hinders reliable evaluation of models ranging from tree-based methods and neural networks to more recent in-context learning approaches. Existing benchmarks often cover only a narrow range of dataset meta-features, leading to inconsistent model rankings and limited generalizability. To address these issues, this study constructs a new benchmark for tabular classification and regression, designed with an explicit focus on two key, often competing, characteristics: Diversity and Efficiency. We propose a pipeline to quantitatively assess benchmark diversity and introduce a method for selecting a representative subset of datasets. Our results show that the proposed benchmark achieves greater diversity than existing alternatives while maintaining evaluation efficiency. The main contributions are the new benchmark TabPalooza, the evaluation pipeline, and an empirical validation of the benchmark's improved coverage. TabPalooza is available at https://huggingface.co/datasets/data-hub-xyz987/TabPalooza.
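
Since the benchmark is distributed through the Hugging Face Hub, the sketch below shows one plausible way to list its tasks and load one of them with the `datasets` library. The repository id comes from the abstract; the assumption that each benchmark task is exposed as a dataset configuration, and any split names, are placeholders rather than a documented interface of the repository.

```python
# Minimal sketch for exploring the TabPalooza repository on the Hugging Face Hub.
# Assumes the benchmark exposes each task as a dataset configuration; check the
# dataset page for the actual layout.
from datasets import get_dataset_config_names, load_dataset

REPO_ID = "data-hub-xyz987/TabPalooza"

# List the benchmark tasks (dataset configurations) published in the repo.
configs = get_dataset_config_names(REPO_ID)
print(f"{len(configs)} tasks found:", configs[:5])

# Load the first task; the available splits (e.g. "train"/"test") depend on
# how the repository is organized and may differ from this assumption.
task = load_dataset(REPO_ID, configs[0])
print(task)
print(task[next(iter(task))].column_names)
```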
Primary Area: datasets and benchmarks
Submission Number: 5396