Submission Type: Short paper (4 pages)
Keywords: tabular foundation models, feature engineering, task-agnostic embedding, outlier detection, classification, regression
TL;DR: Task-agnostic representation learning performance of simple feature engineering is comparable to resource-intensive tabular foundation models.
Abstract: Recent foundation models for tabular data achieve strong task-specific performance via in-context learning. Nevertheless, they focus on direct prediction by encapsulating both representation learning and task-specific inference inside a single, resource-intensive network. This work specifically focuses on representation learning, i.e., on transferable, task-agnostic embeddings. We systematically evaluate task-agnostic representations from tabular foundation models (TabPFN and TabICL) alongside with classical feature engineering (TableVectorizer) across a variety of application tasks as outlier detection (ADBench) and supervised learning (TabArena Lite). We find that simple TableVectorizer features achieve comparable or superior performance while being up to three orders of magnitude faster than tabular foundation models. The code is available at https://github.com/ContactSoftwareAI/TabEmbedBench.
Submission Number: 23
Loading