Abstract: Tabular data is widely used in various machine learning tasks. Current tabular learning research predominantly focuses on closed environments, whereas real-world applications often involve open environments, where distribution and feature shifts occur and significantly degrade model performance. Previous research has concentrated primarily on mitigating distribution shifts, whereas feature shifts, a distinctive and unexplored challenge of tabular data, have received limited attention. To this end, this paper conducts the first comprehensive study of feature shifts in tabular data and introduces the first **tab**ular **f**eature-**s**hift **bench**mark (TabFSBench). TabFSBench evaluates the impact of four distinct feature-shift scenarios on four categories of tabular models across various datasets, and is the first tabular benchmark to assess the performance of large language models (LLMs) and tabular LLMs. Our study yields three main observations: (1) most tabular models have limited applicability in feature-shift scenarios; (2) the importance of the shifted feature set has a linear relationship with model performance degradation; (3) model performance in closed environments correlates with feature-shift performance. A future research direction is also explored for each observation.
Benchmark: [LAMDASZ-ML/TabFSBench](https://github.com/LAMDASZ-ML/TabFSBench).
Lay Summary: Tabular data is widely used in machine learning, but current research mainly focuses on closed environments. In real-world applications, however, data often comes from open environments where distribution shifts and feature shifts occur, significantly degrading model performance. While previous work has primarily addressed distribution shifts, feature shifts—a unique and understudied challenge in tabular data—have received little attention.
This paper presents the first comprehensive study on feature shifts in tabular data and introduces **TabFSBench**, the first benchmark for evaluating feature shifts in tabular learning. TabFSBench assesses the impact of four different feature-shift scenarios across multiple datasets and model categories, including large language models (LLMs) and tabular-specific LLMs, marking their first evaluation in this setting.
Key findings include:
- Most tabular models struggle with feature shifts.
- The importance of shifted features has a linear relationship with performance degradation.
- Model performance in closed environments correlates with feature-shift robustness.
Based on these insights, we outline future research directions. This work provides new perspectives on handling feature shifts in tabular data.
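To make the second finding concrete, here is a minimal, hypothetical sketch of feature-shift evaluation: masking increasingly important feature subsets at test time and measuring the resulting performance drop. This is not the TabFSBench API; the dataset, model, and mean-imputation masking scheme are illustrative assumptions.

```python
# Hypothetical sketch (not the TabFSBench API): simulate a feature-shift
# scenario by replacing the most important test-time features with their
# training means and observing how accuracy degrades.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Rank features by importance; shift the most important ones first.
order = np.argsort(model.feature_importances_)[::-1]

for k in [0, 2, 4, 8]:
    X_shift = X_test.copy()
    # Masking top-k features with training means is a simple proxy for
    # features that are missing or unreliable in an open environment.
    X_shift[:, order[:k]] = X_train[:, order[:k]].mean(axis=0)
    acc = accuracy_score(y_test, model.predict(X_shift))
    print(f"masked top-{k} features: accuracy = {acc:.3f}")
```

In this toy setup, accuracy typically falls as more important features are masked, mirroring the paper's observation that shifted-feature importance relates to performance degradation.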
Link To Code: https://github.com/LAMDASZ-ML/TabFSBench
Primary Area: Deep Learning->Robustness
Keywords: tabular data, feature shift, open environment
Submission Number: 8834