[Short] Few-Shot Cross-Table Data Mixture in Tabular In-Context Learning: Benefits, Failure Modes, and Alignment
Keywords: Tabular foundation models
Abstract: Tabular foundation models show promise for structured data prediction, but unlike text and images, tabular datasets exhibit heterogeneous schemas and label semantics. This raises a critical question: does mixing tables during few-shot training improve in-context learning (ICL)? We systematically investigate cross-table training under controlled few-shot protocols, comparing single-table training against training augmented with auxiliary datasets. We identify severe negative transfer under naive mixing and propose two alignment strategies: feature-level matching via optimal transport (OT) and label semantics alignment via pseudo-labeling. Our key finding reveals an architectural divide: TabPFN-v2 and MITRA fail to benefit from cross-table augmentation, while representation-based models (TabICL) achieve a +1.02% average improvement. This indicates that cross-table learning requires learned embedding spaces in which semantic correspondences can be preserved across heterogeneous schemas.
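To make the feature-level OT alignment concrete, below is a minimal, hypothetical sketch of one plausible instantiation: columns of an auxiliary table are matched to columns of the target table by solving optimal transport over per-column summary statistics, then mapped barycentrically into the target schema. The helper names (`column_profile`, `match_columns`), the choice of statistics, and the use of the POT library are all illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch of feature-level matching via optimal transport (OT).
# Assumption: columns are matched by OT over per-column summary statistics;
# the submission's actual alignment procedure may differ.
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)

def column_profile(X: np.ndarray) -> np.ndarray:
    """Summarize each column by simple distributional statistics."""
    qs = np.quantile(X, [0.1, 0.25, 0.5, 0.75, 0.9], axis=0).T  # (d, 5)
    return np.hstack([X.mean(0, keepdims=True).T,   # per-column mean (d, 1)
                      X.std(0, keepdims=True).T,    # per-column std  (d, 1)
                      qs])                          # -> profile matrix (d, 7)

def match_columns(X_target: np.ndarray, X_aux: np.ndarray) -> np.ndarray:
    """Return an OT plan matching auxiliary columns to target columns."""
    P_t, P_a = column_profile(X_target), column_profile(X_aux)
    M = ot.dist(P_a, P_t)                  # pairwise squared-Euclidean cost
    a = np.full(P_a.shape[0], 1.0 / P_a.shape[0])  # uniform column masses
    b = np.full(P_t.shape[0], 1.0 / P_t.shape[0])
    return ot.emd(a, b, M)                 # exact OT plan, shape (d_a, d_t)

# Usage: project auxiliary rows into the target schema via the plan.
rng = np.random.default_rng(0)
X_t, X_a = rng.normal(size=(64, 5)), rng.normal(size=(128, 8))
G = match_columns(X_t, X_a)                        # (8, 5) transport plan
X_a_aligned = X_a @ (G / G.sum(0, keepdims=True))  # barycentric column map
```

Each aligned auxiliary column is a convex combination of the original columns weighted by the transport plan, so the augmented few-shot context shares the target table's dimensionality.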
Submission Number: 118