RaBEL: Scale-Aware Radial-Basis Embeddings for Tabular Foundation Models

15 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Embeddings, Tabular data, Foundation models
Abstract: Recent tabular foundation models routinely match or surpass strong tree ensembles and specialized deep architectures, yet their numeric embeddings remain a bottleneck. We diagnose a low-rank collapse induced by the prevalent linear+ID embedding scheme and introduce RaBEL, a compact Radial Basis Embedding Layer that front-loads nonlinearity via localized RBF features. RaBEL increases shallow-layer effective rank and improves conditioning without requiring deeper stacks, and it is complementary to periodic mappings. We further identify a permutation-order pathology in bidirectional attention (feature$\rightarrow$sample) and propose a reordered stack: sample-attention $\rightarrow$ FFN $\rightarrow$ feature-attention, ensuring that column-level context precedes feature mixing and that all attention computations influence the readout. Combining both ideas yields MiniX, a 2M-parameter model that surpasses the 7M-parameter TabPFN-v2 and 27M-parameter TabICL baselines on popular benchmarks while reducing training and inference cost. Our results highlight principled nonlinear embeddings and attention-order redesign as key enablers of accuracy and efficiency gains in tabular foundation models.
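The core mechanism the abstract describes, replacing a linear numeric embedding with localized RBF features, can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the paper's implementation: the function name `rbf_embed`, the uniform-grid centers, and the single shared bandwidth `gamma` are all hypothetical choices for exposition; the paper's RaBEL layer presumably learns or adapts these.

```python
import numpy as np

def rbf_embed(x, centers, gamma=1.0):
    """Map each scalar feature value to K localized RBF activations.

    x:       (n_samples, n_features) numeric table
    centers: (K,) RBF centers, e.g. a uniform grid or training quantiles
    gamma:   inverse-bandwidth; larger gamma gives more localized bumps
    Returns an (n_samples, n_features, K) embedding tensor.
    """
    # Broadcast the distance of every cell value to every center.
    d = x[..., None] - centers            # (n_samples, n_features, K)
    return np.exp(-gamma * d ** 2)        # localized, nonlinear features

# Toy usage: 4 samples, 2 numeric features, 8 RBF centers on [0, 1].
x = np.array([[0.1, 0.9], [0.5, 0.2], [0.3, 0.7], [0.8, 0.4]])
centers = np.linspace(0.0, 1.0, 8)
emb = rbf_embed(x, centers, gamma=16.0)
print(emb.shape)  # (4, 2, 8)
```

Because each value activates only the few bumps near it, the per-token embedding is nonlinear in the raw input from the very first layer, which is the intuition behind the claimed increase in shallow-layer effective rank relative to a linear+ID scheme.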
Primary Area: foundation or frontier models, including LLMs
Submission Number: 5392