State-Space Models for Tabular Prior-Data Fitted Networks

Published: 09 Jun 2025, Last Modified: 28 Jun 2025 · FMSD @ ICML 2025 · CC BY 4.0
Keywords: Prior-Data Fitted Network, Tabular Data, State-Space Models
TL;DR: This paper presents Hydra, a bidirectional Mamba-based model, as an alternative to the Transformer in TabPFN, enabling in-context learning on larger tables.
Abstract:

Recent advancements in foundation models for tabular data, such as TabPFN, have demonstrated that pretrained Transformer architectures can approximate Bayesian inference with high predictive performance. However, Transformers suffer from quadratic complexity with respect to sequence length, motivating the exploration of more efficient sequence models. In this work, we investigate the potential of Hydra, a bidirectional linear-time structured state-space model (SSM), as an alternative to the Transformer in TabPFN. A key challenge lies in the inherent sensitivity of SSMs to the order of input tokens – an undesirable property for tabular datasets, where the row order is semantically meaningless. We examine to what extent a bidirectional approach can preserve efficiency while enabling symmetric context aggregation. Our experiments show that this approach reduces the order dependence, achieving predictive performance competitive with the original TabPFN model.
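To make the bidirectional idea concrete, the following is a minimal sketch (not the authors' implementation) of symmetric context aggregation with a causal SSM: the same kind of block is run once left-to-right and once over the reversed sequence, and the two outputs are combined so every row conditions on the full context. The `ToySSMBlock` here is a simple diagonal linear recurrence for illustration only; Hydra's actual quasiseparable matrix mixer is not reproduced, and all class and parameter names are hypothetical.

```python
# Illustrative sketch only: a toy diagonal linear recurrence standing in for
# an SSM block, wrapped bidirectionally for symmetric context aggregation.
import torch
import torch.nn as nn

class ToySSMBlock(nn.Module):
    """Causal diagonal recurrence: h_t = a * h_{t-1} + u_t (toy stand-in)."""
    def __init__(self, dim: int):
        super().__init__()
        self.log_a = nn.Parameter(torch.zeros(dim))  # per-channel decay
        self.in_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); scan left to right over rows.
        a = torch.sigmoid(self.log_a)        # keep the recurrence stable
        u = self.in_proj(x)
        h = torch.zeros_like(u[:, 0])
        states = []
        for t in range(u.shape[1]):
            h = a * h + u[:, t]
            states.append(h)
        return self.out_proj(torch.stack(states, dim=1))

class BidirectionalSSM(nn.Module):
    """Run one block forward and one on the flipped sequence, then sum,
    so each row's representation sees context in both directions."""
    def __init__(self, dim: int):
        super().__init__()
        self.fwd = ToySSMBlock(dim)
        self.bwd = ToySSMBlock(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out_f = self.fwd(x)
        out_b = self.bwd(x.flip(1)).flip(1)  # reverse, scan, reverse back
        return out_f + out_b

# Usage: treat table rows as tokens. Bidirectionality makes aggregation
# symmetric, though the output is still not fully permutation-invariant.
x = torch.randn(2, 16, 32)                   # (batch, rows, features)
y = BidirectionalSSM(32)(x)
print(y.shape)                               # torch.Size([2, 16, 32])
```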

Submission Number: 61