Exploration of autoregressive models for in-context learning on tabular data

Published: 10 Oct 2024, Last Modified: 18 Oct 2024 · TRL @ NeurIPS 2024 Poster · CC BY 4.0
Keywords: In-context learning, TabPFN, Mamba, Tabular data
TL;DR: We compare the performance of different autoregressive architectures in the TabPFN/prior-fitted-networks framework, finding that Mamba does not perform as well as Transformer models.
Abstract: We explore different autoregressive model architectures for in-context learning on tabular datasets, trained in a manner similar to TabPFN. Specifically, we compare Transformer-based models with a structured state-space model architecture (Mamba) and a hybrid architecture (Jamba) that mixes Transformer and Mamba layers. We find that autoregressive Transformer models perform similarly to the original TabPFN Transformer architecture, albeit at the cost of a doubled context length. Mamba performs worse than similarly sized Transformer models, while hybrid models show promise in harnessing some advantages of state-space models, such as supporting long input context lengths and fast inference.
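The following is a minimal sketch, not taken from the paper, of why an autoregressive encoding roughly doubles the context length compared with a TabPFN-style encoding: it assumes the TabPFN-style model embeds each row (features plus label) as a single token, whereas the autoregressive model interleaves feature tokens and label tokens. All names and sizes are illustrative assumptions.

```python
import numpy as np

# Hypothetical dataset sizes: 128 in-context (training) rows, 32 query rows.
n_train, n_test, n_features = 128, 32, 10
rng = np.random.default_rng(0)
X_train = rng.normal(size=(n_train, n_features))
y_train = rng.integers(0, 2, size=n_train)
X_test = rng.normal(size=(n_test, n_features))

# TabPFN-style encoding (assumed): each row becomes one token whose embedding
# combines the feature vector with its label (query labels are masked), so the
# context length equals the total number of rows.
tabpfn_context_len = n_train + n_test

# Autoregressive encoding (assumed): features and labels appear as separate
# tokens in an interleaved sequence x_1, y_1, x_2, y_2, ..., so the in-context
# portion of the sequence is roughly twice as long.
autoregressive_context_len = 2 * n_train + n_test

print(tabpfn_context_len, autoregressive_context_len)  # 160 vs. 288
```

Under these assumed token layouts, the autoregressive sequence grows from 160 to 288 tokens for the same data, which is the kind of overhead the abstract refers to as a doubled context length.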
Submission Number: 44