TabPFN-Wide: Continued Pre-Training for Extreme Feature Counts

Christopher Kolberg; Katharina Eggensperger; Nico Pfeifer

02 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0

Keywords: TabPFNv2, Tabular Foundation Models, High Dimensionality Low Sample Size, In-Context Learning, Interpretability

Abstract: Revealing novel insights from the relationship between molecular measurements and pathology remains a high-impact application of machine learning in biomedicine. Data in this domain typically contain only a few observations but thousands of potentially noisy features, posing challenges for conventional machine learning approaches. While prior-data fitted networks have emerged as foundation models for tabular data, they are currently not suited to large feature counts ($>500$). Although feature reduction enables their application, it hinders feature importance analysis. We propose a strategy that extends existing models through continued pre-training on synthetic data sampled from a customized prior. The resulting model, TabPFN-Wide, matches or exceeds its base model's performance while exhibiting improved robustness to noise. It scales beyond $50{,}000$ features, regardless of noise levels, while maintaining inherent interpretability, which is critical for biomedical applications. Our results show that prior-informed adaptation is a suitable way to enhance the capability of foundation models for high-dimensional data. On real-world biomedical datasets, many of the most relevant features identified by the model overlap with previous biological findings, while others suggest potential starting points for future studies.

Supplementary Material: zip

Primary Area: foundation or frontier models, including LLMs

Submission Number: 847
