Zero-shot Meta-learning for Tabular Prediction Tasks with Adversarially Pre-trained Transformer

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: We present an Adversarially Pre-trained Transformer (APT) that performs zero-shot meta-learning on tabular prediction tasks without using any real-world dataset for pre-training, extending the recent development of Prior-Data Fitted Networks (PFNs) and TabPFN. Specifically, APT is pre-trained with adversarial synthetic data agents, which continually shift their underlying data-generating distribution and deliberately challenge the model with different synthetic datasets. In addition, we propose a mixture block model architecture that can handle classification tasks with an arbitrary number of classes, addressing the class-count limitation -- a crucial weakness of prior tabular zero-shot learning algorithms. In experiments, we show that our framework matches state-of-the-art performance on small tabular classification tasks without filtering on dataset characteristics such as the number of classes or the number of missing values, while maintaining an average runtime under one second. On common benchmark suites in both classification and regression, we show that adversarial pre-training enhances TabPFN's performance. In our analysis, we demonstrate that the adversarial synthetic data agents generate a more diverse collection of data than the ordinary random generator in TabPFN, and that our mixture block neural design improves generalizability and greatly accelerates pre-training.
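To make the pre-training dynamic concrete, below is a minimal, self-contained PyTorch sketch of a min-max loop in the spirit of the abstract. Everything here is an illustrative placeholder rather than APT's actual design: `TinyPFN`, `SyntheticDataAgent`, `meta_loss`, and all hyperparameters are assumptions, and the real architecture (including the mixture block) lives in the linked repository.

```python
import torch
import torch.nn as nn

class SyntheticDataAgent(nn.Module):
    """Toy adversarial data agent: maps Gaussian noise through a learnable
    nonlinear map to produce a synthetic tabular task (features X, labels y)."""
    def __init__(self, n_features=8, n_classes=3, hidden=32):
        super().__init__()
        self.feature_map = nn.Sequential(
            nn.Linear(n_features, hidden), nn.Tanh(), nn.Linear(hidden, n_features)
        )
        self.label_map = nn.Linear(n_features, n_classes)

    def sample_task(self, n_rows=128, n_features=8):
        z = torch.randn(n_rows, n_features)
        x = self.feature_map(z)
        y = self.label_map(x).argmax(dim=-1)  # hard labels; no gradient via argmax
        return x, y

class TinyPFN(nn.Module):
    """Minimal PFN-style in-context learner: attends over labeled rows to
    predict labels for unlabeled query rows in a single forward pass."""
    def __init__(self, n_features=8, n_classes=3, d_model=64):
        super().__init__()
        self.n_classes = n_classes
        self.x_embed = nn.Linear(n_features, d_model)
        self.y_embed = nn.Embedding(n_classes + 1, d_model)  # extra id = "unknown"
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x_train, y_train, x_query):
        unknown = torch.full((x_query.shape[0],), self.n_classes, dtype=torch.long)
        x = torch.cat([x_train, x_query], dim=0)
        y = torch.cat([y_train, unknown], dim=0)
        h = self.x_embed(x) + self.y_embed(y)
        h = self.encoder(h.unsqueeze(0)).squeeze(0)  # one task = one "sequence"
        return self.head(h[x_train.shape[0]:])       # logits at query positions

def meta_loss(model, x, y, n_train=96):
    """In-context loss: condition on a labeled split, predict the held-out rows."""
    logits = model(x[:n_train], y[:n_train], x[n_train:])
    return nn.functional.cross_entropy(logits, y[n_train:])

model, agent = TinyPFN(), SyntheticDataAgent()
opt_model = torch.optim.Adam(model.parameters(), lr=1e-4)
opt_agent = torch.optim.Adam(agent.parameters(), lr=1e-3)

for step in range(1000):
    # Model step: minimize the in-context prediction loss on a fresh task.
    x, y = agent.sample_task()
    loss = meta_loss(model, x.detach(), y)
    opt_model.zero_grad(); loss.backward(); opt_model.step()

    # Agent step: shift the data-generating distribution to *maximize*
    # the model's loss, keeping the synthetic tasks deliberately challenging.
    x, y = agent.sample_task()
    adv_loss = -meta_loss(model, x, y)
    opt_agent.zero_grad(); adv_loss.backward(); opt_agent.step()
```

The point the sketch illustrates is the alternating objective: the transformer descends on the in-context loss while the agent ascends on it, so the synthetic task distribution keeps moving rather than staying fixed as in TabPFN's static prior.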
Lay Summary: In traditional artificial intelligence (AI), researchers teach an AI model the patterns of specific tasks (e.g., predicting tomorrow's weather, answering someone's questions) by optimizing the model on past data from those tasks. In this work, we teach an AI model how to learn the patterns of unseen tasks on its own (i.e., we teach it how to learn), without optimizing it on any past data from those tasks. This is called zero-shot meta-learning. It is a very ambitious goal, and while prior work has made progress by drawing inspiration from recent advances in large language models, it only achieved this on a constrained family of tasks with structured, tabular data. We propose an improved approach that relieves some of these constraints and yields better performance by establishing an AI adversary that generates difficult tasks for the model to learn. This adversary is like a competent chess rival to the model -- while the model makes moves to improve its ability to solve the tasks the adversary produces, the adversary makes moves to produce more difficult tasks for the model to solve. By pitting the two against each other, both the model's problem-solving ability and the adversary's problem-creating ability improve over time. Our proposal is independent of most recent advances in zero-shot meta-learning, and hence can be combined with them without conflict. To help other researchers explore these combinations and build on this idea, we have released a free and easy-to-use tool called APT, along with its configuration settings.
Link To Code: https://github.com/yulun-rayn/APT
Primary Area: General Machine Learning->Transfer, Multitask and Meta-learning
Keywords: zero-shot, meta-learning, adversarial training, tabular deep learning, Bayesian inference, in-context learning
Submission Number: 12677