Few-shot Protein Fitness Prediction via In-context Learning and Test-time Training

Felix Teufel; Aaron W Kollasch; Yining Huang; Ole Winther; Kevin K Yang; Pascal Notin; Debora Susan Marks

Published: 24 Sept 2025, Last Modified: 15 Oct 2025

Venue: NeurIPS2025-AI4Science Poster

License: CC BY 4.0

Track: Track 1: Original Research/Position/Education/Attention Track

Keywords: protein, fitness, fewshot, engineering, low-n, biology, stability, enzyme

TL;DR: A pretrained set transformer enables prediction of protein fitness from low-N observations.

Abstract: Accurately predicting protein fitness with minimal experimental data is a persistent challenge in protein engineering. We introduce PRIMO (PRotein In-context Mutation Oracle), a transformer-based framework that leverages in-context learning and test-time training to adapt rapidly to new proteins and assays without large task-specific datasets. By encoding sequence information, auxiliary zero-shot predictions, and sparse experimental labels from many assays as a unified token set in a pre-training masked-language modeling paradigm, PRIMO learns to prioritize promising variants through a preference-based loss function. Across diverse protein families and properties—including both substitution and indel mutations—PRIMO outperforms zero-shot and fully supervised baselines. This work underscores the power of combining large-scale pre-training with efficient test-time adaptation to tackle challenging protein design tasks where data collection is expensive and label availability is limited.

Submission Number: 216
