Generalization of Protein Foundation Models for Engineered Fluorescent Biosensors

Published: 28 May 2026, Last Modified: 28 May 2026
Venue: GenBio 2026 Poster
License: CC BY 4.0
Keywords: proteins, foundation models, biosensors, generalization
TL;DR: Evaluating the generalization capabilities of protein foundation models in sequence and fitness space for biosensor engineering
Abstract: Protein foundation models (PFMs) are increasingly used in fitness prediction tasks in which engineers seek to identify sequences with improved function. However, these models are often evaluated under random splits, which may not reflect the generalization required for engineering superior variants. Further, results from existing classical protein fitness benchmarks may not transfer to fluorescent protein biosensors, whose design requires finely tuned protein dynamics. We benchmark embeddings from seven PFMs across four regression heads on a dataset of 1,314 mutated GCaMP variants, evaluating each model–head pair under a random baseline and two extrapolation regimes: novel-region splits, which hold out protein sequence regions, and low-to-high fitness splits, which hold out the highest-fitness variants. Both extrapolative splits cause substantial performance drops across all models; structure- and MSA-based conditioning partially mitigates the drop on novel-region splits but not on low-to-high splits. The two extrapolation regimes also differ in ranking transfer: novel-region rankings closely track random-split rankings, while low-to-high rankings diverge. These findings indicate that model and architecture choices should be driven by which form of generalization the application requires, with no single configuration optimal across regimes.
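The three evaluation regimes named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `test_frac` parameter, the toy data, and in particular the rule that a variant is held out whenever any of its mutations falls inside the held-out region are all assumptions made for illustration, since the abstract does not specify these details.

```python
import random


def random_split(n, test_frac=0.2, seed=0):
    """Random baseline: shuffle variant indices and hold out a fraction."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_frac))
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def novel_region_split(mutated_positions, held_out_region):
    """Novel-region split (assumed rule): a variant goes to the test set
    if any of its mutated positions lies in the held-out region.

    mutated_positions: list of sets of residue positions, one per variant.
    held_out_region: (start, end) residue positions, inclusive.
    """
    start, end = held_out_region
    train, test = [], []
    for i, positions in enumerate(mutated_positions):
        if any(start <= p <= end for p in positions):
            test.append(i)
        else:
            train.append(i)
    return train, test


def low_to_high_split(fitness, test_frac=0.2):
    """Low-to-high split: train on the lowest-fitness variants and
    hold out the top `test_frac` by fitness."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    n_test = int(round(len(fitness) * test_frac))
    cut = len(order) - n_test
    return sorted(order[:cut]), sorted(order[cut:])


# Hypothetical toy data: five variants with mutated positions and fitness.
positions = [{10}, {12, 300}, {305}, {50}, {310, 11}]
fitness = [0.1, 0.9, 0.4, 0.7, 0.2]

region_train, region_test = novel_region_split(positions, (300, 320))
l2h_train, l2h_test = low_to_high_split(fitness, test_frac=0.4)
rand_train, rand_test = random_split(len(fitness), test_frac=0.4)
```

In this sketch the low-to-high test set contains only fitness values above every training value, so a regression head must extrapolate in fitness space, whereas the novel-region test set requires extrapolation in sequence space.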
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 127