Keywords: protein language models, steering, controllability, low-resource data
TL;DR: We investigate weight- and activation-based steering methods for protein language models and find that they have complementary strengths across different biochemical traits.
Abstract: Protein language models (PLMs) are powerful tools for protein engineering, but they remain difficult to steer toward specific biochemical properties, where small sequence changes can affect stability or function. We adapt two prominent unsupervised editing methods: task arithmetic (TA; specifically, Forgetting via Negation) in weight space and feature editing with a sparse autoencoder (SAE) in activation space. We evaluate their effects on six biochemical properties of sequences generated by three PLMs (ESM3, ProGen2-Large, and ProLLaMA): net charge at pH 7, hydrophobicity, aromaticity, instability index, molecular weight, and isoelectric point. Across models, we observe complementary efficacy: TA more effectively controls some properties, while SAE editing more effectively controls others. Property response patterns show some consistency across models. We suggest that the response patterns of biochemical properties should be considered when steering PLMs.
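For context, Forgetting via Negation subtracts a task vector (the difference between fine-tuned and base weights) from the base model. Below is a minimal PyTorch sketch of that weight-space edit; `alpha` and the usage names are illustrative assumptions, not the paper's exact configuration:

```python
import torch

def forget_via_negation(base_state: dict, finetuned_state: dict, alpha: float = 1.0) -> dict:
    """Task arithmetic (Forgetting via Negation):
    theta_edited = theta_base - alpha * (theta_ft - theta_base).

    base_state / finetuned_state: state_dicts of the same architecture.
    alpha: hypothetical scaling coefficient for the task vector.
    """
    edited = {}
    for name, theta_base in base_state.items():
        tau = finetuned_state[name] - theta_base  # task vector for this tensor
        edited[name] = theta_base - alpha * tau   # negate the task direction
    return edited

# Illustrative usage (model names are assumptions):
# base = plm_base.state_dict(); ft = plm_finetuned_on_property.state_dict()
# plm_base.load_state_dict(forget_via_negation(base, ft, alpha=0.5))
```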
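The activation-space counterpart edits SAE latent features of a chosen layer's activations and writes the reconstruction back (e.g., via a forward hook). A minimal sketch, assuming a simple ReLU SAE and a single steered feature; the architecture, `feature_idx`, and `scale` are assumptions rather than the paper's setup:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE over PLM activations (illustrative, not the paper's exact architecture)."""
    def __init__(self, d_model: int, d_latent: int):
        super().__init__()
        self.enc = nn.Linear(d_model, d_latent)
        self.dec = nn.Linear(d_latent, d_model)

    def forward(self, x):
        z = torch.relu(self.enc(x))  # sparse latent features
        return self.dec(z), z

def steer_activations(sae: SparseAutoencoder, acts: torch.Tensor,
                      feature_idx: int, scale: float) -> torch.Tensor:
    """Rescale one SAE feature and reconstruct the activation.

    The SAE's reconstruction error is added back so that only the
    feature edit, not imperfect reconstruction, changes the stream.
    """
    recon, z = sae(acts)
    z_edit = z.clone()
    z_edit[..., feature_idx] = scale * z_edit[..., feature_idx]
    return sae.dec(z_edit) + (acts - recon)
```

Preserving the reconstruction error is a common design choice in SAE steering: it keeps the unedited components of the activation untouched.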
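The six evaluated properties can all be computed from a sequence with Biopython's `ProteinAnalysis`; this is an assumed implementation for illustration, as the abstract does not specify the property calculators used:

```python
from Bio.SeqUtils.ProtParam import ProteinAnalysis

def biochemical_properties(seq: str) -> dict:
    """Compute the six steered properties for a generated protein sequence."""
    pa = ProteinAnalysis(seq)
    return {
        "net_charge_pH7": pa.charge_at_pH(7.0),
        "hydrophobicity": pa.gravy(),  # Kyte-Doolittle GRAVY score
        "aromaticity": pa.aromaticity(),
        "instability_index": pa.instability_index(),
        "molecular_weight": pa.molecular_weight(),
        "isoelectric_point": pa.isoelectric_point(),
    }

# Example: biochemical_properties("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
```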
Submission Number: 30