Steering Sequence Generation in Protein Language Models through Iterative Lookback Monte Carlo Sampling
Keywords: Computational Biology, Protein Language Models, Controllable Sequence Design, Monte Carlo, Maximum Entropy Sampling
TL;DR: This paper introduces Iterative Lookback Monte Carlo (ILMC), a training-free sampling strategy for protein language models that combines autoregressive generation with Monte Carlo refinement to steer sequence generation toward desired properties.
Abstract: Protein language models (pLMs) leverage large-scale evolutionary data to generate novel sequences, but steering generation toward desired physicochemical properties without sacrificing diversity remains a major challenge. Existing approaches often induce severe diversity loss or require computationally expensive retraining. We introduce \textbf{Iterative Lookback Monte Carlo} (ILMC), a training-free inference-time sampling strategy that interleaves autoregressive elongation with Metropolis--Hastings refinement to approximate sampling from a maximum-entropy target distribution balancing generative quality and steering objectives. We show theoretically that this target distribution is entropy-maximizing under fixed generative quality and steering constraints, and empirically that ILMC produces more diverse samples than standard autoregressive baselines at matched generative quality. Using simple steering potentials, ILMC improves desired molecular properties, including generating proteins with up to $12^\circ\mathrm{C}$ higher predicted melting temperature than compute-matched alternative strategies. ILMC naturally applies to classifier-guided steering, where it outperforms purely autoregressive guidance in diversity while maintaining comparable enrichment of target properties. We validate ILMC on family-specific pLMs and on the multi-family model ProGen3.
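The interleaving of autoregressive elongation with Metropolis--Hastings refinement described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the "pLM" is a random first-order Markov chain, the steering potential $E(x)$ simply rewards hydrophobic residues, the target is taken to have the maximum-entropy form $\pi(x) \propto p_\theta(x)^{1/T} \exp(-\beta E(x))$, and the names `make_toy_model`, `ilmc_sketch`, `lookback`, and `mh_steps` are hypothetical. The paper's actual proposal distribution, lookback schedule, and potentials may differ.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVW")  # illustrative choice of residues to reward


def make_toy_model(rng):
    """Random first-order Markov chain standing in for a pLM (hypothetical)."""
    model = {}
    for prev in "^" + ALPHABET:  # "^" marks the start of a sequence
        weights = [rng.random() + 0.1 for _ in ALPHABET]
        total = sum(weights)
        model[prev] = {a: math.log(w / total) for a, w in zip(ALPHABET, weights)}
    return model


def log_p(model, seq):
    """Log-likelihood of a sequence under the toy model."""
    prev, total = "^", 0.0
    for a in seq:
        total += model[prev][a]
        prev = a
    return total


def energy(seq):
    """Toy steering potential: lower energy means more hydrophobic residues."""
    return -sum(a in HYDROPHOBIC for a in seq)


def log_target(model, seq, beta, temperature=1.0):
    """Unnormalized log of the assumed target p(x)^(1/T) * exp(-beta * E(x))."""
    return log_p(model, seq) / temperature - beta * energy(seq)


def sample_next(model, seq, rng):
    """One autoregressive elongation step."""
    prev = seq[-1] if seq else "^"
    tokens = list(ALPHABET)
    probs = [math.exp(model[prev][a]) for a in tokens]
    return rng.choices(tokens, weights=probs, k=1)[0]


def ilmc_sketch(model, length, beta, lookback=5, mh_steps=10, seed=0):
    """Alternate elongation with MH refinement of the last `lookback` sites."""
    rng = random.Random(seed)
    seq = []
    for _ in range(length):
        seq.append(sample_next(model, seq, rng))
        start = max(0, len(seq) - lookback)
        current = log_target(model, seq, beta)
        for _ in range(mh_steps):
            pos = rng.randrange(start, len(seq))
            old = seq[pos]
            seq[pos] = rng.choice(ALPHABET)  # symmetric single-site proposal
            delta = log_target(model, seq, beta) - current
            if delta >= 0 or rng.random() < math.exp(delta):
                current += delta  # accept the mutation
            else:
                seq[pos] = old  # reject and restore the previous residue
    return "".join(seq)
```

Calling `ilmc_sketch(make_toy_model(random.Random(0)), 60, beta=50.0)` and comparing against `beta=0.0` shows the steering effect in this toy setting: larger `beta` weights the potential more heavily relative to the model likelihood. The sketch rescores the full sequence on every proposal for clarity; a practical version would presumably score only the affected positions.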
Submission Number: 25