Efficient, Few-shot Directed Evolution with Energy Rank Alignment

Published: 02 Mar 2026, Last Modified: 05 Mar 2026, Venue: GEM 2026, License: CC BY 4.0
Keywords: directed evolution, protein language model, alignment, post-training, protein engineering
Abstract: Directed evolution is a powerful and widely used technique for protein engineering, and reducing the cost of iterated experimental observations has become a major priority for practitioners. Recent efforts to improve sequence selection with machine-learning-based predictors have yielded remarkable gains in efficiency, but the sparse data available at each experimental iteration restricts these approaches to extremely simple models. Adapting large-scale pre-trained protein language models with experimental data offers an alternative: we show that it productively leverages the strong inductive biases of the natural distribution of protein sequences to navigate high-dimensional, combinatorially large fitness landscapes. Our approach uses a general-purpose "post-training" algorithm grounded in statistical physics that employs quantitative experimental rankings to directly produce a sampler for diverse, high-fitness sequences using fewer data points than competing methods.
Presenter: ~Sebastian_Ibarraran1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: No, the presenting author of this submission does not fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 67