RankFlow: Property-aware Transport for Protein Optimization

Published: 26 Jan 2026, Last Modified: 26 Feb 2026
Venue: ICLR 2026 Poster
License: CC BY 4.0
Keywords: protein language models, fitness prediction
Abstract: A key step in protein optimization is modeling the fitness landscape, which maps proteins to functional assay readouts. Existing methods typically either use property-agnostic likelihoods or embeddings from pretrained protein language models (PLMs) for fitness prediction, or assume independent mutational effects, which limits their ability to capture higher-order interactions. In this work, we introduce RankFlow, a conditional flow framework that refines PLM representations into a property-aligned distribution via a tailored energy function and captures multi-mutation interactions through learnable embeddings. To align optimization with evaluation protocols, we propose the Rank-Consistent Conditional Flow Loss (RC$^2$), a differentiable ranking objective that enforces the correct order of mutants rather than their absolute values, thereby improving out-of-distribution generalization. Finally, we introduce a Property-guided Steering Gate (PSG) that concentrates learning on positions carrying signal for the target property while suppressing unrelated evolutionary biases. Across the ProteinGym, PEER, and FLIP benchmarks, RankFlow obtains state-of-the-art ranking accuracy and superior generalization performance.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 1203
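
Illustrative note: the abstract describes RC$^2$ as a differentiable objective that rewards the correct ordering of mutants rather than matching absolute fitness values. The listing does not give the RC$^2$ formula, so the sketch below is not the authors' loss. It is a generic pairwise logistic ranking loss, shown only to make the idea of an order-based objective concrete. The function name `pairwise_rank_loss` and the toy inputs are assumptions introduced for illustration.

```python
import math


def pairwise_rank_loss(pred, target):
    """Generic pairwise logistic ranking loss (hypothetical helper, not RC^2).

    For every pair (i, j) where mutant i has higher measured fitness than
    mutant j, penalize the model when pred[i] is not larger than pred[j].
    Pairs with equal measured fitness are skipped.
    """
    total, n_pairs = 0.0, 0
    for i in range(len(pred)):
        for j in range(len(pred)):
            if target[i] > target[j]:
                margin = pred[i] - pred[j]
                # softplus(-margin) = log(1 + exp(-margin)), written stably
                total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
                n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy example: three mutants with measured fitness and two sets of scores.
fitness = [0.1, 0.5, 0.9]
well_ordered = [1.0, 2.0, 3.0]   # same order as the measured fitness
mis_ordered = [3.0, 2.0, 1.0]    # reversed order

print(pairwise_rank_loss(well_ordered, fitness))
print(pairwise_rank_loss(mis_ordered, fitness))
```

Because the loss depends only on score differences, adding a constant to every prediction leaves it unchanged, which is one simple sense in which such an objective targets order rather than absolute values.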