Keywords: Protein-Ligand Co-Folding, Kahneman–Tversky Optimization, Reinforcement Learning, Structure Prediction
Abstract: Protein–ligand co-folding has emerged as a powerful alternative for modeling protein–ligand complexes, offering inherent flexibility and removing reliance on experimentally determined crystal structures. Recent AlphaFold3-style conditional diffusion models achieve state-of-the-art accuracy on docking benchmarks but lack mechanisms to encode physical principles and expert experience. We propose applying Kahneman–Tversky Optimization (KTO), a reinforcement learning method that directly integrates human and biochemical preference signals, to conditional diffusion-based co-folding models. The resulting approach, AF3-KTO, aligns naturally with binary docking feedback and with the iterative, conditional architecture of AlphaFold3-style models, eliminating the need for a separate reward network and minimizing computational overhead. Extensive evaluations on multiple benchmarks show that KTO consistently improves binding-pose accuracy and physical plausibility, even when the preference data are imbalanced.
Submission Number: 18