Search, Edit, and Fold: LLM-Guided MSA Optimization for Protein Conformation Prediction
Keywords: Protein Conformation Prediction, Long-Context Reasoning, Multiple Sequence Alignment, Heuristic Search
TL;DR: MSA-Evolver enables LLMs to iteratively edit and optimize MSAs, substantially improving alternative protein conformation prediction under limited folding budgets.
Abstract: Accurately identifying alternative protein conformations remains a fundamental challenge, particularly when functionally relevant states are encoded by sparse evolutionary signals within large multiple sequence alignments (MSAs). In this work, we formulate protein conformation prediction as a combinatorial search problem in MSA space, shifting the focus from structure divergence to evolutionary information discovery. We introduce **MSA-Evolver**, an optimization framework that enables LLMs to directly manipulate and iteratively explore MSAs. With a unified action space and a feedback-guided multi-step reasoning strategy, our framework efficiently identifies informative sub-MSAs under limited folding budgets and substantially improves the prediction accuracy of alternative conformations across open-closed, inward-outward, apo-holo, and fold-switching transitions, as well as intrinsically disordered proteins.
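To make the search formulation concrete, the following is a minimal sketch of a budget-limited, feedback-guided search over sub-MSAs. It is an illustration under stated assumptions and not the MSA-Evolver implementation: the names `propose_edit`, `search_sub_msa`, and `toy_score` are hypothetical, a random single-row edit stands in for the LLM's choice of action, and a toy scoring function stands in for running a folding model and scoring the predicted structure.

```python
import random
from typing import Callable, FrozenSet, Tuple

SubMSA = FrozenSet[int]  # indices of the MSA rows kept in the sub-alignment


def propose_edit(current: SubMSA, n_rows: int, rng: random.Random) -> SubMSA:
    """Hypothetical stand-in for the LLM policy: toggle one non-query row."""
    row = rng.randrange(1, n_rows)  # row 0 is the query sequence, never dropped
    return current.symmetric_difference({row})


def search_sub_msa(
    n_rows: int,
    fold_and_score: Callable[[SubMSA], float],
    budget: int,
    seed: int = 0,
) -> Tuple[SubMSA, float, int]:
    """Greedy search over sub-MSAs using at most `budget` scoring calls."""
    if budget < 1 or n_rows < 2:
        raise ValueError("need budget >= 1 and at least two MSA rows")
    rng = random.Random(seed)
    best = frozenset(range(n_rows))      # start from the full alignment
    best_score = fold_and_score(best)    # each call consumes one folding run
    calls = 1
    while calls < budget:
        candidate = propose_edit(best, n_rows, rng)
        score = fold_and_score(candidate)  # feedback for the next step
        calls += 1
        if score > best_score:             # keep only improving edits
            best, best_score = candidate, score
    return best, best_score, calls


# Toy usage: assume (for illustration only) that rows {0, 2, 5} carry the
# signal for the alternative state; the score rewards matching that subset.
INFORMATIVE_ROWS = frozenset({0, 2, 5})


def toy_score(sub_msa: SubMSA) -> float:
    return -float(len(sub_msa.symmetric_difference(INFORMATIVE_ROWS)))


best_rows, best_value, used = search_sub_msa(8, toy_score, budget=20)
```

In this sketch the folding budget is simply the number of calls to the scoring function, so the quality of each proposed edit determines how much of the MSA space can be explored before the budget runs out.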
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 127