Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: Protein Conformational Control; Protein Frustration; Protein Folding; AlphaFold; MSA
Abstract: Protein structure predictors are sensitive to their multiple sequence alignment (MSA) input, making MSA subsampling a viable strategy for recovering alternative conformations. Existing approaches such as AF-Cluster operate in sequence space, which supports broad exploration but provides limited control over which conformational basin is targeted. We introduce **SF-Cluster**, a framework that uses predicted frustration patterns to guide state-directed conformational sampling, with a coverage-aware refinement step to prevent collapse toward dominant states. On fold-switching benchmarks, SF-Cluster improves targeted recovery of alternative conformations over sequence-space baselines, and effective subsets reflect protein-specific frustration geometries rather than global sequence diversity. When the input MSA is structurally single-basin, no frustration-based strategy recovers non-reference conformations, revealing that such subsampling is a focusing mechanism rather than a discovery engine. These results establish a complementary view of MSA subsampling: sequence-space clustering for broad exploration, frustration-pattern sampling for targeted focusing.
Submission Number: 197