Keywords: structure prediction, fold-switching protein, sequential sampling, multiple sequence alignment, Markov random field model, contact map.
Abstract: Protein structure prediction has been revolutionized by AlphaFold, yet a key limitation remains: its inability to characterize the multiple conformations of fold-switching proteins. Current approaches to address this limitation within the AlphaFold framework rely on subsampling the multiple sequence alignment (MSA) input, either through random sampling or clustering, but these methods are statistically inefficient and fail to utilize coevolutionary information between residues. We introduce SMICE, a sequential sampling framework that systematically explores the MSA space by incorporating residue-specific frequencies and coevolutionary patterns inferred via Markov random fields. On a benchmark set of 92 fold-switching proteins, SMICE outperforms existing methods and substantially improves structural diversity.
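
To make the baseline concrete, the following is a minimal sketch of the random MSA subsampling that the abstract contrasts SMICE against, together with the per-column residue frequencies that such methods ignore. It is not the SMICE algorithm: the function names `subsample_msa` and `column_frequencies`, the toy alignment, and the convention that the first row is the query sequence are all assumptions made for illustration.

```python
import random
from collections import Counter


def subsample_msa(msa, n_keep, seed=0):
    """Random-subsampling baseline (illustrative, not SMICE).

    Keeps the query (assumed to be the first row) and draws the
    remaining rows uniformly at random without replacement.
    """
    if not msa:
        raise ValueError("MSA is empty")
    rng = random.Random(seed)
    query, homologs = msa[0], msa[1:]
    k = min(max(n_keep - 1, 0), len(homologs))
    return [query] + rng.sample(homologs, k)


def column_frequencies(msa):
    """Per-column residue frequencies; the gap '-' is counted as a symbol."""
    n = len(msa)
    return [
        {res: count / n for res, count in Counter(col).items()}
        for col in zip(*msa)
    ]


# Hypothetical toy alignment: 5 aligned sequences of length 4.
toy_msa = ["MKVL", "MKIL", "MRVL", "MKV-", "LKVL"]
sub = subsample_msa(toy_msa, n_keep=3, seed=1)
freqs = column_frequencies(toy_msa)
```

Uniform sampling treats every homolog as interchangeable, which is the statistical inefficiency the abstract points to. A frequency-aware or coevolution-aware sampler would instead weight sequences using statistics such as `freqs` or pairwise couplings from a Markov random field.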
Submission Number: 55