Hallucinations in AlphaFold 3 for Intrinsically Disordered Proteins with disorder in Biological Process Residues
Keywords: AlphaFold3, IDR, Hallucination
TL;DR: Limitations of AlphaFold3 in modeling IDRs, the need for refined hallucination metrics beyond pLDDT, and the importance of integrating experimental disorder data to improve prediction reliability
Abstract: Protein structure prediction has advanced significantly with the introduction of
AlphaFold3, a diffusion-based model capable of predicting complex biomolecular
interactions across proteins, nucleic acids, small molecules, and ions. While
AlphaFold3 demonstrates high accuracy on folded proteins, its performance on
intrinsically disordered proteins (IDPs), which comprise 30–40 percent of the human
proteome and play critical roles in transcription, signaling, and disease, remains
less explored. This study evaluated AlphaFold3’s predictions of IDPs, with a focus
on intrinsically disordered regions (IDRs), using 72 proteins curated from the
DisProt database. Predictions were generated across multiple random seeds and
ensemble outputs, and residue-level pLDDT scores were compared with experimental
disorder annotations. Our analysis reveals that 32 percent of residues
are misaligned with DisProt, with 22 percent representing hallucinations where
AlphaFold3 incorrectly predicts order in disordered regions or vice versa.
Additionally, 10 percent of residues exhibited context-driven misalignment, suggesting
that AlphaFold3 implicitly incorporates stable structural assumptions. Importantly,
18 percent of residues associated with biological processes showed hallucinations,
raising concerns about downstream implications in drug discovery and disease
research. These findings highlight the limitations of AlphaFold3 in modeling IDRs,
the need for refined hallucination metrics beyond pLDDT, and the importance
of integrating experimental disorder data to improve prediction reliability.
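The residue-level comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `mismatch_summary`, the pLDDT cutoff of 70 used to call a residue "ordered", and the boolean encoding of the DisProt annotations are all assumptions made for illustration. The abstract does not state the threshold used, and the sketch does not attempt to separate hallucinations from context-driven misalignment.

```python
from typing import Dict, Sequence

# Assumed cutoff: residues with pLDDT at or above this value are treated as
# predicted-ordered. The abstract does not specify the threshold actually used.
PLDDT_ORDER_THRESHOLD = 70.0


def mismatch_summary(
    plddt: Sequence[float],
    disordered: Sequence[bool],
    threshold: float = PLDDT_ORDER_THRESHOLD,
) -> Dict[str, float]:
    """Compare per-residue pLDDT with experimental disorder labels.

    plddt:      one confidence score per residue (0 to 100).
    disordered: True where the residue is annotated as disordered.
    Returns the fraction of residues in each mismatch class.
    """
    if len(plddt) != len(disordered):
        raise ValueError("plddt and disordered must have the same length")
    if not plddt:
        raise ValueError("at least one residue is required")

    n = len(plddt)
    # Annotated disordered, but predicted with high confidence (ordered).
    order_in_disorder = sum(
        1 for score, dis in zip(plddt, disordered) if dis and score >= threshold
    )
    # Annotated ordered, but predicted with low confidence (disordered).
    disorder_in_order = sum(
        1 for score, dis in zip(plddt, disordered) if not dis and score < threshold
    )
    return {
        "order_in_disorder": order_in_disorder / n,
        "disorder_in_order": disorder_in_order / n,
        "total_mismatch": (order_in_disorder + disorder_in_order) / n,
    }


# Toy input with four residues; the values are illustrative, not study data.
summary = mismatch_summary([90.0, 40.0, 85.0, 30.0], [True, True, False, False])
print(summary)
```

With predictions from several random seeds, one option would be to average the per-residue pLDDT across seeds before calling the function, though whether the study aggregated scores this way is not stated.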
Submission Number: 55