Hallucinations in AlphaFold 3 for Intrinsically Disordered Proteins with disorder in Biological Process Residues

NeurIPS 2025 Workshop CauScien Submission 55 Authors

01 Sept 2025 (modified: 18 Oct 2025). Submitted to NeurIPS 2025 Workshop CauScien. License: CC BY 4.0
Keywords: AlphaFold3, IDR, Hallucination
TL;DR: Limitations of AlphaFold3 in modeling IDRs, the need for refined hallucination metrics beyond pLDDT, and the importance of integrating experimental disorder data to improve prediction reliability
Abstract: Protein structure prediction has advanced significantly with the introduction of AlphaFold3, a diffusion-based model capable of predicting complex biomolecular interactions across proteins, nucleic acids, small molecules, and ions. While AlphaFold3 demonstrates high accuracy on folded proteins, its performance on intrinsically disordered proteins (IDPs), which comprise 30–40 percent of the human proteome and play critical roles in transcription, signaling, and disease, remains less explored. This study evaluated AlphaFold3’s predictions of IDPs with a focus on intrinsically disordered regions (IDRs), using 72 proteins curated from the DisProt database. Predictions were generated across multiple random seeds and ensemble outputs, and residue-level pLDDT scores were compared with experimental disorder annotations. Our analysis reveals that 32 percent of residues are misaligned with DisProt: 22 percent represent hallucinations, where AlphaFold3 incorrectly predicts order in disordered regions or vice versa, and the remaining 10 percent exhibit context-driven misalignment, suggesting that AlphaFold3 implicitly incorporates stable structural assumptions. Importantly, 18 percent of residues associated with biological processes showed hallucinations, raising concerns about downstream implications in drug discovery and disease research. These findings highlight the limitations of AlphaFold3 in modeling IDRs, the need for refined hallucination metrics beyond pLDDT, and the importance of integrating experimental disorder data to improve prediction reliability.
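The residue-level comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `misalignment_summary`, the input layout, and the pLDDT cutoff of 70 are all assumptions made for illustration, since the abstract does not state how pLDDT scores were turned into order/disorder calls.

```python
from typing import Dict, Sequence


def misalignment_summary(
    plddt: Sequence[float],
    disordered: Sequence[bool],
    threshold: float = 70.0,  # assumed cutoff; not taken from the paper
) -> Dict[str, float]:
    """Compare pLDDT-based order/disorder calls with experimental annotations.

    plddt[i] is the predicted confidence for residue i, and disordered[i] is
    True when the residue is annotated as disordered (e.g. in DisProt).
    A residue is called "disordered" by the model when its pLDDT is below
    the threshold.
    """
    if len(plddt) != len(disordered):
        raise ValueError("plddt and disordered must have the same length")
    if len(plddt) == 0:
        raise ValueError("at least one residue is required")

    false_order = 0     # annotated disordered, but predicted ordered
    false_disorder = 0  # annotated ordered, but predicted disordered
    for score, is_disordered in zip(plddt, disordered):
        predicted_disordered = score < threshold
        if is_disordered and not predicted_disordered:
            false_order += 1
        elif predicted_disordered and not is_disordered:
            false_disorder += 1

    return {
        "false_order": false_order,
        "false_disorder": false_disorder,
        "misaligned_fraction": (false_order + false_disorder) / len(plddt),
    }


# Toy example with made-up values for a five-residue protein.
summary = misalignment_summary(
    plddt=[90.0, 85.0, 40.0, 30.0, 75.0],
    disordered=[False, True, True, False, True],
)
print(summary)
```

Reporting the two error directions separately reflects the abstract's wording that order may be predicted in disordered regions "or vice versa"; aggregating over seeds and ensemble outputs would need a further step that is not sketched here.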
Submission Number: 55