Grounding the Ungrounded: A Spectral-Graph Framework for Quantifying Hallucinations in Multimodal LLMs
Keywords: Multimodal large language models, Hallucination detection, Hallucination quantification, Energy-based models, Spectral graph theory, Hypergraph Laplacian, Graph signal processing, Diffusion kernel (heat kernel), Rayleigh–Ritz bounds, KL divergence calibration, Temperature scheduling, Semantic distortion, Cross-modal alignment, RKHS / kernel methods
TL;DR: We propose an energy-based, temperature-controlled spectral hypergraph framework for multimodal LLMs that quantifies hallucinations, yields KL-calibrated Rayleigh–Ritz bounds, and proves diffusion-time decay.
Abstract: Hallucinations in LLMs—especially in multimodal settings—undermine reliability. We present a rigorous information-geometric framework, grounded in diffusion dynamics, for quantifying hallucinations in MLLMs: model outputs are embedded via spectral decompositions of multimodal graph Laplacians, and their gaps to a truth manifold define a semantic distortion metric. We derive Courant–Fischer bounds on a temperature-dependent hallucination profile and use RKHS eigenmodes to obtain modality-aware, interpretable measures that track how hallucination evolves across prompts and over time. This reframes hallucination as a quantifiable, bounded quantity, providing a principled basis for evaluation and mitigation.
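To make the abstract's pipeline concrete, here is a minimal sketch of one plausible instantiation: build an affinity graph over output and reference ("truth") embeddings, take the spectral decomposition of its normalized Laplacian, form the heat (diffusion) kernel at a temperature-like time parameter, and score each output by its diffusion distance to the truth set. The function name, parameters, and graph construction are illustrative assumptions, not the paper's exact method.

```python
import numpy as np
from scipy.spatial.distance import cdist

def heat_kernel_distortion(output_embs, truth_embs, t=1.0, sigma=1.0):
    """Hypothetical sketch: diffusion-based distortion between model-output
    embeddings and grounded ('truth') embeddings via a graph heat kernel.
    Illustrative only; not the paper's exact construction."""
    # Stack all nodes: model outputs followed by truth references.
    X = np.vstack([output_embs, truth_embs])
    n_out = len(output_embs)

    # Gaussian affinity graph over the joint embedding set.
    W = np.exp(-cdist(X, X, "sqeuclidean") / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
    L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt

    # Spectral decomposition and heat kernel H_t = U exp(-t * Lambda) U^T.
    lam, U = np.linalg.eigh(L)
    H = U @ np.diag(np.exp(-t * lam)) @ U.T

    # Diffusion distance from each output node to the nearest truth node,
    # measured between rows of the heat kernel.
    dists = cdist(H[:n_out], H[n_out:], "euclidean")
    return dists.min(axis=1)  # one distortion score per output
```

In this sketch the diffusion time `t` plays the role of the temperature parameter: larger `t` smooths over fine-grained structure, so distortion scores decay with diffusion time, which is the qualitative behavior the abstract's decay result describes.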
Supplementary Material: pdf
Primary Area: foundation or frontier models, including LLMs
Submission Number: 21788