Abstract: Multimodal Large Language Models (MLLMs), Large Vision-Language Models (LVLMs), and Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, including text-based reasoning and multimodal content generation. However, these models frequently generate hallucinations (factually incorrect or misleading content) that pose significant challenges, particularly in high-stakes domains such as healthcare, law, and finance. This tutorial provides a comprehensive exploration of hallucinations in MLLMs, LVLMs, and LLMs, examining their causes, detection methods, and mitigation strategies. We discuss different types of hallucination evaluation and benchmarking and explore state-of-the-art techniques for hallucination mitigation.
External IDs:dblp:conf/mir/JingZD25