Keywords: Deepfake, Explainability
Abstract: The rapid advancement of generative AI has fundamentally reshaped how visual content is produced, circulated, and consumed. Recent diffusion and vision-language models (VLMs) can now fabricate highly persuasive, photorealistic deepfakes that extend far beyond identity alterations, enabling manipulations of human actions, intent, object interactions, and scene-level semantics. Over the next 3–5 years, these high-level, contextually rich fabrications are expected to become increasingly prevalent across social media, news ecosystems, and interactive multimedia platforms.
As deepfakes evolve from simple facial swaps to rich, context-aware manipulations driven by large-scale generative models, the core challenge facing the multimedia community is no longer detection alone but understanding why an image is identified as manipulated. Traditional detectors often act as black boxes, flagging inconsistencies without offering human-interpretable reasoning. This lack of transparency limits trust, hinders deployment in high-stakes environments, and makes it difficult for researchers and practitioners to diagnose model failures.
The Deepfake Explainability Challenge addresses this gap by shifting the focus from mere classification to interpretable, evidence-driven deepfake understanding. Instead of asking models simply to detect or localise manipulated regions, the challenge requires participants to generate meaningful, human-understandable explanations that pinpoint why particular pixels, regions, or semantic attributes indicate manipulation.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 16