Keywords: neurosymbolic AI, visual question answering, answer-set programming, interpretability, GQA
TL;DR: A zero-shot neurosymbolic VQA framework using Answer-Set Programming for interpretable reasoning and error analysis.
Track: Neurosymbolic Methods for Trustworthy and Interpretable AI
Abstract: Visual Question Answering (VQA), the task of answering natural language questions about images, remains a challenge for AI systems. To enhance adaptability and reduce training overhead, we address VQA in a zero-shot setting by leveraging pre-trained neural modules without additional fine-tuning. Our hybrid neurosymbolic framework, whose capabilities are demonstrated on the challenging GQA dataset, integrates neural and symbolic components through logic-based reasoning via Answer-Set Programming. Specifically, our pipeline employs large language models for semantic parsing of input questions, followed by the generation of a scene graph that captures the relevant visual content. Interpretable rules then operate on the symbolic representations of both the question and the scene graph to derive an answer. A key advantage of our framework is that it provides full transparency into the reasoning process. Using an existing explanation tool, we illustrate how our method fosters trust by making decisions interpretable, and how it facilitates error analysis when predictions are incorrect. Beyond explaining its own reasoning, our framework can also explain answers from more opaque models by integrating their answers into our system, enabling broader interpretability in VQA.
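To make the pipeline concrete, here is a minimal sketch (not the paper's implementation; see the linked repository for that) of the symbolic reasoning step: a scene graph encoded as facts, and a rule, written here in plain Python rather than Answer-Set Programming, that derives an answer from them. All facts, names, and the query are hypothetical.

```python
# Hypothetical symbolic scene graph, as a neural module might emit it:
# objects map region ids to labels; relations are (subject, predicate, object) triples.
objects = {("o1", "cat"), ("o2", "mat")}
relations = {("o1", "on", "o2")}

def answer_what_is_on(target_label, objects, relations):
    """Rule sketch, roughly: answer(N) :- rel(X, on, Y), obj(Y, target), obj(X, N)."""
    label_of = dict(objects)  # region id -> label
    for subj, pred, obj in relations:
        if pred == "on" and label_of.get(obj) == target_label:
            return label_of[subj]
    return None

# Hypothetical parsed question: "What is on the mat?"
print(answer_what_is_on("mat", objects, relations))  # -> cat
```

In the actual framework this derivation is performed by an ASP solver over the question's logical form, which is what makes each answer traceable to the rules and facts that produced it.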
Paper Type: Long Paper
Software: https://github.com/pudumagico/nesy25
Submission Number: 34