Explainable Visual Question Answering via Hybrid Neural-Logical Reasoning

Published: 01 Jan 2024, Last Modified: 15 Apr 2025 · IJCNN 2024 · CC BY-SA 4.0
Abstract: Logical reasoning is a major attribute of human intelligence. Humans demonstrate remarkable proficiency in combining information from multiple modalities for logical reasoning. This capability mirrors Visual Question Answering (VQA), a complex task that requires an understanding of both visual elements and language. However, existing methods, which often rely on deep neural networks to learn implicit representations from data, lack the capacity to reason about complex logical problems and to provide interpretable explanations. To address this challenge, we propose a novel approach that combines hybrid neural-symbolic reasoning and explainable AI to enhance the explicit reasoning capabilities of VQA models on complex logical questions. Our approach comprises two main components: the Neural-Logical Reasoning Network (NLRN) with the Comprehensive Logical Composition Attention Mechanism (CLoCA), which trains and applies predefined logical rules within the neural network to effectively reason about complex logical questions in VQA tasks; and an explainable AI module that uses symbolic AI methods to explain the neural network’s decision-making process in VQA. We evaluate our methodology on LoRA, a complex logical-reasoning VQA dataset, and on other recent VQA datasets, demonstrating state-of-the-art performance while providing interpretable and accurate explanations.
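The abstract does not specify how NLRN applies predefined logical rules inside the network, but a common way to make logical composition differentiable is "soft" logic over predicate probabilities in [0, 1] (product t-norm shown here). The sketch below is purely illustrative; the predicate names and scores are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of differentiable logical composition over neural
# predicate scores, as one plausible reading of "applying predefined
# logical rules within the neural network". Not the paper's actual method.

def soft_and(p: float, q: float) -> float:
    # Product t-norm: behaves like AND at the endpoints (1*1=1, p*0=0).
    return p * q

def soft_or(p: float, q: float) -> float:
    # Probabilistic sum: behaves like OR at the endpoints.
    return p + q - p * q

def soft_not(p: float) -> float:
    # Standard negation on [0, 1].
    return 1.0 - p

# Example question: "Is the ball red AND NOT behind the box?"
p_red = 0.9      # hypothetical neural predicate score: ball is red
p_behind = 0.2   # hypothetical neural predicate score: ball is behind the box
answer = soft_and(p_red, soft_not(p_behind))  # 0.9 * 0.8 = 0.72
```

Because each operator is differentiable, gradients flow through the rule structure back into the predicate networks, and the evaluated rule tree doubles as a symbolic trace that an explanation module can present.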