Visual Question Explainable Reasoning on Hypothesis Agent Interaction with Scene

Published: 01 Jan 2026, Last Modified: 05 Nov 2025, Signal Process. 2026, CC BY-SA 4.0
Abstract: Highlights
• We introduce the explainable interaction visual question-answering task. It requires answering questions that involve reasoning about human activity beyond the given image and explaining how the answers are obtained, making the answers more trustworthy.
• We build a VQER task for evaluating human action-related questions outside the image.
• We propose an explaining and answering model that fuses heterogeneous information.