FeVisQA: Free-Form Question Answering over Data Visualizations

Published: 2025, Last Modified: 15 Oct 2025, ICDE 2025, CC BY-SA 4.0
Abstract: Given a massive dataset, data visualization (DV) can efficiently express the insights and summaries behind the raw data through vivid visual representations. To create suitable DVs, users must gain a comprehensive understanding of the raw data and then translate their ideas into DVs by composing a suitable and accurate specification in a declarative visualization language (DVL, e.g., Vega-Lite). A specification is a JSON object defining the properties of a DV, such as the selected data, the transformations, and the visual details. Because of the complicated grammar and details of DVLs, DV has a steep learning curve, even for data analysts. In this paper, we propose a new task named FeVisQA, referring to Free-form Question Answering over data Visualizations. More specifically, given a raw dataset, a related DV (in the form of a specification), and a question, FeVisQA aims to predict a textual answer automatically. As a particular case of the general CodeQA task (i.e., QA over general programming code such as Python and Java), FeVisQA enables people to better comprehend data and its DVs by conducting logical reasoning when answering these questions. Since FeVisQA has not been studied in the literature, we first construct a benchmark dataset containing 152 datasets, 14,406 DVs, and 83,890 QA pairs. To tackle this new task, we design a novel neural network named FeVisQANet with advanced multi-modal encoder and adaptive decoder structures, and we also design a novel multi-step framework called VisQA for Multi-modal Large Language Models (MLLMs) based on Retrieval-augmented Generation (RAG) technology. Extensive experiments on our constructed datasets validate the rationale and effectiveness of the proposed FeVisQA task and model. While research on QA over text and tables, machine reading comprehension, and CodeQA is developing rapidly, prior work has yet to address question answering over DVs.
This study connects two important subareas: QA from the natural language processing area and DV from the data engineering area. We hope this new dataset and model can serve as a helpful benchmark that benefits the development of both fields.
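To make the task input and output concrete, the following is a minimal illustrative sketch of what one (dataset, specification, question, answer) instance might look like. The data rows, field names, question wording, and the `aggregate` helper are all hypothetical and are not taken from the FeVisQA benchmark; the answer is computed by a hand-written rule here, whereas in the task it would be predicted by a model.

```python
import json
from collections import defaultdict

# Hypothetical raw data; not taken from the FeVisQA benchmark.
rows = [
    {"region": "North", "sales": 120},
    {"region": "South", "sales": 80},
    {"region": "North", "sales": 30},
    {"region": "East", "sales": 95},
]

# A Vega-Lite style specification: a bar chart of total sales per region.
spec = {
    "mark": "bar",
    "data": {"values": rows},
    "encoding": {
        "x": {"field": "region", "type": "nominal"},
        "y": {"field": "sales", "type": "quantitative", "aggregate": "sum"},
    },
}


def aggregate(spec):
    """Sum the y-channel field, grouped by the x-channel field (hypothetical helper)."""
    x_field = spec["encoding"]["x"]["field"]
    y_field = spec["encoding"]["y"]["field"]
    totals = defaultdict(int)
    for row in spec["data"]["values"]:
        totals[row[x_field]] += row[y_field]
    return dict(totals)


# One illustrative instance: the specification is serialized as JSON text,
# paired with a free-form question and a textual answer.
instance = {
    "spec": json.dumps(spec),
    "question": "Which region has the tallest bar?",
}
totals = aggregate(spec)
instance["answer"] = max(totals, key=totals.get)
print(instance["question"], "->", instance["answer"])
```

Answering even this simple question requires reading the encoding, applying the declared aggregation to the data, and comparing the results, which is the kind of reasoning over a specification that the abstract describes.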