Perceptual-IQ: Visual Commonsense Reasoning about Perceptual Imagination

Published: 01 Jan 2022, Last Modified: 21 May 2025, Venue: IEEE Big Data 2022, License: CC BY-SA 4.0
Abstract: In this paper, we present a new dataset, Perceptual Imagination: Question-Answering (Perceptual-IQ), to evaluate the commonsense reasoning ability of visual systems when they are confronted with perceptual changes. In our dataset, a machine is given a question that describes a perceptual change to an image and must predict the human response to that change. Perceptual-IQ consists of 3.7K manually annotated QA pairs over 1.6K curated images and covers various types of perceptual changes. By evaluating vision-language models on Perceptual-IQ, we identify a gap of about 25% between model and human performance.