On the role of question encoder sequence model in robust visual question answering

Gouthaman KV, Anurag Mittal

Published: 2022, Last Modified: 11 Oct 2025. Pattern Recognit. 2022. CC BY-SA 4.0

Abstract: Highlights
• The question-encoder sequence model plays a significant role in overfitting VQA models to the language biases of the training set, which reduces performance on Out-of-Distribution test sets.
• A comprehensive study of how existing RNN-based and Transformer-based question encoders affect Out-of-Distribution performance in VQA.
• Proposal of a novel question encoder, GAT-QE, for VQA that shows better resilience to language biases and improves Out-of-Distribution performance even without additional bias-mitigation approaches.