Abstract: Highlights
• VQA models often confidently give incorrect answers to irrelevant questions.
• We enhance model robustness at test-time through multi-modal semantic augmentation.
• Proposed CMA creates varied inputs for models and merges predictions for stability.
• CMA variants improve VQA reliability and performance in ambiguous environments.
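The idea of creating varied inputs and merging the resulting predictions can be illustrated with a generic test-time augmentation sketch. This is not the authors' CMA implementation: the highlights do not specify how variants are generated or how predictions are merged, so the plain averaging used here, along with the names `merge_predictions`, `predict_with_augmentation`, and the `Model` callable type, are assumptions made purely for illustration.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical types: a model maps (image, question) to answer probabilities.
Distribution = Dict[str, float]
Model = Callable[[str, str], Distribution]


def merge_predictions(distributions: Sequence[Distribution]) -> Distribution:
    """Average answer probabilities over all inputs (assumed merge rule)."""
    merged: Distribution = {}
    for dist in distributions:
        for answer, prob in dist.items():
            merged[answer] = merged.get(answer, 0.0) + prob / len(distributions)
    return merged


def predict_with_augmentation(
    model: Model,
    image: str,
    question: str,
    image_variants: Sequence[str],
    question_variants: Sequence[str],
) -> Tuple[str, float]:
    """Query the model on the original input and each variant, then merge."""
    # The original pair is always included, so the input list is never empty.
    inputs = [(image, question)]
    inputs += [(img, question) for img in image_variants]
    inputs += [(image, q) for q in question_variants]

    merged = merge_predictions([model(img, q) for img, q in inputs])
    best = max(merged, key=merged.get)
    return best, merged[best]
```

In this sketch the merged confidence falls when the variants disagree, which is one plausible way for an unstable prediction to show up as lower confidence. How the variants themselves are produced (for example image transformations or question paraphrases) is left to the caller.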