Compositional Generalization in Neuro-Symbolic Visual Question Answering

Published: 16 Jun 2023, Last Modified: 21 Jun 2023 · IJCAI 2023 Workshop KBCG Poster
Keywords: neuro-symbolic, compositional generalization, visual question answering, mathematical reasoning
TL;DR: We create compositional generalization splits for visual question answering and evaluate one neuro-symbolic architecture with improvements and two neural baselines.
Abstract: Compositional generalization is a key challenge in artificial intelligence. This paper investigates compositional generalization capabilities in multimodal mathematical reasoning problems. We introduce compositional generalization splits for CLEVR-Math covering reasoning-hop and attribute generalization, testing both systematicity and productivity. We evaluate the NS-VQA architecture and compare it to two neural baselines, ViLT and CLIP. Our results show that none of the models generalize to reasoning chains longer than those seen during training, while all show similar patterns on fewer hops. On our compositional generalization split, ViLT and the CLIP-based model perform better than NS-VQA on the objects held out during training; however, all models suffer a significant drop in performance. For length generalization, we propose that explicitly learning recursive definitions can be important for compositional generalization. We discuss how knowledge-based curriculum learning can help future architectures achieve such capabilities.