Abstract: Most existing models for abstract visual reasoning perform poorly in compositional visual reasoning (CVR), due to the complex nature of compositional rules and the difficulty of distinguishing tiny rule differences between outliers and normal images. To tackle these challenges, we propose a Dual-Branch Compositional Reasoning (DBCR) model that exploits both intra-cluster relations among the cluster of normal images and extra-cluster relations between normal images and outliers. Specifically, we design one branch of Intra-Cluster Regression Reasoning Blocks (ICR2Bs) to encapsulate common relations among normal images through hierarchical regression reasoning, and another branch of Contrastive Attention Reasoning Blocks (CARBs) to exploit extra-cluster differences between normal images and outliers through self-attention. Simultaneously minimizing the regression errors in ICR2Bs and maximizing the extra-cluster differences in CARBs helps identify the correct cluster of normal images. Experimental results on two CVR datasets show that the proposed DBCR consistently outperforms state-of-the-art models. The code is available at https://github.com/He1mont/DBCR.
External IDs:dblp:conf/icassp/LiSRBZ025