Learning Perceptual Inference by Contrasting

Chi Zhang, Baoxiong Jia, Feng Gao, Yixin Zhu, Hongjing Lu, Song-Chun Zhu

06 Sept 2019 (modified: 05 May 2023), NeurIPS 2019
Abstract: "Thinking in pictures", i.e., spatial-temporal reasoning, is widely believed to be a significant ability for humans to perform logical induction and a crucial factor in the intellectual history of technology development. Modern Artificial Intelligence (AI), fueled by massive datasets, deeper models, and mighty computation, has reached a stage where (super-)human-level performance is observed on certain specific tasks. However, AI's current ability to "think in pictures" still lags far behind. In this work, we study how to improve machines' reasoning ability on one challenging task of this kind: Raven's Progressive Matrices (RPM). Specifically, we propose to borrow the idea of "contrast effects" from the fields of psychology, cognition, and education to design and train a permutation-invariant model. Inspired by cognitive studies, we further equip our model with a simple inference module that is jointly trained with the perception backbone. Combining all these elements, we propose the Contrastive Perceptual Inference network (CoPINet) and empirically demonstrate that CoPINet sets a new state of the art for permutation-invariant models on two major datasets.
Code Link: http://wellyzhang.github.io/project/copinet.html
CMT Num: 634
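
To make the contrast idea in the abstract concrete, below is a minimal, hypothetical sketch of a permutation-invariant contrast operation over answer-candidate embeddings. It is not the authors' implementation (see the code link above for that); the module name ContrastModule, the tensor shapes, and the learned transform of the aggregated summary are illustrative assumptions.

import torch
import torch.nn as nn

class ContrastModule(nn.Module):
    """Illustrative contrast operation: each answer-candidate embedding is
    contrasted against an aggregate of all candidates, so per-candidate
    features come to encode how each choice differs from the rest.
    Because the aggregate is a sum, the computation does not depend on
    the order in which candidates are presented."""

    def __init__(self, dim: int):
        super().__init__()
        # Hypothetical learned transform applied to the aggregated summary.
        self.transform = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, candidates: torch.Tensor) -> torch.Tensor:
        # candidates: (batch, num_choices, dim)
        summary = candidates.sum(dim=1, keepdim=True)   # order-invariant aggregate
        return candidates - self.transform(summary)     # emphasize what differs

if __name__ == "__main__":
    x = torch.randn(2, 8, 64)        # e.g., 8 RPM answer candidates, 64-d embeddings
    out = ContrastModule(64)(x)
    print(out.shape)                 # torch.Size([2, 8, 64])

Because the aggregate is symmetric in the candidates, shuffling the answer set merely shuffles the output rows, so any scores derived from these contrasted features do not depend on candidate order.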