Low Dimensional Visual Attributes: An Interpretable Image Encoding

ICPR Workshops (3), 2020 (modified: 05 Nov 2022)
Abstract: The black-box nature of deep convolutional networks (DCNs) makes many computer vision models hard to interpret. In this paper, we present an interpretable encoding for images that represents objects as a composition of parts and the parts themselves as mixtures of learned prototypes. We find that this representation is well suited to low-label image recognition problems such as few-shot learning (FSL), zero-shot learning (ZSL), and domain adaptation (DA). Our image encoding model, paired with simple task predictors, performs favorably against state-of-the-art approaches on each of these tasks. Via crowdsourced studies, we also show that this parts-and-prototypes encoding is interpretable to humans and agrees with their visual perception.
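The abstract only sketches the encoding at a high level, so the following is a minimal, hypothetical illustration of the general idea in PyTorch: part descriptors are pooled from a DCN feature map via spatial attention, and each part is then expressed as a soft mixture over learned prototype vectors. All names and hyperparameters here (PartPrototypeEncoder, n_parts, n_prototypes, the attention pooling) are assumptions for illustration, not the paper's actual architecture.

```python
# Hypothetical sketch of a parts-and-prototypes image encoding.
# The real model in the paper may differ substantially.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartPrototypeEncoder(nn.Module):
    def __init__(self, feat_dim=512, n_parts=8, n_prototypes=32):
        super().__init__()
        # 1x1 conv producing one spatial attention map per part.
        self.part_attn = nn.Conv2d(feat_dim, n_parts, kernel_size=1)
        # Learned prototype vectors shared across parts.
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, feat_dim))

    def forward(self, feat_map):
        # feat_map: (B, C, H, W) output of a DCN backbone.
        attn = self.part_attn(feat_map)                    # (B, P, H, W)
        attn = F.softmax(attn.flatten(2), dim=-1)          # normalize over locations
        feats = feat_map.flatten(2)                        # (B, C, H*W)
        # Attention-pool the feature map into one descriptor per part.
        parts = torch.einsum('bph,bch->bpc', attn, feats)  # (B, P, C)
        # Express each part as a soft mixture over the prototypes,
        # yielding a low-dimensional, human-readable code (B, P, K).
        sims = torch.einsum('bpc,kc->bpk', parts, self.prototypes)
        return F.softmax(sims, dim=-1)

# Usage: encode a batch of backbone features into part-prototype codes.
encoder = PartPrototypeEncoder()
codes = encoder(torch.randn(2, 512, 7, 7))  # -> (2, 8, 32)
```

Under this reading, a simple task predictor (e.g., a linear classifier for FSL/ZSL) would operate directly on the flattened mixture weights, which is what would make the representation inspectable by humans.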