Keywords: capsule networks, face verification, siamse networks, few-shot learning, contrastive loss
TL;DR: A pairwise learned capsule network that performs well on face verification tasks given limited labeled data
Abstract: Capsule Networks have shown encouraging results on \textit{defacto} benchmark computer vision datasets such as MNIST, CIFAR and smallNORB. Although, they are yet to be tested on tasks where (1) the entities detected inherently have more complex internal representations and (2) there are very few instances per class to learn from and (3) where point-wise classification is not suitable. Hence, this paper carries out experiments on face verification in both controlled and uncontrolled settings that together address these points. In doing so we introduce \textit{Siamese Capsule Networks}, a new variant that can be used for pairwise learning tasks. We find that the model improves over baselines in the few-shot learning setting, suggesting that capsule networks are efficient at learning discriminative representations when given few samples.
We find that \textit{Siamese Capsule Networks} perform well against strong baselines on both pairwise learning datasets when trained using a contrastive loss with $\ell_2$-normalized capsule encoded pose features, yielding best results in the few-shot learning setting where image pairs in the test set contain unseen subjects.
2 Replies
Loading