Abstract: Learning to compare two objects is essential in applications, especially when labeled data are scarce and imbalanced. As these applications can involve humans and make high-stakes decisions, it is critical to explain the learned models. We study post-hoc explanations of Siamese networks (SN), which are widely used in learning to compare. We characterize the instability of gradient-based explanations caused by the additional compared object in SN, in contrast to architectures with a single input instance. We optimize for global invariance based on unlabeled data using self-learning to promote the stability of local explanations for individual inputs. The invariance leads to constrained optimization problems that can be solved using gradient descent-ascent (GDA), or to KL-divergence-regularized unconstrained optimization problems that can be solved using SGD. We provide convergence proofs for the case where the objective functions are nonconvex due to the Siamese architecture. Results on tabular and graph data from neuroscience and chemical engineering show that our local explanations robustly respect the self-learned invariance while optimizing explanation faithfulness and simplicity. We further demonstrate the convergence of GDA experimentally.