% Making choices is a fundamental aspect of the human experience as well as of other biological and artificial systems. Among the various kinds of choices, the similarity choice task consists of choosing from a choice-set the item that is most similar to a target item. Most models that can predict and explain similarity choices assume independence of irrelevant alternatives (IIA), a simplifying property that allows them to learn from data more easily. While empirical studies indicating the violation of IIA for general choice models abound, this is not the case for similarity choices. This work provides a rigorous treatment of IIA compliance in terms of hypothesis tests in both classical and Bayesian formulations. Results on synthetic data indicate the proposed methodology rejects the null hypothesis (IIA) under sufficient noise. A carefully handcrafted and a randomized dataset with similarity questions over the same universe of items were created for a survey study with real participants. The hypothesis tests on both datasets confirm IIA violations. In addition, fitting our Bayesian perturbation model to that data quantifies the degree of departure from IIA. Last, results on testing for population homogeneity support that survey participants were homogeneous, indicating that interactions within the choice-sets are likely to have led to IIA violations. The findings in this work motivate the development of richer similarity choice models that can better predict and explain real data.


Similarity choice data arise when humans choose among alternatives based on their similarity to a target, \emph{e.g.}, in information retrieval and embedding learning settings. Classical metric-based models of similarity choice assume independence of irrelevant alternatives (IIA), a property that allows for a simpler formulation. While IIA violations have been detected in many discrete choice settings, the similarity choice setting has received scant attention, because the target-dependent nature of the choice complicates IIA testing. We propose two statistical methods to test for IIA: a classical goodness-of-fit test and a Bayesian counterpart based on the framework of Posterior Predictive Checks (PPC). The Bayesian approach, our main technical contribution, quantifies the degree of IIA violation beyond its mere significance. We curate two datasets: one with choice sets designed to elicit IIA violations, and another with randomly generated choice sets from the same item universe. Our tests confirm significant IIA violations on both datasets and, notably, a comparable degree of violation between them. Further, we devise a new PPC test for population homogeneity. Results show that the population is indeed homogeneous, suggesting that the IIA violations are driven by context effects, specifically interactions within the choice sets. These results highlight the need for new similarity choice models that account for such context effects.