Abstract: Efficient learning of a data analysis task strongly depends on the data representation.
Most methods rely on (symmetric) similarity or dissimilarity representations by means
of metric inner products or distances, providing easy access to powerful mathematical
formalisms like kernel or branch-and-bound approaches. Similarities and dissimilarities, however, are often naturally obtained from non-metric proximity measures, which cannot easily be handled by classical learning algorithms. In recent years, major efforts have been undertaken to provide approaches that can either be used directly on such data or that make standard methods applicable to this type of data. We provide a comprehensive survey of the field of learning with non-metric proximities. First, we introduce the
formalism used in non-metric spaces and motivate specific treatments for non-metric
proximity data. Second, we provide a systematization of the various approaches. For each category of approaches, we give a comparative discussion of the individual algorithms and address complexity issues and generalization properties. In a summarizing section, we present a larger experimental study for the majority of the algorithms on standard datasets. We also address the problem of large-scale proximity learning, which is often overlooked in this context but is of major importance for making these methods relevant in practice. The algorithms discussed in this paper are in general applicable to proximity-based clustering, one-class classification, classification, regression, and embedding approaches. In the experimental part, we focus on classification tasks.