Exploiting Geometry for Support Vector Machine IndexingOpen Website

2005 (modified: 05 Nov 2022)SDM 2005Readers: Everyone
Abstract: Support Vector Machines (SVMs) have been adopted by many data-mining and information-retrieval applications for learning a mining or query concept, and then retrieving the “top-k” best matches to the concept. However, when the dataset is large, naively scanning the entire dataset to find the top matches is not scalable. In this work, we propose a kernel indexing strategy to substantially prune the search space and thus improve the performance of top-k queries. Our kernel indexer (KDX) takes advantage of the underlying geometric properties and quickly converges on an approximate set of top-k instances of interest. More importantly, once the kernel (e.g., Gaussian kernel) has been selected and the indexer has been constructed, the indexer can work with different kernel-parameter settings (e.g., γ and δ) without performance compromise. Through theoretical analysis, and empirical studies on a wide variety of datasets, we demonstrate KDX to be very effective.
0 Replies

Loading