A Large Scale Clustering Scheme for Kernel K-Means

Rong Zhang; Alexander I. Rudnicky

Published: 01 Jan 2002, Last Modified: 28 Jan 2025, ICPR (4) 2002, CC BY-SA 4.0

Abstract: Kernel functions can be viewed as non-linear transformations that increase the separability of the input data by mapping them into a new high-dimensional space. Incorporating kernel functions enables the K-Means algorithm to explore the inherent data pattern in the new space. However, previous applications of the kernel K-Means algorithm have been confined to small corpora because of its high computation and storage costs. To overcome these obstacles, we propose a new clustering scheme that changes the clustering order from the sequence of samples to the sequence of kernels and employs a disk-based strategy to control data. Our experiments on handwritten digit recognition show the new clustering scheme to be very efficient for a large corpus, saving more than 90% of the running time.
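For orientation, the following is a minimal sketch of the textbook kernel K-Means iteration that the abstract takes as its starting point. It is not the disk-based scheme proposed in the paper, whose details the abstract does not give. The sketch relies on the standard identity ||phi(x_i) - m_c||^2 = K_ii - (2/|C|) sum_j K_ij + (1/|C|^2) sum_{j,l} K_jl, where the sums run over the members of cluster C. The function names (`rbf_kernel`, `kernel_kmeans`), the RBF kernel, and the `gamma` parameter are illustrative assumptions, not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; the kernel choice here is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_kmeans(points, init_labels, k, kernel=rbf_kernel, max_iter=100):
    """Textbook kernel K-Means over a full in-memory Gram matrix."""
    n = len(points)
    # Full Gram matrix: O(n^2) kernel evaluations and O(n^2) storage,
    # which is the cost that limits this formulation to small corpora.
    gram = [[kernel(points[i], points[j]) for j in range(n)] for i in range(n)]
    labels = list(init_labels)
    for _ in range(max_iter):
        members = [[j for j in range(n) if labels[j] == c] for c in range(k)]
        # Third term of the distance: mean of K(j, l) over all pairs in a cluster.
        within = [
            sum(gram[j][l] for j in m for l in m) / (len(m) ** 2) if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                cross = sum(gram[i][j] for j in m) / len(m)
                dist = gram[i][i] - 2.0 * cross + within[c]
                if dist < best_d:
                    best_c, best_d = c, dist
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = kernel_kmeans(points, init_labels=[0, 1, 0, 1, 0, 1], k=2)
```

Every pass in this sketch touches all n^2 Gram entries, which illustrates the computation and storage costs that the abstract identifies as the obstacle for large corpora.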
