Abstract: Inspired by the well-known notion of a coreset in clustering algorithms, we introduce the definition of the core kernel for multiple kernel clustering (MKC) algorithms. A core kernel allows MKC algorithms to run on smaller-scale base kernel matrices while producing kernel weights close to those obtained from the original full-scale kernel matrices. Specifically, a core kernel is a set of kernel matrices of size $\widetilde{\mathcal{O}}(1/\varepsilon^2)$ such that running MKC algorithms on them achieves a $(1+\varepsilon)$-approximation of the kernel weights. We can then leverage the approximate kernel weights to obtain a theoretically guaranteed large-scale extension of MKC algorithms. In this paper, we propose a core kernel construction method based on singular value decomposition and prove that it satisfies the definition of the core kernel for three mainstream MKC algorithms. Finally, we conduct experiments on several benchmark datasets to verify the correctness of the theoretical results and the efficiency of the proposed method.
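To make the idea concrete, here is a minimal sketch of the workflow the abstract describes: replace full $n \times n$ base kernels with small $s \times s$ surrogates obtained via SVD, and check that a kernel-weighting rule computed on the surrogates approximates the weights computed on the full kernels. The weighting rule below (alignment with the average kernel) and the projection-based construction are illustrative stand-ins, not the paper's actual algorithms; the synthetic low-rank kernels are also assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 200, 20  # n samples; core size s (the paper takes s on the order of 1/eps^2)

# Synthetic PSD base kernels (hypothetical stand-ins for real base kernels).
def make_kernel(scale):
    X = scale * rng.standard_normal((n, 5))
    return X @ X.T

kernels = [make_kernel(c) for c in (0.5, 1.0, 2.0)]

def kernel_weights(Ks):
    # Toy weighting rule (NOT one of the paper's three MKC algorithms):
    # score each base kernel by its alignment with the average kernel,
    # then normalize the scores to sum to one.
    K_avg = sum(Ks) / len(Ks)
    a = np.array([np.trace(K @ K_avg) /
                  (np.linalg.norm(K, "fro") * np.linalg.norm(K_avg, "fro"))
                  for K in Ks])
    return a / a.sum()

def core_kernels(Ks, s):
    # SVD-based reduction (illustrative construction): project every base
    # kernel onto the top-s eigenspace of the average kernel, giving small
    # s x s matrices in place of the n x n originals.
    K_avg = sum(Ks) / len(Ks)
    U, _, _ = np.linalg.svd(K_avg, hermitian=True)
    Us = U[:, :s]
    return [Us.T @ K @ Us for K in Ks]

w_full = kernel_weights(kernels)
w_core = kernel_weights(core_kernels(kernels, s))
print(np.round(w_full, 4))
print(np.round(w_core, 4))  # close to w_full when s captures the kernels' energy
```

Here the toy kernels have low rank, so the top-$s$ eigenspace captures them essentially exactly and the two weight vectors coincide; in general, larger $s$ trades computation for a tighter approximation, which is the trade-off the $(1+\varepsilon)$ guarantee quantifies.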
Lay Summary: We found a way to make a type of machine learning, used to group similar pieces of data, much faster and easier to use on large datasets. Normally, these methods need to process a lot of information, which takes time and computer power. Our idea is to use only a small, smartly chosen sample of the data that still gives nearly the same results as using everything.
We created a new method to pick out these smaller samples using a well-known math tool, and we proved it works well with several popular techniques. We also tested our method on real-world examples and showed that it’s both accurate and efficient.
Primary Area: General Machine Learning->Clustering
Keywords: Multiple Kernel Clustering, Multi-View Clustering
Submission Number: 892