Tree Structure for the Categorical Wasserstein Weisfeiler-Lehman Graph Kernel

TMLR Paper5395 Authors

16 Jul 2025 (modified: 16 Jul 2025) | Under review for TMLR | CC BY 4.0
Abstract: The Wasserstein Weisfeiler-Lehman (WWL) graph kernel is a popular and efficient approach used in various kernel-dependent machine learning frameworks for practical applications with graph data. It incorporates optimal transport geometry into the Weisfeiler-Lehman graph kernel to mitigate the information loss inherent in the aggregation strategies of graph kernels. While the WWL graph kernel demonstrates superior performance in many applications, it has a drawback in its computational complexity, which is at least $\mathcal{O}(n_{1} n_{2})$, where $n_{1}$ and $n_{2}$ denote the numbers of vertices of the two input graphs. This cost hinders the practical applicability of the WWL graph kernel, especially in large-scale settings. In this paper, we propose the Tree Wasserstein Weisfeiler-Lehman (TWWL) graph kernel, which leverages tree structure to scale up the exact computation of the WWL graph kernel for graph data with categorical node labels. In particular, the computational complexity of the TWWL graph kernel is $\mathcal{O}(n_{1} + n_{2})$, which enables its application to large-scale graphs. Numerical experiments demonstrate that the proposed kernel performs favorably compared with baseline kernels, while its computation is several orders of magnitude faster than that of the classic WWL graph kernel, paving the way for applications to large-scale datasets where the WWL kernel is computationally prohibitive.
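To illustrate why a tree structure can bring the cost of an optimal transport distance down to linear time, the following is a minimal sketch of the standard closed form for the 1-Wasserstein distance on a tree metric: the sum over edges of the edge weight times the absolute difference of the two measures' masses in the subtree below that edge. This is a generic illustration and not necessarily the construction used in the paper. The function name `tree_wasserstein`, the parent-array encoding, and the assumption that nodes are indexed so that every parent precedes its children are all hypothetical choices made for this example.

```python
def tree_wasserstein(parent, weight, mu, nu):
    """1-Wasserstein distance between two measures supported on a rooted tree.

    Assumptions (hypothetical encoding for this sketch):
      - node 0 is the root and parent[0] == -1
      - parent[v] < v for every non-root node v (parents come first)
      - weight[v] is the non-negative length of the edge (v, parent[v])
      - mu and nu give the mass on each node and have equal total mass
    """
    n = len(parent)
    # Net mass surplus of mu over nu at each node.
    diff = [mu[v] - nu[v] for v in range(n)]
    total = 0.0
    # Visit children before parents, so that diff[v] holds the net mass
    # of the whole subtree rooted at v when edge (v, parent[v]) is processed.
    for v in range(n - 1, 0, -1):
        total += weight[v] * abs(diff[v])
        diff[parent[v]] += diff[v]
    return total


# Small example: a root with two leaves, unit edge lengths.
parent = [-1, 0, 0]
weight = [0.0, 1.0, 1.0]
d_leaves = tree_wasserstein(parent, weight, [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])
d_spread = tree_wasserstein(parent, weight, [0.0, 0.5, 0.5], [1.0, 0.0, 0.0])
```

Each edge is touched once, so the cost is linear in the number of tree nodes, in contrast to solving a general transport problem over all pairs of support points.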
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Rémi_Flamary1
Submission Number: 5395