Sketching Data Distribution by Rotation

Published: 01 Jan 2024, Last Modified: 10 Feb 2025. IEEE Trans. Knowl. Data Eng. 2024. License: CC BY-SA 4.0
Abstract: Kernel density estimation is a useful method for estimating the probability distribution of data. Achieving efficient kernel density estimation is challenging, especially for large-scale, high-dimensional streaming data. We propose the rotation kernel, a novel kernel function for density estimation. The rotation kernel density can be estimated quickly with a data structure named Rotation Kernel Density Sketch (RKDS). RKDS is a time- and memory-efficient method for kernel density estimation, even over data streams and in distributed systems. RKDS can be used both to estimate density at specific points and to represent the data distribution. We provide a theoretical analysis of the rotation kernel and RKDS. Furthermore, we apply RKDS to outlier detection, concept drift detection, and personalized federated learning. Experiments show that our method improves time efficiency by a factor of up to $3\times 10^{3}$ compared with baselines. RKDS also provides comparable detection precision and better delay on outlier detection and concept drift detection tasks.
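For context on the cost the abstract refers to, the following is a minimal sketch of textbook kernel density estimation with a Gaussian kernel. It is not the rotation kernel or RKDS, neither of which is defined in this abstract. The function name `gaussian_kde`, the `bandwidth` parameter, and the sample data are assumptions chosen for illustration. The point it shows is that each query scans every stored sample, which is the per-query cost a sketch-based estimator aims to avoid.

```python
import math


def gaussian_kde(samples, query, bandwidth):
    """Naive Gaussian KDE at a single query point (illustrative baseline only).

    samples: list of equal-length tuples of floats
    query: tuple of floats with the same dimension as each sample
    bandwidth: positive float, assumed shared across all dimensions
    """
    d = len(query)
    # Normalising constant of an isotropic d-dimensional Gaussian kernel.
    norm = (2.0 * math.pi) ** (d / 2.0) * bandwidth ** d
    total = 0.0
    for x in samples:  # one pass over all samples per query: O(n * d)
        sq_dist = sum((xi - qi) ** 2 for xi, qi in zip(x, query))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(samples) * norm)


# Hypothetical data: three points clustered near the origin and one far away.
samples = [(0.0, 0.0), (0.1, -0.1), (-0.2, 0.1), (3.0, 3.0)]
dense = gaussian_kde(samples, (0.0, 0.0), bandwidth=0.5)
sparse = gaussian_kde(samples, (3.0, 3.0), bandwidth=0.5)
```

Here `dense` comes out larger than `sparse`, since three of the four samples lie near the origin. A low estimated density at a point is the kind of signal an outlier detector can threshold on.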