Abstract: Principal component analysis (PCA) and spectral clustering are representative methods for extracting and interpreting the inherent structure of data. However, if their outputs change significantly when new data points are added, this can cause problems such as instability in downstream tasks or a lack of trust in the findings. To address these problems, we consider online variants of PCA and spectral clustering, and show that a natural subspace-preserving regularizer provides provable approximation and consistency guarantees. Here, an algorithm is said to have high consistency if the change in its output, with respect to an appropriate distance metric, is small when new data points are added. We empirically confirm the superiority of the proposed methods using real-world data.
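As a rough illustration of the consistency notion described in the abstract, the sketch below measures how much the top-k principal subspace moves when one data point is added. It uses ordinary batch PCA, not the regularized online method proposed in the paper. The choice of distance (the Frobenius norm of the difference between the two orthogonal projectors), the helper names `top_k_subspace` and `projection_distance`, and the synthetic data are all assumptions made for illustration; the paper may use a different metric.

```python
import numpy as np


def top_k_subspace(X, k):
    """Return a d x k orthonormal basis of the top-k principal subspace of X's rows."""
    Xc = X - X.mean(axis=0)  # center the data
    # Right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T


def projection_distance(U, V):
    """Frobenius distance between the orthogonal projectors onto span(U) and span(V).

    Assumed metric for this sketch; it does not depend on the choice of basis.
    """
    return np.linalg.norm(U @ U.T - V @ V.T, ord="fro")


# Synthetic data (hypothetical): 200 points in 5 dimensions with decaying scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
x_new = rng.normal(size=(1, 5))  # the newly added data point

k = 2
U_before = top_k_subspace(X, k)
U_after = top_k_subspace(np.vstack([X, x_new]), k)

# A small value means the output changed little, i.e. high consistency
# under this assumed metric.
change = projection_distance(U_before, U_after)
print(f"subspace change after adding one point: {change:.4f}")
```

For rank-k projectors this distance lies between 0 and sqrt(2k), so it gives a bounded scale on which the output change before and after an update can be compared.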
Code Dataset Promise: Yes
Code Dataset Url: https://github.com/sato9hara/consistent-pca-sc/
Signed Copyright Form: pdf
Format Confirmation: I agree that I have read and followed the formatting instructions for the camera ready version.
Submission Number: 416