Keywords: svd, linear-algebra, truncated
TL;DR: Verifying performance of algorithms for updating the truncated SVD of evolving matrices.
Abstract: Many applications (e.g. LSI and recommender systems) involve matrices that are subject to the periodic addition of rows and/or columns. Unfortunately, the re-calculation of the SVD with each update can be prohibitively expensive. Consequently, algorithms that exploit previously known information are critical. Kalantzis et al. (2021) present once such algorithm based on a projection scheme to calculate the truncated SVD of evolving matrices. Their main claim states that their proposed algorithm outperforms previous state-of-the-art approaches in terms of qualitative accuracy (relative errors and scaled residual norms of the singular triplets) and speed. As part of the ML Reproducibility Challenge, we sought to verify this claim by re-implementing the proposed algorithm from scratch and performing the same experiments in Python. As the original paper did not provide any benchmarks, we chose to also compare the results of the algorithm to those of FrequentDirections, a state-of-the-art matrix sketching and streaming algorithm. We found that our implementation of the original experiments was able to closely match the results of the paper with regards to accuracy and runtime. We also performed some additional experiments to briefly investigate the effects of the numbers of batches and the desired rank-$k$ on the accuracy of the methods. Though our accuracy-related metrics suggest that the proposed algorithm far outperforms FrequentDirections, we are not confident that this result holds. We suspect that the metrics used are incompatible with FrequentDirections due to the singular value thresholding step in the FrequentDirections algorithm. As a result, though we were able to reproduce the results presented in Kalantzis et al. (2021), our verdict on the claim that the proposed algorithm outperforms other state-of-the-art algorithms is inconclusive and indicative of further work needing to be done to develop metrics for fair comparison. Our anonymized GitHub repository can be found at https://anonymous.4open.science/r/truncatedSVD-0162/.
Paper Url: https://icml.cc/virtual/2021/spotlight/10160
Paper Venue: ICML 2021
Supplementary Material: zip