Sparse Subspace Clustering with Missing and Corrupted Data

2018 (modified: 13 Nov 2024) | DSW 2018 | Readers: Everyone
Abstract: In many settings, we can accurately model high-dimensional data as lying in a union of subspaces. Subspace clustering is the process of inferring the subspaces and determining which subspace each point belongs to. In this paper we study a robust variant of sparse subspace clustering (SSC) [1]. While SSC is well understood when there is little or no noise, less is known about SSC under significant noise or missing entries. We establish clustering guarantees in the presence of corrupted or missing entries. We give explicit bounds on the amount of additive noise and the number of missing entries the algorithm can tolerate, both in deterministic settings and in a random generative model. Our analysis shows that this method can tolerate up to O(n/d) missing entries per column, instead of the O(n/d^2) shown by previous analyses, where the data lie in d-dimensional subspaces of an n-dimensional ambient space. Moreover, our method and analysis work by simply filling in the missing entries with zeros and do not need to know the locations of the missing entries.
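
To make the zero-filling idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Lasso-style self-expression step, solved here with plain ISTA as a stand-in solver, and builds the usual SSC affinity matrix from the coefficients. The function names, the regularization weight `lam`, the iteration count, and the synthetic data sizes are all illustrative assumptions. The mask is used only to simulate missing data; the clustering step itself never sees it.

```python
import numpy as np

def zero_fill(X, mask):
    """Replace unobserved entries (mask == False) with zeros."""
    return np.where(mask, X, 0.0)

def lasso_self_expression(X, lam=0.1, n_iter=300):
    """Approximately solve
        min_C 0.5 * ||X - X C||_F^2 + lam * ||C||_1   s.t. diag(C) = 0
    with proximal gradient descent (ISTA). Columns of X are data points."""
    n_pts = X.shape[1]
    G = X.T @ X                          # Gram matrix of the data points
    step = 1.0 / np.linalg.norm(G, 2)    # 1 / Lipschitz constant of the gradient
    C = np.zeros((n_pts, n_pts))
    for _ in range(n_iter):
        C = C - step * (G @ C - G)       # gradient step on the quadratic term
        C = np.sign(C) * np.maximum(np.abs(C) - step * lam, 0.0)  # soft-threshold
        np.fill_diagonal(C, 0.0)         # a point may not represent itself
    return C

# Synthetic data (assumed sizes): two d-dimensional subspaces in R^n.
rng = np.random.default_rng(0)
n, d, per = 30, 2, 20
blocks = []
for _ in range(2):
    U, _ = np.linalg.qr(rng.standard_normal((n, d)))   # orthonormal basis
    blocks.append(U @ rng.standard_normal((d, per)))
X = np.hstack(blocks)
X /= np.linalg.norm(X, axis=0)           # unit-norm columns

mask = rng.random(X.shape) > 0.1         # roughly 10% of entries go missing
X0 = zero_fill(X, mask)                  # the algorithm only ever sees X0

C = lasso_self_expression(X0)
W = np.abs(C) + np.abs(C).T              # symmetric affinity matrix

# Fraction of affinity mass that links points from the same subspace.
labels = np.repeat([0, 1], per)
same = labels[:, None] == labels[None, :]
within = W[same].sum() / W.sum()
```

In a full pipeline, `W` would be passed to spectral clustering to recover the cluster labels; the `within` ratio is only a quick diagnostic of how well the affinity respects the true subspaces.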