Abstract: Robustness against adversarial attacks has recently been at the forefront of algorithmic design for
machine learning tasks. In the adversarial streaming model, an adversary gives an algorithm a sequence
of adaptively chosen updates $u_1, \ldots , u_n$ as a data stream. The goal of the algorithm is to compute or
approximate some predetermined function for every prefix of the adversarial stream, but the adversary
may generate future updates based on previous outputs of the algorithm. In particular, the adversary may
gradually learn the random bits internally used by an algorithm to manipulate dependencies in the input.
This is especially problematic as many important problems in the streaming model require randomized
algorithms, as they are known to not admit any deterministic algorithms that use sublinear space. In
this paper, we introduce adversarially robust streaming algorithms for central machine learning and
algorithmic tasks, such as regression and clustering, as well as their more general counterparts, subspace
embedding, low-rank approximation, and coreset construction. For regression and other numerical linear
algebra related tasks, we consider the row arrival streaming model. Our results are based on a simple, but
powerful, observation that many importance sampling-based algorithms give rise to adversarial robustness
which is in contrast to sketching based algorithms, which are very prevalent in the streaming literature but
suffer from adversarial attacks. In addition, we show that the well-known merge and reduce paradigm in
streaming is adversarially robust. Since the merge and reduce paradigm allows coreset constructions in the
streaming setting, we thus obtain robust algorithms for k-means, k-median, k-center, Bregman clustering,
projective clustering, principal component analysis (PCA) and non-negative matrix factorization. To the
best of our knowledge, these are the first adversarially robust results for these problems yet require no
new algorithmic implementations. Finally, we empirically confirm the robustness of our algorithms on
various adversarial attacks and demonstrate that by contrast, some common existing algorithms are not
robust.
0 Replies
Loading