On InstaHide, Phase Retrieval, and Sparse Matrix Factorization

Sitan Chen; Xiaoxiao Li; Zhao Song; Danyang Zhuo

Published: 12 Jan 2021, Last Modified: 05 May 2023, ICLR 2021 Poster, Readers: Everyone

Keywords: Distributed learning, InstaHide, phase retrieval, matrix factorization

Abstract: In this work, we examine the security of InstaHide, a scheme recently proposed by Huang et al. (2020) for preserving the security of private datasets in the context of distributed learning. To generate a synthetic training example to be shared among the distributed learners, InstaHide takes a convex combination of private feature vectors and randomly flips the sign of each entry of the resulting vector with probability 1/2. A salient question is whether this scheme is secure in any provable sense, perhaps under a plausible complexity-theoretic assumption. The answer to this turns out to be quite subtle and closely related to the average-case complexity of a multi-task, missing-data version of the classic problem of phase retrieval that is interesting in its own right. Motivated by this connection, under the standard distributional assumption that the public/private feature vectors are isotropic Gaussian, we design an algorithm that can actually recover a private vector using only the public vectors and a sequence of synthetic vectors generated by InstaHide.
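
The synthetic-example generation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' or InstaHide's reference code: the function name `instahide_synthetic`, the parameters `k_priv` and `k_pub`, the choice to mix public vectors in with private ones, and the way the convex weights are sampled (normalized exponential draws) are all assumptions made for illustration.

```python
import random

def instahide_synthetic(private, public, k_priv, k_pub, rng):
    """Hypothetical helper: mix k_priv private and k_pub public vectors
    with random convex weights, then flip each entry's sign w.p. 1/2."""
    d = len(private[0])
    # Choose which vectors to mix (assumption: uniformly, without replacement).
    chosen = rng.sample(private, k_priv) + rng.sample(public, k_pub)
    # Convex weights: nonnegative and summing to 1 (sampling rule is an assumption).
    raw = [rng.expovariate(1.0) for _ in chosen]
    total = sum(raw)
    weights = [r / total for r in raw]
    # Entrywise convex combination of the chosen vectors.
    mixed = [sum(w * x[j] for w, x in zip(weights, chosen)) for j in range(d)]
    # Independent random sign flip per entry, each sign with probability 1/2.
    signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    synthetic = [s * m for s, m in zip(signs, mixed)]
    return synthetic, mixed, weights

# Usage: isotropic Gaussian feature vectors, as in the abstract's assumption.
rng = random.Random(0)
d = 8
private = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(5)]
public = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(5)]
synthetic, mixed, weights = instahide_synthetic(private, public, 2, 2, rng)
```

Because the signs are random, a synthetic vector retains only the entrywise absolute values of the mixture, which is one way to see why the security question resembles phase retrieval.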

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

One-sentence Summary: We examine the security of InstaHide, a recently proposed framework for private distributed learning, through the lens of phase retrieval and give an attack when the underlying datasets are Gaussian.

Supplementary Material: zip
