Keywords: privacy, client-side scanning, adversarial attacks
TL;DR: We show that proposed client-side scanning systems are not robust to black-box adversarial attacks that aim to evade detection.
Abstract: End-to-end encryption (E2EE) in messaging platforms enables people to communicate securely and privately with one another. Its widespread adoption has, however, raised concerns that illegal content might now be shared undetected. Client-side scanning based on perceptual hashing has recently been proposed by governments and researchers to detect illegal content in E2EE communications. We propose the first framework to evaluate the robustness of perceptual hashing-based client-side scanning to detection avoidance attacks and show that current systems are not robust. We propose three adversarial attacks against perceptual hashing algorithms: a general black-box attack and two white-box attacks for discrete cosine transform-based algorithms. In a large-scale evaluation, we show that perceptual hashing-based client-side scanning mechanisms are highly vulnerable to detection avoidance attacks in a black-box setting, with more than 99.9% of images successfully attacked while preserving the content of the image. We further show that several mitigation strategies, such as expanding the database with hashes of images modified using our attack or increasing the detection threshold, are ineffective against our attack. Taken together, our results cast serious doubt on the robustness of the perceptual hashing-based client-side scanning mechanisms currently proposed by governments, organizations, and researchers around the world.
Paper Under Submission: The paper is NOT under submission at NeurIPS
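
Illustration: the abstract refers to discrete cosine transform-based perceptual hashing and to a detection threshold. The sketch below shows, under stated assumptions, how such a matching step might look. It is a generic pHash-style construction, not necessarily any algorithm evaluated in the paper. All function names (`dct_matrix`, `dct_hash`, `hamming`, `is_flagged`) and the choice of a 64-bit hash are hypothetical, and the input is assumed to be a square grayscale image that has already been resized.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return m

def dct_hash(image: np.ndarray, hash_side: int = 8) -> np.ndarray:
    """Boolean hash built from the low-frequency 2D DCT coefficients.

    Assumption: `image` is a square grayscale array, already resized.
    """
    d = dct_matrix(image.shape[0])
    coeffs = d @ image.astype(np.float64) @ d.T   # separable 2D DCT
    low = coeffs[:hash_side, :hash_side]          # keep low frequencies only
    return (low > np.median(low)).flatten()       # one bit per coefficient

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(a != b))

def is_flagged(image: np.ndarray, database: list, threshold: int) -> bool:
    """Flag the image if its hash is within `threshold` bits of any database hash."""
    h = dct_hash(image)
    return any(hamming(h, ref) <= threshold for ref in database)

# Usage: an unmodified image matches its own database entry even at threshold 0.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32))
database = [dct_hash(img)]
flagged = is_flagged(img, database, threshold=0)
```

In this sketch, a detection avoidance attack corresponds to finding a visually similar image whose hash lies more than `threshold` bits away from every database entry. Raising `threshold` makes that harder but also flags more unrelated images.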