Keywords: Differential Privacy, Cross-Attention, Provable Guarantee
Abstract: Cross-attention has become a fundamental module in many important artificial intelligence applications, e.g.,
retrieval-augmented generation (RAG), system prompts, guided stable diffusion, and many more.
Ensuring the privacy of cross-attention is crucial and urgent because its key and value matrices may contain sensitive information about model providers and their users.
In this work, we design a novel differential privacy (DP) data structure that protects the privacy of cross-attention with a provable guarantee.
In detail, let $n$ be the input token length of the system prompt/RAG data, $d$ be the feature dimension, $0 < \alpha \le 1$ be the relative error parameter, $R$ be the maximum value of the query and key matrices, $R_w$ be the maximum value of the value matrix, and $r, s, \epsilon_s$ be parameters of the polynomial kernel method.
Then, our data structure requires $\widetilde{O}(ndr^2)$ memory consumption with $\widetilde{O}(nr^2)$ initialization time complexity and $\widetilde{O}(\alpha^{-1} r^2)$ query time complexity for a single token query.
In addition, our data structure guarantees that the process of answering a user query satisfies $(\epsilon, \delta)$-DP, with $\widetilde{O}(n^{-1} \epsilon^{-1} \alpha^{-1/2} R^{2s} R_w r^2)$ additive error and $n^{-1} (\alpha + \epsilon_s)$ relative error between our output and the true answer (a minimal illustrative sketch follows the abstract).
Furthermore, our result is robust to adaptive queries, in which users may intentionally attack the cross-attention system.
To our knowledge, this is the first work to provide DP guarantees for cross-attention, and we hope it inspires further privacy-preserving algorithm design for large generative models (LGMs).
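As a rough illustration of the kind of mechanism the abstract describes (not the paper's actual data structure or analysis), the sketch below combines a degree-$s$ polynomial feature map over the keys with the Gaussian mechanism: the key and value matrices are compressed once into $r$-dimensional summary statistics, noise calibrated to $(\epsilon, \delta)$-DP is added to those summaries, and every token query is answered from the noisy summaries alone. All names (`DPCrossAttention`, `poly_features`) and the sensitivity bound are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's algorithm): summarize keys
# through a degree-s polynomial feature map, privatize the summary once
# with the Gaussian mechanism, and answer all token queries from the
# noisy summary only.
import numpy as np

def poly_features(X, s):
    """Degree-s monomial feature map phi(x); r = d**s features (illustrative)."""
    phi = X.copy()
    for _ in range(s - 1):
        phi = np.einsum('ni,nj->nij', phi, X).reshape(X.shape[0], -1)
    return phi

class DPCrossAttention:
    def __init__(self, K, V, eps, delta, R, R_w, s=2, rng=None):
        rng = rng or np.random.default_rng(0)
        Phi = poly_features(K, s)            # n x r key summary
        num = Phi.T @ V                      # r x d_v numerator statistics
        den = Phi.sum(axis=0)                # r-dim denominator statistics
        # Assumed (loose) per-row l2 sensitivity of the joint statistics:
        # each key/value row contributes at most R**s * max(R_w, 1).
        sens = (R ** s) * max(R_w, 1.0)
        sigma = sens * np.sqrt(2 * np.log(1.25 / delta)) / eps  # Gaussian mechanism
        self.num = num + rng.normal(0.0, sigma, num.shape)
        self.den = den + rng.normal(0.0, sigma, den.shape)
        self.s = s

    def query(self, q):
        """Answer one token query from the noisy summaries (pure post-processing)."""
        phi_q = poly_features(q[None, :], self.s)[0]
        denom = max(phi_q @ self.den, 1e-6)  # clamp to avoid sign flips
        return (phi_q @ self.num) / denom
```

Because each query touches only the already-noised summaries, answering any number of (possibly adaptive) queries is post-processing and costs no additional privacy budget, which mirrors the adaptive-query robustness claimed above.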
Submission Number: 83