Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Secret Sharing, Privacy, Security, Transformer, Secure multi-party computation
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: To protect users' prompts from LLM servers, we design an efficient and accurate secure multi-party computation approach.
Abstract: With ChatGPT as a representative example, many companies have begun to provide services based on large Transformer models. However, using such a service inevitably leaks users' prompts to the model provider. Previous works have studied secure inference for Transformer models using secure multi-party computation (MPC), where model parameters and clients' prompts are kept secret. Despite this, these frameworks are still limited in terms of model performance, efficiency, and deployment. To address these limitations, we propose PUMA, a framework for fast and secure Transformer model inference. Our framework designs high-quality approximations for expensive functions such as GeLU and softmax, which significantly reduce the cost of secure inference while preserving model performance. Additionally, we design secure Embedding and LayerNorm procedures that faithfully implement the desired functionality without undermining the Transformer architecture. PUMA is about 2× faster than the state-of-the-art framework MPCFORMER (ICLR 2023) and achieves accuracy similar to that of plaintext models without fine-tuning (which previous works failed to achieve). PUMA can even evaluate LLaMA-7B in around 5 minutes to generate one token. To the best of our knowledge, this is the first time a model of this size has been evaluated under MPC.
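As context for the GeLU claim in the abstract, below is a minimal plaintext sketch of the general technique it refers to: replacing GeLU with a low-degree piecewise polynomial on a bounded interval, so that only cheap comparisons and polynomial evaluation are needed under MPC. The breakpoints (-4, 3), the polynomial degree, and the NumPy fitting are illustrative assumptions, not PUMA's actual approximation; in a real MPC deployment the comparisons would yield secret-shared bits and the branches would be combined by oblivious selection rather than plaintext indexing.

```python
import numpy as np

# Reference (plaintext) tanh-based GeLU, used here only to fit the polynomial.
def gelu_ref(x):
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

# Fit a degree-6 polynomial on a bounded interval. The interval endpoints
# and the degree are illustrative placeholders, not the paper's values.
_XS = np.linspace(-4.0, 3.0, 4096)
_COEFFS = np.polyfit(_XS, gelu_ref(_XS), deg=6)

def gelu_piecewise(x: np.ndarray) -> np.ndarray:
    """GeLU via a piecewise rule: ~0 on the far-left tail, the identity on
    the right tail, and a low-degree polynomial in between. Under MPC, the
    two comparisons and the branch selection run on secret shares."""
    left = x < -4.0          # GeLU(x) is essentially 0 here
    right = x > 3.0          # GeLU(x) is essentially x here
    mid = ~(left | right)    # polynomial region

    out = np.zeros_like(x, dtype=float)
    out[right] = x[right]
    out[mid] = np.polyval(_COEFFS, x[mid])
    return out

if __name__ == "__main__":
    xs = np.linspace(-6.0, 6.0, 121)
    print("max abs error:", np.max(np.abs(gelu_piecewise(xs) - gelu_ref(xs))))
```

The design intuition is that GeLU is only nonlinear on a narrow band around zero, so a single low-degree polynomial there, plus trivial tail rules, keeps the number of expensive secure multiplications and comparisons small.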
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4557