Comet: Communication-Efficient Batch Secure Three-Party Neural Network Inference with Client-Aiding

Published: 01 Jan 2024, Last Modified: 29 Sept 2024, Venue: ICC 2024, License: CC BY-SA 4.0
Abstract: Secure neural network inference enables a server (the model provider) and a client to perform neural network inference without leaking their private inputs. Existing SOTA three-party computation (3PC) inference works face challenges on two fronts: i) the GPU-accelerated CryptGPU (S&P'21) and P-FALCON (USENIX Security'22) suffer from high communication overhead; ii) the communication-efficient Meteor (WWW'23) incurs a heavier computation burden and higher GPU memory usage. These challenges result in lower efficiency when handling large-scale batch inference requests on resource-constrained devices. In this work, we propose Comet, a communication-efficient batch secure three-party inference framework with client-aiding, which achieves semi-honest security in the honest-majority setting, assuming no collusion between the client and the servers. First, we propose client-aided sharing semantics, which leverage client-generated random values to improve online communication efficiency. We also design efficient GPU-based 3PC protocols for neural network operators, improving the computational efficiency of both linear and nonlinear layers. Furthermore, we address the tradeoff between communication cost and GPU memory utilization, outperforming SOTA by 1.3-1.9× in communication and 1.5-3.8× in runtime on large-scale batch inference tasks.
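For readers unfamiliar with the building block behind "client-generated random values", the following is a minimal sketch of generic three-party additive secret sharing over the ring Z_{2^64}, in which the client draws the random masks itself. It is an illustration only: the abstract does not specify Comet's actual sharing semantics, and the ring size and all function names (`client_share`, `reconstruct`, `local_add`, `local_scale`) are assumptions made for this example.

```python
import secrets

# Assumption: a 64-bit ring, a common choice in 3PC inference frameworks.
RING_BITS = 64
MOD = 1 << RING_BITS


def client_share(x, n_parties=3):
    """Client splits x into additive shares mod 2^64 using its own randomness."""
    masks = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    last = (x - sum(masks)) % MOD  # final share makes the sum equal x
    return masks + [last]


def reconstruct(shares):
    """Recover the secret by summing all shares mod 2^64."""
    return sum(shares) % MOD


def local_add(a_shares, b_shares):
    """Each party adds its own two shares; no communication is needed."""
    return [(a + b) % MOD for a, b in zip(a_shares, b_shares)]


def local_scale(shares, c):
    """Each party multiplies its share by a public constant c locally."""
    return [(c * s) % MOD for s in shares]


if __name__ == "__main__":
    x_shares = client_share(123)
    y_shares = client_share(456)
    print(reconstruct(local_add(x_shares, y_shares)))
    print(reconstruct(local_scale(x_shares, 3)))
```

Linear operations on such shares are purely local, which is why communication cost in 3PC inference is dominated by multiplications and nonlinear layers; how Comet reduces that cost is described in the paper itself, not in this sketch.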