Tabula: Efficiently Computing Nonlinear Activation Functions for Secure Neural Network Inference

TMLR Paper1822 Authors

13 Nov 2023 (modified: 12 Mar 2024), Under review for TMLR
Abstract: Multiparty computation approaches to secure neural network inference commonly rely on garbled circuits for securely executing nonlinear activation functions. However, garbled circuits require excessive communication between server and client, impose significant storage overheads, and incur large runtime penalties; for example, securely evaluating ResNet-32 using standard approaches requires more than 300 MB of communication, over 10 s of runtime, and around 5 GB of preprocessing storage. To reduce these costs, we propose an alternative to garbled circuits: Tabula, an algorithm based on secure lookup tables. During an offline phase, our approach precomputes lookup tables that contain the results of all possible nonlinear function calls. Because these tables incur storage costs that are exponential in the number of operands and in the precision of the input values, we use quantization to reduce these costs and make the approach practical. This enables an online phase in which securely computing the result of a nonlinear function requires just a single round of communication, with communication cost equal to twice the number of bits of the input to the nonlinear function. In practice, our approach costs 2 bytes of communication per nonlinear function call in the online phase. Compared to garbled circuits with quantized inputs, when computing individual nonlinear functions during the online phase, experiments show that Tabula uses between $280\times$ and $560\times$ less communication, is over $100\times$ faster, and uses a comparable amount of storage; compared against other state-of-the-art protocols, Tabula achieves a communication reduction of more than $40\times$.
This leads to significant performance gains over garbled circuits with quantized inputs during the online phase of secure inference of neural networks: Tabula reduces end-to-end inference communication by up to $9 \times$ and achieves an end-to-end inference speedup of up to $50 \times$, while imposing comparable storage and offline preprocessing costs.
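The online phase described in the abstract can be illustrated with a minimal sketch of a secret-shared, randomly shifted lookup table. This is not the paper's Algorithm 1: the trusted-dealer `preprocess` function, the 8-bit input width, the 32-bit output ring, and all names (`relu_q`, `preprocess`, `online`) are assumptions made for illustration only. It shows why one round with 2 x 8 bits = 2 bytes of communication suffices: each party reveals only its masked input share, and the sum of the two messages indexes the shared table.

```python
import secrets

BITS = 8            # assumed quantized input width
N = 1 << BITS       # one table entry per possible input value
MOD = 1 << 32       # assumed ring for additive shares of the output

def relu_q(x):
    """Illustrative activation: ReLU on an 8-bit two's-complement value."""
    return x if x < N // 2 else 0

def preprocess(f):
    """Offline phase, run here by a hypothetical trusted dealer.

    Builds the table T[i] = f(i - s mod N) for a random shift s, then
    additively secret-shares both s and every entry of T.
    """
    s = secrets.randbelow(N)
    s0 = secrets.randbelow(N)
    s1 = (s - s0) % N
    t0 = [secrets.randbelow(MOD) for _ in range(N)]
    t1 = [(f((i - s) % N) - t0[i]) % MOD for i in range(N)]
    return (s0, t0), (s1, t1)

def online(x0, x1, party0, party1):
    """Online phase: a single exchange of two BITS-bit messages."""
    (s0, t0), (s1, t1) = party0, party1
    m0 = (x0 + s0) % N      # message from party 0 (BITS bits)
    m1 = (x1 + s1) % N      # message from party 1 (BITS bits)
    idx = (m0 + m1) % N     # equals x + s mod N, which hides x
    # Each party reads its own table share; the two shares sum to f(x).
    return t0[idx], t1[idx]

# Usage: share an input, evaluate, and reconstruct the output.
x = 37
x0 = secrets.randbelow(N)
x1 = (x - x0) % N
p0, p1 = preprocess(relu_q)
y0, y1 = online(x0, x1, p0, p1)
result = (y0 + y1) % MOD    # reconstructs relu_q(x)
```

Each table is used for exactly one function call, since reusing the shift `s` would leak information about the inputs. The table holds one entry per possible input, so its size doubles with every extra input bit, which is why the abstract relies on quantization to keep storage practical.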
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=66v0KwjDgi
Changes Since Last Submission: We have made major revisions to address the final reviewer comments since the last submission. The reviewers requested the following major changes:
- 'the authors claim malicious security ... It is not clear to me how this alone is enough for malicious security': We have amended the paper to state that our protocol's online phase achieves information-theoretic security rather than malicious security. That is, the protocol's online phase does not leak information, unlike FSS schemes, which obtain only computational security because they rely on pseudorandom functions. See Section 2.3 and Table 3 for the updated portions regarding this change.
- 'The [preprocessing explanation] is rather high-level ... has to be thoroughly analyzed for correctness and security': We have amended the paper to give the preprocessing phase as a concrete algorithm (see Algorithm 1), and have analyzed both the correctness and the security of that algorithm (Section 3.3, "Preprocessing Phase Correctness", "Preprocessing Phase Security"). The changes specify concretely how the preprocessing phase operates and provide an analysis of the algorithm to ensure correctness and security.
We also made the following updates to improve the clarity of the paper:
- Clearly state that 16/32-bit precision is beyond the capability of our approach due to exponential storage costs; note that the 16/32-bit GC results are shown as a baseline reference; emphasize that the main comparison is between 8-bit GC and 8-bit Tabula (see Results, "To ensure fair comparison" paragraph).
- Improve the clarity of figures through more detailed captions (see Figure 1, Figure 4).
- Improve the clarity of figures by reordering them and placing them closer to the sections that refer to them (see Results).
Assigned Action Editor: ~Cho-Jui_Hsieh1
Submission Number: 1822