Accelerating Privacy-Preserving Machine Learning With GeniBatch

Xinyang Huang, Junxue Zhang, Xiaodian Cheng, Hong Zhang, Yilun Jin, Shuihai Hu, Han Tian, Kai Chen

Published: 01 Jan 2024, Last Modified: 07 Aug 2024. Venue: EuroSys 2024. License: CC BY-SA 4.0

Abstract: Cross-silo privacy-preserving machine learning (PPML) adopts Partial Homomorphic Encryption (PHE) for secure data combination and high-quality model training across multiple organizations (e.g., medical and financial). However, PHE introduces significant computation and communication overheads due to data inflation. Batch optimization is a promising direction for mitigating this problem: it compresses multiple data items into a single ciphertext. In its current form, however, it is impractical for a large number of cross-silo PPML applications because of limited support for vector operations and severe data corruption. In this paper, we present GeniBatch, a batch compiler that translates a PPML program with PHE into an efficient program with batch optimization. GeniBatch adopts a set of conversion rules that support PHE programs involving all vector operations required in cross-silo PPML and ensure end-to-end result consistency before and after compiling. By proposing bit-reserving algorithms, GeniBatch avoids bit overflow, which preserves the correctness of compiled programs, and maximizes the compression ratio. We have integrated GeniBatch into FATE, a representative cross-silo PPML framework, and provided SIMD APIs to harness hardware acceleration. Experiments across six popular applications show that GeniBatch achieves up to 22.6× speedup and reduces network traffic by 5.4×-23.8× for generic cross-silo PPML applications.
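
To make the batching and bit-reserving ideas concrete, here is a minimal plaintext sketch in Python. It is not GeniBatch's algorithm or API: the function names (`slot_width`, `pack`, `unpack`) are hypothetical, values are assumed to be non-negative integers, and ordinary integer addition stands in for the homomorphic addition that an additively homomorphic PHE scheme would perform on the packed plaintexts. The sketch only shows why each slot needs reserved headroom bits: without them, a carry from one slot corrupts its neighbor.

```python
def slot_width(value_bits: int, num_additions: int) -> int:
    """Bits per slot so that (num_additions + 1) values of value_bits bits
    can be summed without carrying into the neighboring slot.
    num_additions.bit_length() equals ceil(log2(num_additions + 1))."""
    return value_bits + num_additions.bit_length()


def pack(values: list[int], slot_bits: int) -> int:
    """Pack non-negative integers into one big integer, one value per slot.
    The first value ends up in the most significant slot."""
    packed = 0
    for v in values:
        if v < 0 or v >= (1 << slot_bits):
            raise ValueError("value does not fit in a slot")
        packed = (packed << slot_bits) | v
    return packed


def unpack(packed: int, count: int, slot_bits: int) -> list[int]:
    """Recover `count` slot values, in the order they were packed."""
    mask = (1 << slot_bits) - 1
    out = []
    for _ in range(count):
        out.append(packed & mask)
        packed >>= slot_bits
    return out[::-1]


# Two vectors of 8-bit values that will be added once.
a = [200, 17, 255]
b = [100, 250, 255]

# With one reserved bit per slot, a single addition of the packed
# integers adds all three slots at once and stays correct.
safe_bits = slot_width(value_bits=8, num_additions=1)
safe_sum = unpack(pack(a, safe_bits) + pack(b, safe_bits), len(a), safe_bits)

# With no reserved bits, carries spill across slots and corrupt the result.
tight_sum = unpack(pack(a, 8) + pack(b, 8), len(a), 8)

expected = [x + y for x, y in zip(a, b)]
print(safe_sum == expected)   # element-wise sums recovered
print(tight_sum == expected)  # corrupted by cross-slot carries
```

The trade-off the abstract alludes to is visible here: every reserved bit widens the slot, so fewer values fit into one ciphertext-sized plaintext. Reserving only as many bits as the program's operations can actually consume is what keeps the compression ratio high.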