SAP: Privacy-Preserving Fine-Tuning on Language Models with Split-and-Privatize Framework

Published: 2025 · Last Modified: 15 Jan 2026 · IJCAI 2025 · CC BY-SA 4.0
Abstract: Pre-trained Language Models (PLMs) enable a cost-effective approach to handling various downstream applications via Parameter-Efficient Fine-Tuning (PEFT) techniques. In this context, service providers have introduced a popular fine-tuning-based product known as Model-as-a-Service (MaaS), which offers users access to extensive PLMs and training resources. With MaaS, users can fine-tune, deploy, and use customized models seamlessly on a one-stop platform that lets them work with their private datasets efficiently. However, this service paradigm has recently been shown to risk leaking users' private data. We identify the data privacy leakage risks in MaaS-based PEFT and propose a Split-and-Privatize (SAP) framework that mitigates the leakage by integrating split learning and differential privacy into MaaS PEFT. We further propose Contributing-Token-Identification (CTI), a novel method to balance model utility degradation against privacy leakage. Comprehensive evaluation shows that the proposed framework achieves a 65% improvement in empirical privacy with only a 1% degradation in model performance on the Stanford Sentiment Treebank dataset, outperforming existing state-of-the-art baselines.
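As a high-level illustration of the split-and-privatize flow the abstract describes, below is a minimal PyTorch sketch. It is an assumption-laden sketch, not the paper's implementation: the split point, the Gaussian noise mechanism, the norm-based stand-in for CTI scoring, and all names (`BottomEncoder`, `contributing_token_mask`, `dp_privatize`, the noise scales) are hypothetical.

```python
# Minimal, illustrative sketch of the Split-and-Privatize (SAP) idea,
# NOT the authors' reference implementation. The split point, the DP
# mechanism, and the CTI criterion below are all assumptions.

import torch

class BottomEncoder(torch.nn.Module):
    """User-side bottom portion of a split PLM: embeds tokens locally."""
    def __init__(self, vocab_size: int = 30522, dim: int = 768):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, dim)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(input_ids)  # (batch, seq_len, dim)

def contributing_token_mask(reps: torch.Tensor, top_frac: float = 0.2):
    """Hypothetical CTI stand-in: mark tokens with the largest representation
    norms as 'contributing'. The paper's actual CTI scoring may differ."""
    norms = reps.norm(dim=-1)                        # (batch, seq_len)
    k = max(1, int(top_frac * norms.shape[-1]))
    thresh = norms.topk(k, dim=-1).values[..., -1:]  # per-example cutoff
    return norms >= thresh                           # True = contributing

def dp_privatize(reps, sigma=1.0, light_sigma=0.1, mask=None):
    """Perturb token representations with Gaussian noise before they leave
    the user side; CTI-selected tokens get lighter noise so that utility
    degrades less -- an assumed way to trade privacy against utility."""
    noise = torch.randn_like(reps) * sigma
    if mask is not None:
        light = torch.randn_like(reps) * light_sigma
        noise = torch.where(mask.unsqueeze(-1), light, noise)
    return reps + noise

# User side: encode locally, privatize, then ship only noisy representations.
bottom = BottomEncoder()
input_ids = torch.randint(0, 30522, (2, 16))
reps = bottom(input_ids)
mask = contributing_token_mask(reps)
safe_reps = dp_privatize(reps, sigma=1.0, light_sigma=0.1, mask=mask)
# `safe_reps` would be sent to the MaaS server hosting the top layers
# and the PEFT modules for fine-tuning.
```

The design point the sketch tries to capture is that raw text and clean embeddings never leave the user side; the MaaS server hosting the top layers only ever sees noised representations, while CTI-selected tokens receive lighter noise to limit utility loss.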