LLEOT: A Privacy-Enhancing Offsite Tuning Framework via Loss Landscape Elevation

18 Sept 2025 (modified: 29 Dec 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Language models, Model privacy, privacy-preserving transfer learning
Abstract: Adapting large language models (LLMs) to domain-specific tasks via fine-tuning is often infeasible: model parameters are protected as intellectual property, while sensitive data cannot be shared due to privacy regulations. Offsite Tuning addresses this by training adapters on emulators of the original model, but current emulators retain substantial inference ability, compromising the model's capability privacy and risking misuse. We propose Loss Landscape Elevation Offsite Tuning (LLEOT), a framework that protects both data privacy and model capability privacy. Its core component, Loss Landscape Elevation (LLE), enforces a fixed loss margin between the emulator and the original model, which we theoretically show (Theorem 1) simultaneously (i) degrades emulator inference through perplexity amplification and (ii) preserves gradient alignment, ensuring consistent convergence of prompt optimization. Combined with Collaborative Prompt Knowledge Distillation (CPKD), our method enables adapters trained on emulators to transfer effectively to the original model. Extensive experiments on the OpenBookQA, SocialIQA, ARC-Challenge, and WebQuestions datasets confirm that LLEOT achieves strong adaptation while mitigating emulator misuse.
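The abstract's claim that a fixed loss margin amplifies perplexity can be illustrated with a short calculation. This is a minimal sketch based only on the standard relation perplexity = exp(per-token cross-entropy loss); the `lle_target` helper and the specific margin value are illustrative assumptions, not the paper's actual objective.

```python
import math

def lle_target(model_loss: float, margin: float) -> float:
    """Hypothetical LLE training target: drive the emulator's loss
    toward the frozen full model's loss plus a fixed margin."""
    return model_loss + margin

def perplexity(loss: float) -> float:
    """Perplexity is the exponential of the per-token cross-entropy loss."""
    return math.exp(loss)

model_loss = 2.0    # per-token loss of the frozen original model (assumed)
margin = 1.5        # fixed elevation margin (assumed)
emulator_loss = lle_target(model_loss, margin)

# An additive margin in loss space multiplies perplexity by exp(margin),
# degrading the emulator's standalone inference quality regardless of
# the original model's loss level.
amplification = perplexity(emulator_loss) / perplexity(model_loss)
```

Because the amplification factor depends only on the margin, the degradation is uniform across inputs, which is consistent with the abstract's claim that gradient directions (and hence prompt optimization) can remain aligned even as absolute loss is elevated.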
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 12487