AegisGuard: RL-Guided Adapter Tuning for TEE-Based Efficient & Secure On-Device Inference

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY-NC-ND 4.0
Keywords: Trusted Execution Environment, TEE, TEE-Shielded DNN Partition, Model Stealing Attack, Split inference
TL;DR: We propose a novel fine-tuning framework that uses reinforcement learning and adapter compression to selectively shield model layers in TEEs, achieving strong model protection with 2–3× efficiency gains and no accuracy loss.
Abstract: On-device large models (LMs) reduce cloud dependency but expose proprietary model weights to the end-user, making them vulnerable to white-box model stealing (MS) attacks. A common defense is TEE-Shielded DNN Partition (TSDP), which places all trainable LoRA adapters (fine-tuned on private data) inside a trusted execution environment (TEE). However, this design suffers from excessive host-to-TEE communication latency. We propose AegisGuard, a fine-tuning and deployment framework that selectively shields the MS-sensitive adapters while offloading the rest to the GPU, balancing security and efficiency. AegisGuard integrates two key components: (i) RL-based Sensitivity Measurement (RSM), which injects Gaussian noise during training and applies lightweight reinforcement learning to rank adapters by their impact on model stealing; and (ii) Shielded-Adapter Compression (SAC), which structurally prunes the selected adapters to reduce both parameter size and intermediate feature maps, further lowering TEE computation and data transfer costs. Extensive experiments demonstrate that AegisGuard achieves black-box-level MS resilience (surrogate accuracy around 39%, matching fully shielded baselines), while reducing end-to-end inference latency by 2–3× and cutting TEE memory usage by 4× compared to state-of-the-art TSDP methods.
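To make the sensitivity-ranking idea concrete, here is a minimal, hypothetical sketch (not the authors' code): it perturbs each LoRA-style adapter's output with Gaussian noise and ranks adapters by how much the model's output diverges, as a stand-in for selecting which adapters to shield in the TEE. It replaces the paper's RL policy with a simple one-adapter-at-a-time heuristic; all class and function names are illustrative assumptions.

```python
# Hypothetical sketch of noise-based adapter sensitivity ranking.
# Not the paper's RSM implementation; names and model structure are assumptions.
import torch
import torch.nn as nn


class LoRAAdapter(nn.Module):
    """Minimal LoRA-style adapter: a low-rank update added to a frozen layer."""

    def __init__(self, dim: int, rank: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        self.noise_std = 0.0  # set > 0 to simulate corrupting this adapter's knowledge

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.up(self.down(x))
        if self.noise_std > 0:
            out = out + self.noise_std * torch.randn_like(out)
        return out


class ToyAdapterModel(nn.Module):
    """Frozen backbone of linear blocks, each paired with a trainable adapter."""

    def __init__(self, dim: int = 64, n_layers: int = 6):
        super().__init__()
        self.backbone = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_layers))
        self.adapters = nn.ModuleList(LoRAAdapter(dim) for _ in range(n_layers))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer, adapter in zip(self.backbone, self.adapters):
            x = torch.relu(layer(x) + adapter(x))
        return x


@torch.no_grad()
def rank_adapters_by_sensitivity(model: ToyAdapterModel,
                                 x: torch.Tensor,
                                 noise_std: float = 1.0) -> list[int]:
    """Perturb one adapter at a time and rank adapters by output divergence.

    A large divergence suggests the adapter carries fine-tuned knowledge an
    attacker could not recover without it, i.e. a candidate for TEE shielding.
    """
    clean = model(x)
    scores = []
    for i, adapter in enumerate(model.adapters):
        adapter.noise_std = noise_std
        noisy = model(x)
        adapter.noise_std = 0.0
        scores.append((torch.norm(noisy - clean).item(), i))
    return [i for _, i in sorted(scores, reverse=True)]


if __name__ == "__main__":
    torch.manual_seed(0)
    model = ToyAdapterModel()
    batch = torch.randn(32, 64)
    print("adapters, most to least sensitive:",
          rank_adapters_by_sensitivity(model, batch))
```

Under this sketch, the top-ranked adapters would be kept inside the TEE while the rest run on the GPU; the paper's RSM instead learns this selection with a reinforcement-learning policy trained against a model-stealing objective.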
Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 7854