Abstract: Cloud-based AI systems offer significant benefits but also introduce vulnerabilities, making deep neural network (DNN) models susceptible to malicious tampering. This tampering may involve harmful behavior injection or resource reduction, compromising model integrity and performance. To detect model tampering, hard-label fingerprinting techniques generate sensitive samples to probe and reveal tampering. Existing fingerprinting methods are mainly based on either gradient-defined sensitivity or the decision boundary, with the latter showing manifestly superior detection performance. However, all existing fingerprinting methods either suffer from insufficient sensitivity or incur high computational costs. In this paper, we theoretically analyze the black-box co-optimal tampering detection sensitivity of fingerprint samples in the context of the decision boundary and gradient-defined sensitivity. Based on this analysis, we propose Steep-Decision-Boundary Fingerprinting (SDBF), a novel lightweight approach for hard-label tampering detection that inherently and efficiently combines the strengths of existing fingerprinting techniques. SDBF places fingerprint samples near the steep decision boundary, where the outputs of samples are inherently highly sensitive to tampering. We also design a Max Boundary Coverage Strategy (MBCS), which enhances the diversity of samples over the decision boundary. Theoretical analysis and extensive experimental results show that SDBF outperforms existing SOTA hard-label fingerprinting methods in both sensitivity and efficiency.
External IDs: dblp:conf/cvpr/BaiL0Z0Y25