Keywords: Deep Neural Networks, Model Extraction Defense
Abstract: Machine Learning as a Service (MLaaS) has become a widely adopted method for delivering deep neural network (DNN) models, allowing users to conveniently access models via APIs. However, such services have been shown to be highly vulnerable to Model Extraction Attacks (MEAs). While numerous defense strategies have been proposed, verifying the ownership of a suspicious model with strict theoretical guarantees remains a challenging task. To address this gap, we introduce CREDIT, a certified defense against MEAs. Specifically, we employ mutual information to quantify the similarity between DNN models, propose a practical verification threshold, and provide rigorous theoretical guarantees for ownership verification based on this threshold. We extensively evaluate our approach on several mainstream datasets and achieve state-of-the-art performance. Our implementation is publicly available at: https://anonymous.4open.science/r/CREDIT.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 12160