Agnostic Label-Only Membership Inference Attack on Two-Tower Neural Networks for Recommendation Systems
Abstract: This paper presents an adaptation of the Agnostic Label-Only Membership Inference Attack (ALOA) designed for two-tower neural network (NN) models used in recommendation systems. Unlike traditional membership inference attacks, which focus on categorical outputs, our approach targets models that produce continuous vector embeddings. We propose a methodology that employs synthetic datasets, shadow model training, and a suite of perturbation techniques to evaluate model robustness using the Maximum Mean Discrepancy (MMD) metric. Experimental results show that the attack model achieves high accuracy and precision in determining whether a data point was part of the original training dataset, even without direct access to that dataset. These findings extend the theoretical framework of membership inference attacks to continuous output spaces and highlight vulnerabilities in modern recommendation systems.
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: /forum?id=xESNHVXt9H&noteId=6rrLFASkK6
Changes Since Last Submission:
Removed improper modifications.
Assigned Action Editor: Joonas Jälkö
Submission Number: 4451