Agnostic Label-Only Membership Inference Attack on Two-Tower Neural Networks for Recommendation Systems

TMLR Paper 4451 Authors

11 Mar 2025 (modified: 03 Apr 2025) · Under review for TMLR · CC BY 4.0
Abstract: This paper presents an innovative adaptation of the Agnostic Label-Only Membership Inference Attack (ALOA) specifically designed for two-tower neural network (NN) models used in recommendation systems. Unlike traditional membership inference attacks that focus on categorical outputs, our approach targets models that produce continuous vector embeddings. We propose a comprehensive methodology that employs synthetic datasets, shadow model training, and a suite of perturbation techniques to evaluate model robustness using the Maximum Mean Discrepancy (MMD) metric. Experimental results demonstrate that the attack model achieves exceptionally high accuracy and precision in distinguishing whether data is part of the original training dataset, even without direct access to it. These findings extend the theoretical framework of membership inference attacks to continuous output spaces and highlight vulnerabilities in modern recommendation systems.
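The abstract's robustness evaluation rests on the Maximum Mean Discrepancy (MMD) between sets of continuous embeddings (e.g., a model's outputs before and after perturbation). As a minimal sketch of that metric, here is a biased MMD² estimate with a Gaussian (RBF) kernel; the kernel choice, bandwidth `sigma`, and function names are illustrative assumptions, not the paper's specific implementation:

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel matrix between rows of x and rows of y.
    # Kernel choice and bandwidth are assumptions for illustration.
    sq_dists = (np.sum(x**2, axis=1)[:, None]
                + np.sum(y**2, axis=1)[None, :]
                - 2.0 * x @ y.T)
    return np.exp(-sq_dists / (2.0 * sigma**2))

def mmd_squared(emb_a, emb_b, sigma=1.0):
    """Biased estimate of squared MMD between two embedding samples."""
    k_aa = rbf_kernel(emb_a, emb_a, sigma)
    k_bb = rbf_kernel(emb_b, emb_b, sigma)
    k_ab = rbf_kernel(emb_a, emb_b, sigma)
    return k_aa.mean() + k_bb.mean() - 2.0 * k_ab.mean()

# Hypothetical usage: compare embeddings of clean vs. perturbed inputs.
rng = np.random.default_rng(0)
original = rng.normal(size=(100, 32))                          # clean-input embeddings
perturbed = original + rng.normal(scale=0.1, size=(100, 32))   # perturbed-input embeddings
score = mmd_squared(original, perturbed)
```

A small `score` suggests the embedding distribution is stable under perturbation; larger values indicate the perturbation shifted the model's output distribution, which is the kind of signal a label-only attack can exploit.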
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: /forum?id=xESNHVXt9H&noteId=6rrLFASkK6
Changes Since Last Submission:

Removed improper modifications.

Assigned Action Editor: Joonas Jälkö
Submission Number: 4451