DP-MIA: DUAL-PHASE MEMBERSHIP INFERENCE ATTACK ACROSS VLMS TRAINING LIFECYCLE

16 Sept 2025 (modified: 14 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Vision-Language Models, Membership Inference Attack
Abstract: Recent advances in Vision-Language Models (VLMs) have amplified privacy concerns around training-data source attribution, owing to their multi-stage training lifecycle and growing deployment via black-box APIs. The state-of-the-art approach to source attribution, the Membership Inference Attack (MIA), primarily performs coarse-grained binary classification, which oversimplifies the exposure risks of multi-stage training. Crucially, existing MIAs cannot determine whether data was exposed during pretraining or finetuning, which hinders precise accountability tracing in real-world VLM development. To bridge this gap, we introduce DP-MIA (Dual-Phase Membership Inference Attack), a novel framework that distinguishes three exposure states: pretrain-member, finetune-member, and non-member. This multi-class formulation captures fine-grained privacy risks across distinct training stages, enabling more precise source attribution. Designing DP-MIA presents two key challenges: limited model access (the black-box setting) and subtle memorization signals. We tackle these challenges with three strategies: 1) Cosine Similarity Attack (CSA), which exploits semantic alignment shifts between phases; 2) RIGEL-based Multi-class Classifier, which leverages a new composite metric (RIGEL) integrating generation response time, inference confidence, and generation length for enhanced signal detection; 3) Dual-Binary Attack (DBA), which decomposes the inference hierarchically into two binary sub-problems. Extensive experiments on LLaVA and Qwen2-VL demonstrate DP-MIA’s effectiveness (88.2% accuracy), significantly outperforming baselines such as CSA and MCA. Our findings expose critical vulnerabilities in VLM training pipelines and provide actionable insights for privacy auditing in black-box scenarios. DP-MIA’s code is available at https://github.com/frozen-jak/Dual-Phase-Mia.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 7131
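
Illustrative sketch (not the authors' implementation): the abstract describes the Dual-Binary Attack only as a hierarchical decomposition of the three-way decision into two binary sub-problems. The Python below shows one plausible reading, assuming the first stage separates members from non-members and the second separates finetune-members from pretrain-members. The function name, the two scores, and the 0.5 thresholds are hypothetical placeholders; the abstract does not specify the stage ordering or how the scores are computed.

```python
from typing import Literal

Label = Literal["non-member", "pretrain-member", "finetune-member"]


def dual_binary_decision(
    member_score: float,
    finetune_score: float,
    tau_member: float = 0.5,  # hypothetical threshold, not from the paper
    tau_phase: float = 0.5,   # hypothetical threshold, not from the paper
) -> Label:
    """Combine two binary decisions into one three-way exposure label.

    member_score:   assumed output of a binary member-vs-non-member attack
    finetune_score: assumed output of a binary finetune-vs-pretrain attack
    """
    # Stage 1 (assumed ordering): was the sample seen in training at all?
    if member_score < tau_member:
        return "non-member"
    # Stage 2: among presumed members, which training phase exposed it?
    if finetune_score >= tau_phase:
        return "finetune-member"
    return "pretrain-member"
```

For example, under these assumptions a sample with `member_score=0.9` and `finetune_score=0.2` is labeled `pretrain-member`, while any sample with `member_score` below `tau_member` is labeled `non-member` regardless of its second score.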