SituationalPriv: A Context-Aware Framework for Privacy Detection and Protection in Vision-Language Models
Keywords: vision-language models, reasoning, privacy
TL;DR: We introduce SituationalPriv, the first benchmark for context-aware privacy detection in VLMs, and propose a training-free LLM–VLM framework that significantly improves real-world privacy protection.
Abstract: With the widespread adoption of vision-language models (VLMs), users increasingly transmit large amounts of visual information, making context-aware privacy protection essential. Existing benchmarks for privacy detection are limited: some degrade image quality by blurring sensitive regions, others narrowly target predefined categories, and most overlook the contextual nature of privacy. As a result, current static evaluations fail to capture VLMs’ real-world privacy recognition capabilities.
To address this, we introduce \textbf{SituationalPriv}, a benchmark for evaluating context-aware privacy understanding. It contains 440 high-quality, privacy-relevant images from the DIPA2 dataset, each paired with two distinct usage contexts that assign different privacy attributes to the same content. This design realistically simulates privacy-sensitive scenarios, enabling more comprehensive evaluation.
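The pairing described above — one image, two usage contexts with opposite privacy attributes — can be sketched as a data structure. This is an illustrative sketch only; the field names (`image_path`, `is_private`, etc.) and example contexts are hypothetical, not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class UsageContext:
    description: str   # the scenario in which the image would be shared
    is_private: bool   # whether the content is privacy-sensitive in that scenario

@dataclass
class BenchmarkItem:
    image_path: str                               # image drawn from DIPA2
    contexts: tuple[UsageContext, UsageContext]   # two contexts per image

# Hypothetical example: the same content flips privacy status with context.
item = BenchmarkItem(
    image_path="dipa2/0001.jpg",
    contexts=(
        UsageContext("sharing in a private family group chat", False),
        UsageContext("posting publicly on a job-search site", True),
    ),
)
assert item.contexts[0].is_private != item.contexts[1].is_private
```

The key design point is that privacy is a property of the (image, context) pair, not of the image alone, so a static per-image label cannot express it.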
We further propose a \textbf{training-free framework} that leverages pretrained VLMs and large language models (LLMs) to improve context-aware privacy detection. Unlike prior fine-tuning approaches limited to fixed domains, our method demonstrates strong generalization across open-domain datasets.
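A training-free LLM–VLM pipeline of the kind described could take the following shape: a pretrained VLM describes the image, and an LLM judges the description against the stated usage context. This is a minimal sketch under assumed interfaces — the function names, prompt wording, and toy stand-in models are ours, not the paper's implementation.

```python
from typing import Callable

def context_aware_privacy_check(
    describe_image: Callable[[str], str],   # stands in for a pretrained VLM
    judge_privacy: Callable[[str], str],    # stands in for an LLM
    image_path: str,
    usage_context: str,
) -> bool:
    """Caption the image, then ask the LLM whether sharing that
    content is privacy-sensitive in the given context."""
    caption = describe_image(image_path)
    prompt = (
        f"Image content: {caption}\n"
        f"Usage context: {usage_context}\n"
        "Is sharing this image privacy-sensitive here? Answer yes or no."
    )
    return judge_privacy(prompt).strip().lower().startswith("yes")

# Toy stand-ins for illustration only:
fake_vlm = lambda path: "a photo of a person's ID card on a desk"
fake_llm = lambda prompt: "yes" if "ID card" in prompt else "no"
print(context_aware_privacy_check(fake_vlm, fake_llm, "img.jpg", "public post"))  # True
```

Because both models are used zero-shot, the pipeline requires no fine-tuning, which is what allows it to generalize beyond a fixed set of privacy categories.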
Submission Number: 245