SituationalPriv: A Context-Aware Framework for Privacy Detection and Protection in Vision-Language Models
Keywords: vision-language models, reasoning, privacy
TL;DR: We introduce SituationalPriv, the first benchmark for context-aware privacy detection in VLMs, and propose a training-free LLM–VLM framework that significantly improves real-world privacy protection.
Abstract: With the widespread adoption of vision-language models (VLMs), users increasingly transmit large amounts of visual information, making context-aware privacy protection essential. Existing benchmarks for privacy detection are limited: some degrade image quality by blurring sensitive regions, others narrowly target predefined categories, and most overlook the contextual nature of privacy. As a result, current static evaluations fail to capture VLMs’ real-world privacy recognition capabilities.
To address this, we introduce \textbf{SituationalPriv}, a benchmark for evaluating context-aware privacy understanding. It contains 440 high-quality, privacy-relevant images from the DIPA2 dataset, each paired with two distinct usage contexts that assign different privacy attributes to the same content. This design realistically simulates privacy-sensitive scenarios, enabling more comprehensive evaluation.
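The pairing described above — one image, two usage contexts with opposite privacy attributes — can be sketched as a data structure. This is an illustrative sketch only; the field names (`image_path`, `is_private`, etc.) and example contexts are hypothetical, not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class UsageContext:
    description: str   # the scenario in which the image would be shared
    is_private: bool   # whether the content is privacy-sensitive in that scenario

@dataclass
class BenchmarkItem:
    image_path: str                               # image drawn from DIPA2
    contexts: tuple[UsageContext, UsageContext]   # two contexts per image

# Hypothetical example: the same content flips privacy status with context.
item = BenchmarkItem(
    image_path="dipa2/0001.jpg",
    contexts=(
        UsageContext("sharing in a private family group chat", False),
        UsageContext("posting publicly on a job-search site", True),
    ),
)
assert item.contexts[0].is_private != item.contexts[1].is_private
```

The key design point is that privacy is a property of the (image, context) pair, not of the image alone, so a static per-image label cannot express it.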
We further propose a \textbf{training-free framework} that leverages pretrained VLMs and large language models (LLMs) to improve context-aware privacy detection. Unlike prior fine-tuning approaches limited to fixed domains, our method demonstrates strong generalization across open-domain datasets.
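A training-free LLM–VLM pipeline of the kind described could take the following shape: a pretrained VLM describes the image, and an LLM judges the description against the stated usage context. This is a minimal sketch under assumed interfaces — the function names, prompt wording, and toy stand-in models are ours, not the paper's implementation.

```python
from typing import Callable

def context_aware_privacy_check(
    describe_image: Callable[[str], str],   # stands in for a pretrained VLM
    judge_privacy: Callable[[str], str],    # stands in for an LLM
    image_path: str,
    usage_context: str,
) -> bool:
    """Caption the image, then ask the LLM whether sharing that
    content is privacy-sensitive in the given context."""
    caption = describe_image(image_path)
    prompt = (
        f"Image content: {caption}\n"
        f"Usage context: {usage_context}\n"
        "Is sharing this image privacy-sensitive here? Answer yes or no."
    )
    return judge_privacy(prompt).strip().lower().startswith("yes")

# Toy stand-ins for illustration only:
fake_vlm = lambda path: "a photo of a person's ID card on a desk"
fake_llm = lambda prompt: "yes" if "ID card" in prompt else "no"
print(context_aware_privacy_check(fake_vlm, fake_llm, "img.jpg", "public post"))  # True
```

Because both models are used zero-shot, the pipeline requires no fine-tuning, which is what allows it to generalize beyond a fixed set of privacy categories.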
Submission Number: 245