Guarding a Needle in the Haystack: A Real-Time Policy-Following Streaming Video Guardrail

20 Sept 2025 (modified: 12 Feb 2026), ICLR 2026 Conference Desk Rejected Submission, CC BY 4.0
Keywords: Video Guardrail, Video Content Safety, Streaming Guardrail
TL;DR: We introduce StreamGuard, the first real-time, policy-following streaming guardrail, and two benchmark datasets: (1) Safe2Shot and (2) AdvVideo-Bench.
Abstract: With the rapid growth of video generative models, robust guardrails are more critical than ever to ensure both video content safety, which prevents the proliferation of harmful material (e.g., sexual or self-harm content), and video generation security, which defends against adversarial attacks on video generation models (e.g., jailbreak prompts or unsafe video injection). While recent multimodal large language model (MLLM)-based guardrails have advanced through reasoning and video understanding, they still face significant limitations. In particular, they rely on frame subsampling, which is unreliable for precise long-term monitoring. In addition, they lack support for real-time streaming and incur high overhead due to inefficient token usage. To address these challenges, we propose StreamGuard, the first real-time, policy-following streaming guardrail for long-form videos. To precisely identify unsafe frames hidden in long videos, StreamGuard efficiently inspects the input video in streaming form to localize unsafe content with high precision. To enable real-time streaming, StreamGuard employs an efficient asynchronous inference stack that parallelizes safety analysis across ingested events while simultaneously encoding and detecting incoming frames, achieving fine-grained, frame-level monitoring with low latency. In addition, considering the lack of benchmarks that reflect real-world long-form video risks, we introduce two benchmark datasets: (1) Safe2Shot, with over 4K unsafe videos annotated at the frame level, capturing needle-in-the-haystack cases where harmful content appears in only a few frames; and (2) AdvVideo-Bench, which includes both TV2V and TV2T components targeting the video and text modalities respectively, designed to evaluate guardrail resilience against video-centric multimodal jailbreaks. Extensive experiments show that StreamGuard outperforms state-of-the-art guardrails by 19.7% on both Safe2Shot and AdvVideo-Bench, and by 10.6% across five existing benchmarks, while reducing token and time costs by 23.5%.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 24396
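
Illustrative sketch (not the authors' implementation): the abstract contrasts frame subsampling with streaming, frame-level inspection performed while frames are still being ingested. The toy Python below shows that general pattern with asyncio, under loud assumptions: the names `ingest`, `analyze`, `guard_stream`, `subsample_check`, and the `is_unsafe` predicate are all hypothetical stand-ins, and a real system would call an MLLM-based detector instead of a string comparison.

```python
import asyncio


async def ingest(frames, queue):
    """Producer: push (index, frame) pairs as they arrive, then a sentinel."""
    for idx, frame in enumerate(frames):
        await queue.put((idx, frame))
    await queue.put(None)  # signals end of stream


async def analyze(idx, frame, is_unsafe):
    """Hypothetical per-frame check; sleep(0) stands in for model inference."""
    await asyncio.sleep(0)  # yield control so ingestion can continue
    return idx if is_unsafe(frame) else None


async def guard_stream(frames, is_unsafe, max_pending=8):
    """Inspect every frame while ingestion is still running.

    max_pending bounds the ingestion queue (assumed value, for illustration).
    """
    queue = asyncio.Queue(maxsize=max_pending)
    producer = asyncio.create_task(ingest(frames, queue))
    tasks = []
    while True:
        item = await queue.get()
        if item is None:
            break
        idx, frame = item
        # Schedule analysis without waiting for it to finish.
        tasks.append(asyncio.create_task(analyze(idx, frame, is_unsafe)))
    await producer
    results = await asyncio.gather(*tasks)
    return sorted(i for i in results if i is not None)


def subsample_check(frames, is_unsafe, stride):
    """Baseline for contrast: inspect only every stride-th frame."""
    return [i for i in range(0, len(frames), stride) if is_unsafe(frames[i])]
```

With 100 frames of which only frame 37 is unsafe, `asyncio.run(guard_stream(frames, is_unsafe))` flags index 37, whereas `subsample_check(frames, is_unsafe, 10)` inspects only frames 0, 10, ..., 90 and misses it. That gap is the needle-in-the-haystack failure mode the abstract attributes to subsampling.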