Track: Demo papers (2-4 pages)
Keywords: Agentic VLMs, AI Safety, Model Context Protocol
TL;DR: SafeAgent-MCP enforces proactive safety for GUI automation agents by combining semantic PII blocking and structured action constraints before execution.
Abstract: Mobile and desktop task automation using agentic vision-language models (VLMs) faces critical safety challenges: processing sensitive UI screenshots containing personally identifiable information (PII), generating potentially harmful GUI actions, and operating autonomously without human oversight. Current approaches such as VisionTasker rely on post-execution validation or lack generation-time safety guarantees entirely. We present SafeAgent-MCP, a framework that combines semantic-constrained decoding with context-aware entity detection for generation-time safety and the Model Context Protocol (MCP) for dynamic policy management. Our system enforces three constraint types: (1) entity-level blocking using context-aware semantic entity recognition to prevent PII leakage from screenshots, (2) action-space constraints via NVIDIA NIM structured generation restricting dangerous GUI operations, and (3) policy-aware refusal leveraging OpenAI's gpt-oss-safeguard for reasoning-based safety validation. SafeAgent-MCP provides the first systematic safety framework for production GUI automation agents that combines modular semantic entity recognition with industry-standard constrained-generation infrastructure.
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 20