Keywords: Token Compression; Robustness
TL;DR: We show for the first time that visual token pruning enhances the robustness of VLMs, mitigating vulnerabilities such as jailbreak attacks and hallucinations.
Abstract: In this paper, we show for the first time that visual token pruning enhances the robustness of Vision-Language Models (VLMs), mitigating vulnerabilities such as jailbreak attacks and hallucinations. Since the vision and language modalities cannot be perfectly aligned, misaligned visual tokens may act as out-of-distribution (OOD) inputs, leading to unpredictable outputs and introducing potential vulnerabilities. Building on this insight, we aim to enhance model robustness against jailbreaks and hallucinations by selectively reducing visual tokens, with reduced inference cost as a side benefit. Specifically, we measure the distance between each visual token and the language feature space; visual tokens with large distances are identified as OOD tokens and iteratively pruned. To demonstrate the effectiveness of our method, we evaluate it on seven diverse, popular benchmarks. Notably, our method yields an average improvement of 13.46% in defending against jailbreak attacks, consistently achieves competitive performance in mitigating hallucinations, and maintains strong results on general benchmarks such as MME.
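To make the pruning step concrete, below is a minimal sketch of what iterative OOD visual token pruning could look like. It assumes the distance to the language feature space is approximated by each visual token's cosine distance to its nearest entry in the LLM's text embedding table; the function name `prune_ood_visual_tokens`, the fixed per-round prune ratio, and the toy dimensions are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def prune_ood_visual_tokens(visual_tokens, language_embeddings,
                            prune_ratio=0.1, num_iters=3):
    """Iteratively prune visual tokens far from the language feature space.

    visual_tokens:       (N, d) visual token features, already projected
                         into the LLM embedding space.
    language_embeddings: (V, d) text embedding matrix, used here as a
                         proxy for the language feature space.
    prune_ratio:         fraction of remaining tokens dropped per round.
    num_iters:           number of pruning rounds.
    Returns the surviving visual tokens, shape (M, d).
    """
    tokens = visual_tokens
    for _ in range(num_iters):
        # Distance of each visual token to the language feature space,
        # approximated by its nearest language embedding under cosine
        # similarity (an assumption of this sketch).
        v = torch.nn.functional.normalize(tokens, dim=-1)
        t = torch.nn.functional.normalize(language_embeddings, dim=-1)
        sim = v @ t.T                        # (N, V) cosine similarities
        dist = 1.0 - sim.max(dim=-1).values  # distance to nearest text token

        # Tokens with the largest distances are treated as OOD and dropped.
        num_keep = max(1, int(tokens.size(0) * (1.0 - prune_ratio)))
        keep_idx = dist.argsort()[:num_keep]
        tokens = tokens[keep_idx.sort().values]  # keep original token order
    return tokens

# Toy usage: 576 visual tokens against a 32k-entry text embedding table.
visual = torch.randn(576, 4096)
lang = torch.randn(32000, 4096)
kept = prune_ood_visual_tokens(visual, lang, prune_ratio=0.2, num_iters=2)
print(kept.shape)  # torch.Size([368, 4096])
```

Re-sorting the surviving indices before gathering preserves the original (spatial) ordering of the visual tokens, so downstream positional information is unaffected by the pruning.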
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 5130