FiCo-BENCH: Evaluating Vision-Language Models under Visual Fidelity and Compression at Scale

Published: 01 Mar 2026, Last Modified: 24 Apr 2026 · ICLR 2026 AIWILD · CC BY 4.0
Keywords: Vision Language Model; Optical Context Compression; Efficiency; Large Language Model; Long Context
TL;DR: We introduce FiCo-BENCH and show that compression ratio significantly affects performance in the visual text compression setting, and that current vision-language models vary widely in robustness and proficiency.
Abstract: Visual text compression is an emerging paradigm that renders text as images for processing by vision-language models (VLMs), enabling higher information density per context token. However, the robustness of VLMs under dense, text-based visual inputs remains unevaluated. We introduce FiCo-BENCH, a benchmark designed to assess VLM robustness across seven controlled variants of visual fidelity and information density. FiCo-BENCH spans documents of 8k to 64k tokens and includes three tasks of increasing semantic granularity: optical character recognition (OCR), needle-in-a-haystack (NIAH) retrieval, and visual question answering (VQA). Evaluating 11 general-purpose VLMs and 3 OCR-specialized models reveals three consistent trends: performance drops sharply under increased density or reduced resolution; cross-task transfer between OCR, NIAH, and VQA is limited; and VQA is comparatively robust because low-level details are lost before high-level semantics. By exposing failure modes that remain invisible under conventional VLM evaluations, FiCo-BENCH establishes a rigorous test-bed for visual text compression.
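
To make the setup concrete, below is a minimal sketch (not the paper's pipeline) of how a document might be rendered as an image for a VLM. It assumes Pillow; the function render_text_to_image and its width_px/line_chars parameters are hypothetical stand-ins for the kind of fidelity and density knobs FiCo-BENCH varies.

```python
# Minimal sketch, not from the paper: render text as an image so a VLM
# can consume it as visual tokens. Assumes Pillow; render_text_to_image
# and its knobs are hypothetical stand-ins for fidelity/density variants.
import textwrap
from PIL import Image, ImageDraw, ImageFont

def render_text_to_image(text, width_px=1024, line_chars=100, line_height=18):
    """Wrap text and draw it on a white canvas. Raising line_chars (or
    shrinking width_px) packs more characters per pixel, i.e. higher
    information density and lower visual fidelity."""
    lines = textwrap.wrap(text, width=line_chars) or [""]
    img = Image.new("RGB", (width_px, line_height * len(lines) + 8), "white")
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()  # swap in ImageFont.truetype(...) for a real font
    for i, line in enumerate(lines):
        draw.text((4, 4 + i * line_height), line, fill="black", font=font)
    return img

# Same document at two density settings: the denser rendering uses fewer
# image tokens but degrades OCR-level legibility first.
doc = "some long document text " * 400
sparse = render_text_to_image(doc, line_chars=80)
dense = render_text_to_image(doc, line_chars=160)
```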
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 166