Robust scene text understanding with OCR token and word alignment for Text-VQA and text-caption

Published: 2026, Last Modified: 28 Feb 2026, Pattern Recognit. 2026, CC BY-SA 4.0