KORIE: A Multi-Task Benchmark for Detection, OCR, and Information Extraction on Korean Retail Receipts
Abstract: We introduce KORIE, a curated benchmark of 748 Korean retail receipts designed to evaluate scene text detection, Optical Character Recognition (OCR), and Information Extraction (IE) under challenging digitization conditions. Unlike existing large-scale repositories, KORIE consists exclusively of receipts digitized via flatbed scanning (HP LaserJet MFP), specifically selected to preserve complex thermal printing artifacts such as ink fading, banding, and mechanical creases. We establish rigorous baselines across three tasks: (1) Detection, comparing Weakly Supervised Object Localization (WSOL) against state-of-the-art fully supervised models (YOLOv9, YOLOv10, YOLOv11, and DINO-DETR); (2) OCR, benchmarking Tesseract, EasyOCR, PaddleOCR, and a custom Attention-based BiGRU; and (3) Information Extraction, evaluating the zero-shot capabilities of Large Language Models (Llama-3, Qwen-2.5) on structured field parsing. Our results identify YOLOv11 as the optimal detector for dense receipt layouts and demonstrate that while PaddleOCR achieves the lowest Character Error Rate (15.84%), standard LLMs struggle in zero-shot settings due to domain mismatch with noisy Korean receipt text, particularly for price-related fields (F1 scores ≈ 25%). We release the dataset, splits, and evaluation code to facilitate reproducible research on degraded Hangul document understanding.
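The Character Error Rate quoted above is conventionally the character-level edit distance between the OCR output and the ground-truth transcription, divided by the length of the ground truth. The following is a minimal sketch of that standard definition, not the evaluation code released with KORIE. The function names `edit_distance` and `cer` are hypothetical, and the choice to count precomposed Hangul syllables (NFC-normalized code points) rather than individual jamo is an assumption, since the abstract does not say which unit is used.

```python
import unicodedata


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from ref
                curr[j - 1] + 1,     # insertion into ref
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    # Assumption: compare precomposed syllables, so normalize both sides to NFC.
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a three-syllable reference with one misrecognized syllable has a CER of one third. If the two strings were instead decomposed into jamo, the same error would yield a lower rate, so the unit of comparison should be fixed before comparing numbers across systems.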
External IDs: doi:10.3390/math14010187