Keywords: T2I models, benchmark, evaluation
Abstract: Text-to-image (T2I) models have garnered significant attention for generating high-quality images aligned with text prompts. However, the rapid advancement of T2I models has exposed limitations in early benchmarks, which lack comprehensive evaluation, especially of text rendering and style. Notably, recent state-of-the-art models, with their rich knowledge-modeling capabilities, show promise for reasoning-driven image generation, yet existing evaluation systems have not adequately addressed this frontier. To systematically close these gaps, we introduce $\textbf{OneIG-Bench}$, a meticulously designed, comprehensive benchmark framework for fine-grained evaluation of T2I models across multiple dimensions, including subject-element alignment, text rendering precision, reasoning-generated content, stylization, and diversity. By structuring the evaluation along these dimensions, the benchmark enables in-depth analysis of model performance, helping researchers and practitioners pinpoint strengths and bottlenecks across the full image-generation pipeline. Our codebase and dataset are publicly available to facilitate reproducible evaluation studies and cross-model comparisons within the T2I research community.
Croissant File: json
Dataset URL: https://huggingface.co/datasets/OneIG-Bench/OneIG-Benchmark
Code URL: https://github.com/OneIG-Bench/OneIG-Benchmark
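A minimal sketch of fetching the benchmark data from the Dataset URL above, assuming the repository loads through the standard Hugging Face `datasets` API; the split and field names are not specified here and would need to be checked against the repo itself:

```python
# Sketch only: assumes OneIG-Bench/OneIG-Benchmark is a standard
# Hugging Face dataset repo (repo id taken from the Dataset URL above).
from datasets import load_dataset

ds = load_dataset("OneIG-Bench/OneIG-Benchmark")
print(ds)  # inspect the available splits and prompt columns
```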
Supplementary Material: pdf
Primary Area: Applications of Datasets & Benchmarks in Creative AI
Submission Number: 1177