ColorConceptBench: A Benchmark for Probabilistic Color-Concept Understanding in Text-to-Image Models

ACL ARR 2026 January Submission 1248 Authors

29 Dec 2025 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0

Keywords: text-to-image generation, color semantics, color-concept association

Abstract: Although text-to-image (T2I) models have advanced considerably, their ability to associate colors with implicit concepts remains underexplored. To address this gap, we introduce ColorConceptBench, a new human-annotated benchmark that systematically evaluates color-concept associations through the lens of probabilistic color distributions. ColorConceptBench moves beyond explicit color names or codes by probing how models translate 1,281 implicit color concepts, grounded in 6,369 human annotations. Our evaluation of seven leading T2I models reveals that current models lack sensitivity to abstract semantics and, crucially, that this limitation appears resistant to standard interventions (e.g., scaling and guidance). This demonstrates that achieving human-like color semantics requires more than larger models: it demands a fundamental shift in how models learn and represent implicit meaning.

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: benchmarking, evaluation, language resources

Contribution Types: Data resources, Data analysis

Languages Studied: English, Chinese

Submission Number: 1248
