Abstract: Visual Creative Description (VCD) generation, which involves crafting imaginative and actionable textual prompts for text-to-image models, is a critical yet underexplored task for large language models (LLMs). We propose the Cognitive Chain-of-Creativity (C-CoC), a novel framework that leverages structured cognitive modeling to enhance the novelty and visual expressiveness of generated descriptions. To support this task, we introduce the PAINT dataset, comprising high-quality VCDs across 33 product categories. Experiments demonstrate that C-CoC significantly improves description creativity, by 10% to 19% over baselines. However, when LLMs are used to assess VCD quality, their ratings show limited alignment with human judgments, highlighting the complexity of creative evaluation. Our contributions lay a foundation for structured creative generation and underscore the need for advancements in LLM-based evaluation.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Resources and Evaluation, NLP Applications
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 6819