Keywords: Conceptual Metaphor Theory, Metaphorical Image Generation, Structured Prompting
Abstract: Metaphorical text encodes cross-domain meaning beyond its literal surface, which makes it difficult for text-to-image models to produce semantically faithful visual metaphors. We propose CMIG, a structured prompting framework inspired by Conceptual Metaphor Theory (CMT) that decomposes metaphors into source-target mappings and selects visual realization strategies within a reproducible reasoning workflow. Experiments on DALL·E 3, Imagen 2, and FLUX-1 show that CMIG consistently improves semantic alignment and human-rated metaphor quality over prior prompting baselines. We additionally release a 3,500-instance visual metaphor benchmark to support unified evaluation.
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: Multimodality and Language Grounding to Vision, Robotics and Beyond, Semantics: Lexical and Sentence-Level, Resources and Evaluation
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: English
Submission Number: 289