Prompt2Poster: Automatically Artistic Chinese Poster Creation from Prompt Only

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, MM2024 Poster, CC BY 4.0
Abstract: As a critical component of graphic design, artistic posters are widely used in the advertising and entertainment industries, so automatic poster creation from user-provided prompts has become increasingly desirable. Although existing Text2Image methods create impressive images aligned with given prompts, they fail to generate ideal artistic posters, especially posters with Chinese text. To create artistic Chinese posters with an aligned background, a reasonable layout, and stylized graphical text from prompts alone, we propose an automatic poster creation framework named Prompt2Poster. Our framework first uses a Large Language Model (LLM) to extract the user's intention from the provided prompt and generate an aligned background. For harmonious layout and graphical text generation, we propose the Controllable Layout Generator (CLG) and Graphical Text Generator (GTG) modules, both of which leverage multi-modal information, leading to accurate and visually pleasing results. Comprehensive experiments demonstrate that Prompt2Poster outperforms existing poster creation methods, especially in text quality and visual harmony. Our code will be released after the paper review.
Primary Subject Area: [Generation] Generative Multimedia
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: This work contributes to multimedia/multimodal processing in several ways. First, it proposes a novel poster creation framework, Prompt2Poster, that integrates linguistic, visual, and geometrical information. This integration ensures the efficient use of diverse modalities, thereby addressing the challenge of data distribution mixing in multimedia content generation. Second, the work introduces two innovative modules: the Controllable Layout Generator (CLG) and the Graphical Text Generator (GTG). These modules leverage multimodal information to generate accurate and stylistically diverse graphical text for Chinese posters. Through these contributions, the work advances multimodal processing for the visual information presentation task of automatic poster generation. The comprehensive experiments further demonstrate the effectiveness of this multimodal approach in creating prompt-guided artistic Chinese posters. Overall, this work pushes the boundaries of multimodal processing in multimedia content generation and offers a promising direction for future research.
Supplementary Material: zip
Submission Number: 4302