MtArtGPT: A Multi-Task Art Generation System With Pre-Trained Transformer

Cong Jin, Ruolin Zhu, Zixing Zhu, Lu Yang, Min Yang, Jiebo Luo

Published: 01 Aug 2024, Last Modified: 27 Nov 2025 · IEEE Transactions on Circuits and Systems for Video Technology · CC BY-SA 4.0
Abstract: Instruction-tuned large language models are making rapid advances in the field of artificial intelligence, where GPT-4 models have exhibited impressive multi-modal perception capabilities. Such models have been used as the core assistant for many tasks, including art generation. However, high-quality art generation relies heavily on human prompt engineering, which is generally uncontrollable. To address these issues, we propose a multi-task AI-generated content (AIGC) system for art generation. Specifically, a dense representation manager is designed to process multi-modal user queries and generate dense, applicable prompts for GPT. To enhance the artistic sophistication of the whole system, we fine-tune the GPT model on a meticulously collected prompt-art dataset. Furthermore, we introduce artistic benchmarks for evaluating the system based on professional knowledge. Experiments demonstrate the advantages of our proposed MtArtGPT system.