MuralAgent: Enhancing Ancient Mural Outpainting with RAG-Based Texts and Multimodal Integration

Published: 2025, Last Modified: 12 Nov 2025; ACM Trans. Multim. Comput. Commun. Appl. 2025; CC BY-SA 4.0
Abstract: In the digital age, applying cutting-edge technology to the digitization and creative expansion of ancient murals is crucial for preserving and passing on cultural heritage. Existing image outpainting techniques, however, lack semantic guidance. This article introduces MuralAgent, a multimodal model based on Retrieval-Augmented Generation (RAG). It extracts key information from mural images and integrates it with a purpose-built knowledge base of ancient texts to keep the expanded images culturally and semantically consistent. In addition, fine-tuning the Stable Diffusion model preserves the style fidelity of the generated images. Specifically, this study constructs an ancient texts knowledge base for accurate matching, designs specific prompts for GPT-4V(ision) to extract key information, and expands artworks through Stable Diffusion, providing a novel way for the public to reinterpret ancient murals.
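The retrieval step described in the abstract (matching key information extracted from a mural against a knowledge base of ancient texts) can be illustrated with a minimal sketch. This is not the authors' implementation: the knowledge base entries, the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), and the bag-of-words cosine similarity are all assumptions chosen for illustration. A real system would more likely use learned embeddings, and the key information would come from a vision-language model rather than a hand-written string.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base; these passages are illustrative only
# and are not taken from the paper's ancient texts corpus.
KNOWLEDGE_BASE = [
    "Flying apsaras carry lotus flowers and ribbons across the sky.",
    "Musicians play the pipa and flute at a court banquet.",
    "Merchants lead camels along a desert trade route.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, top_k=1):
    """Return the top_k passages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(key_info, passages):
    """Combine extracted key information with retrieved context
    into a text prompt for a downstream image generator (assumed format)."""
    context = " ".join(retrieve(key_info, passages))
    return f"{key_info}. Context: {context}"
```

Calling `build_prompt("apsaras with ribbons and lotus", KNOWLEDGE_BASE)` would append the apsaras passage as context, since it shares the most terms with the query. The resulting string stands in for the semantically grounded text prompt that would guide outpainting.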