Learning Realistic Sketching: A Dual-agent Reinforcement Learning Approach

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, Venue: MM2024 Poster, License: CC BY 4.0
Abstract: This paper presents a method for teaching computers to sketch by transforming input images into sequences of parameterized strokes. The task raises two challenges: weak stimuli during stroke decomposition, and the need to maintain semantic correctness, stylistic consistency, and detail integrity in the final drawings. To address weak stimuli, the method incorporates an attention agent that increases the algorithm's sensitivity to subtle canvas changes by focusing on smaller, magnified areas. To improve the perceived quality of the drawings, it integrates a sketching style feature extractor, which captures semantic information and performs style adaptation, together with a drawing agent that decomposes strokes under the guidance of the XDoG reward, thereby preserving sketch details. These two agents together form an efficient sketching model. Experimental results show that the approach outperforms state-of-the-art techniques in both visual quality and perceptual metrics, confirming its efficacy in achieving realistic sketching.
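To make the idea of an XDoG-guided reward concrete, the following is a minimal sketch, not the authors' implementation. It computes an XDoG (extended difference-of-Gaussians) edge map from a grayscale image and scores a stroke by how much it reduces the canvas's mean squared distance to that map. The parameter values (`sigma`, `k`, `p`, `eps`, `phi`), the helper names, and the reward form itself are assumptions made for illustration; the abstract does not specify how the reward is defined.

```python
import math


def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _clamp(v, n):
    """Clamp index v into [0, n - 1] (edge replication at the borders)."""
    return min(max(v, 0), n - 1)


def blur(img, sigma):
    """Separable Gaussian blur; img is a list of rows of floats in [0, 1]."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    # Horizontal pass, then vertical pass.
    rows = [[sum(k[j + r] * img[y][_clamp(x + j, w)] for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(k[j + r] * rows[_clamp(y + j, h)][x] for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]


def xdog(img, sigma=0.8, k=1.6, p=20.0, eps=0.5, phi=10.0):
    """XDoG edge map in [0, 1]: 1 is white background, values near 0 are lines."""
    g1, g2 = blur(img, sigma), blur(img, k * sigma)
    out = []
    for row1, row2 in zip(g1, g2):
        out_row = []
        for a, b in zip(row1, row2):
            s = (1 + p) * a - p * b  # sharpened difference of Gaussians
            # Soft threshold: white above eps, smooth falloff to black below.
            out_row.append(1.0 if s >= eps else 1.0 + math.tanh(phi * (s - eps)))
        out.append(out_row)
    return out


def mse(a, b):
    """Mean squared difference between two equally sized images."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def step_reward(canvas_before, canvas_after, target):
    """Assumed reward: reduction in distance to the XDoG target after a stroke."""
    return mse(canvas_before, target) - mse(canvas_after, target)


# Toy input: a 12x12 white image with a dark vertical bar in columns 5 and 6.
size = 12
image = [[0.0 if x in (5, 6) else 1.0 for x in range(size)] for _ in range(size)]
target = xdog(image)

blank = [[1.0] * size for _ in range(size)]
good = [[0.0 if x in (5, 6) else 1.0 for x in range(size)] for _ in range(size)]
bad = [[0.0 if x == 0 else 1.0 for x in range(size)] for _ in range(size)]

print(step_reward(blank, good, target))  # positive: stroke lands on the edge
print(step_reward(blank, bad, target))   # negative: stroke lands on background
```

Under this assumed reward, a stroke placed where the XDoG map is dark earns a positive score and a stroke on empty background is penalized, which is the kind of signal a drawing agent could be trained on. The attention agent described above would, in the same spirit, crop and magnify a sub-region so that small changes produce a larger difference in this score.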
Primary Subject Area: [Experience] Art and Culture
Secondary Subject Area: [Generation] Generative Multimedia
Relevance To Conference: This paper significantly advances computer-aided sketching technology by innovatively integrating reinforcement learning techniques. It addresses challenges like stroke decomposition and semantic fidelity, effectively bridging the gap between artificial intelligence and art. A major development is the introduction of an attention agent, meticulously designed to detect subtle changes on the canvas, thereby enhancing the accuracy and detail of the generated content. Additionally, the implementation of a sketching style feature extractor and a drawing agent, guided by the XDoG reward, ensures that the sketches produced are semantically accurate, stylistically consistent, and rich in detail. These advancements significantly boost the model’s capability to effectively process and interpret multimodal data, such as images and strokes. Overall, this research propels the field of multimedia processing forward by introducing a novel and effective method for transforming images into detailed stroke-based sketches, proving to be an invaluable tool for both artistic and technical applications.
Supplementary Material: zip
Submission Number: 1800