Abstract: Advancements in generative models have promoted text- and image-based multi-context image generation. Brain signals, offering a direct representation of user intent, present new opportunities for image customization. However, using them poses challenges in brain interpretation, cross-modal context fusion, and context retention. In this paper, we present MindCustomer to explore the blending of visual brain signals in multi-context image generation. We first design shared neural data augmentation for stable cross-subject brain embedding by introducing the Image-Brain Translator (IBT) to generate brain responses from visual images. Then, we propose an effective cross-modal information fusion pipeline that adapts distinct semantics from image and brain contexts within a diffusion model without requiring masks. It resolves semantic conflicts for context preservation and enables harmonious context integration. Within the fusion pipeline, we further utilize the IBT to transfer image context to the brain representation to mitigate the cross-modal disparity. MindCustomer enables cross-subject generation, delivering unified, high-quality, and natural image outputs. Moreover, it exhibits strong generalization to new subjects via few-shot learning, indicating the potential for practical application. As the first work on multi-context blending with brain signals, MindCustomer provides a foundational exploration and inspiration for future brain-controlled generative technologies.
Lay Summary: Imagine creating an image just by thinking about it. Brain signals, which reflect what we see or imagine, could make this possible. But using brain activity to guide image generation is extremely difficult — brain data is noisy, varies from person to person, and doesn’t easily match up with visual information like pictures or text.
Our research introduces MindCustomer, a system that blends brain signals with other cues (like text or images) to create customized images. First, we use a tool that helps the system interpret brain patterns more consistently across different people. Then, we combine brain signals and visual inputs in a way that avoids conflicts, so the final image reflects both sources naturally.
MindCustomer generates high-quality, personalized images—even for new users with very little training data. As the first tool to fuse brain signals into multi-context image creation, it opens the door to brain-driven creative tools and future technologies that respond directly to human thoughts.
Primary Area: Applications->Neuroscience, Cognitive Science
Keywords: Human-Computer Interaction, Brain-Computer Interface, Generative AI
Submission Number: 5399