Abstract: Highlights
• We propose a novel point-driven framework for inference-time image editing.
• Our method edits objects from a few points, automatically identifying visual and semantic information.
• We introduce a refined diffusion strategy to mitigate visual and semantic mismatches during editing.
• Extensive experiments on single and multi-object images demonstrate our approach’s efficacy.
External IDs: dblp:journals/pr/WangYYZWCH26