Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing

Published: 01 May 2025, Last Modified: 18 Jun 2025 | ICML 2025 poster | CC BY 4.0
Abstract: Instruction-based image editing, which aims to modify an image faithfully according to an instruction while leaving irrelevant content unchanged, has made significant progress. However, a comprehensive metric for assessing editing quality is still lacking. Existing metrics either rely on costly human evaluation, which hinders large-scale evaluation, or are adapted from other tasks and lose task-specific concerns. As a result, they fail to comprehensively evaluate both the modification requested by the instruction and the preservation of irrelevant regions, which leads to biased evaluation. To address this, we introduce a new metric, Balancing Preservation and Modification (BPM), tailored for instruction-based image editing, which explicitly disentangles the image into editing-relevant and irrelevant regions for separate consideration. We first identify and locate the editing-relevant regions, then assess editing quality in a two-tier process: a Region-Aware Judge evaluates whether the position and size of the edited region align with the instruction, and a Semantic-Aware Judge further assesses instruction compliance within the editing-relevant regions as well as content preservation within the irrelevant regions, yielding a comprehensive and interpretable quality assessment. Moreover, the editing-relevant region localization in BPM can be integrated into image editing approaches to improve editing quality, demonstrating its wide applicability. We verify the effectiveness of the BPM metric on comprehensive instruction-editing data, and the results show that BPM achieves the highest alignment with human evaluation among existing metrics, indicating its efficacy. The code is available at https://joyli-x.github.io/BPM/.
Lay Summary: Automatic AI image editing based on text instructions, such as “make the sky look cloudy”, has seen big improvements. But judging how well these edits are done is still a challenge. Current evaluation methods are either expensive, because they rely on human reviewers, or not tailored to this specific task, so they fail to evaluate editing quality comprehensively and do not make full use of the crucial information in the original image, the edited image, and the instruction. To address this, we introduce a new evaluation method. It breaks the image into parts that should be changed based on the instruction and parts that shouldn’t. Then, it checks whether the edits were done in the right place and whether the final result matches the instruction, while ensuring the untouched parts remain consistent. Our method, Balancing Preservation and Modification (BPM), gives a more accurate and interpretable way to assess edits, and it can even help improve the editing process itself. Tests show BPM matches human judgment better than existing tools, making it a valuable resource for building and evaluating smarter image editing systems.
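The split into "where the edit happened" and "what changed inside and outside that region" can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the pixel-difference change detector, the IoU region score, the equal-weight average, and the `compliance` callable (a stand-in for whatever semantic model judges instruction compliance) are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List

Image = List[List[float]]  # toy grayscale image, pixel values in [0, 1]
Mask = List[List[bool]]    # True marks an editing-relevant pixel


def changed_mask(original: Image, edited: Image, tol: float = 0.05) -> Mask:
    """Mark pixels whose value changed by more than tol (hypothetical detector)."""
    return [[abs(o - e) > tol for o, e in zip(row_o, row_e)]
            for row_o, row_e in zip(original, edited)]


def region_score(expected: Mask, actual: Mask) -> float:
    """IoU between the region that should change and the region that did."""
    inter = union = 0
    for row_exp, row_act in zip(expected, actual):
        for exp, act in zip(row_exp, row_act):
            inter += exp and act
            union += exp or act
    return inter / union if union else 1.0


def preservation_score(original: Image, edited: Image, relevant: Mask) -> float:
    """1 minus the mean absolute difference over editing-irrelevant pixels."""
    diffs = [abs(o - e)
             for row_o, row_e, row_m in zip(original, edited, relevant)
             for o, e, m in zip(row_o, row_e, row_m) if not m]
    return 1.0 - sum(diffs) / len(diffs) if diffs else 1.0


def toy_edit_score(original: Image, edited: Image, relevant: Mask,
                   compliance: Callable[[Image, Mask], float]) -> Dict[str, float]:
    """Combine the three toy scores with an (assumed) equal-weight average."""
    scores = {
        "region": region_score(relevant, changed_mask(original, edited)),
        # `compliance` is a placeholder for a semantic judge; it is assumed
        # to return a value in [0, 1] for the edited image and relevant mask.
        "modification": compliance(edited, relevant),
        "preservation": preservation_score(original, edited, relevant),
    }
    scores["overall"] = sum(scores.values()) / 3
    return scores
```

In this sketch, the region score falls when the edit lands outside the expected area or covers only part of it, and the preservation score falls when pixels outside the editing-relevant mask change. Reporting the three components separately, rather than only the average, is what keeps such a score interpretable.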
Link To Code: https://joyli-x.github.io/BPM/
Primary Area: Applications->Computer Vision
Keywords: Image Editing, Evaluation Metric
Submission Number: 3425