Abstract: Existing image editing models struggle to meet real-world demands; despite excelling on academic benchmarks,
they have yet to be adopted to solve real user needs.
The datasets that power these models use artificial edits,
lacking the scale and ecological validity necessary to address the true diversity of user requests. In response, we
introduce REALEDIT, a large-scale image editing dataset
with authentic user requests and human-made edits sourced
from Reddit. REALEDIT contains a test set of 9.3K examples that the community can use to evaluate models on real
user requests. Our results show that existing models fall
short on these tasks, implying a need for realistic training
data. To fill this gap, we introduce 48K training examples, on which
we train our REALEDIT model. Our model achieves substantial gains—outperforming competitors by up to 165 Elo
points in human judgment and 92% relative improvement
on the automated VIEScore metric on our test set. We deploy our model back on Reddit, testing it on new requests,
and receive positive feedback. Beyond image editing, we explore REALEDIT's potential for detecting edited images by
partnering with a deepfake detection non-profit. Finetuning
their model on REALEDIT data improves its F1-score by
14 percentage points, underscoring the dataset’s value for
broad, impactful applications.