Pragmatic Image Compression for Human-in-the-Loop Decision-Making

Siddharth Reddy; Anca Dragan; Sergey Levine

Pragmatic Image Compression for Human-in-the-Loop Decision-Making

Siddharth Reddy, Anca Dragan, Sergey Levine

Published: 09 Nov 2021, Last Modified: 26 May 2025NeurIPS 2021 SpotlightReaders: Everyone

Keywords: image compression, human-in-the-loop learning, generative adversarial networks, sequential decision-making, pragmatics, generative models

TL;DR: We explore a new approach to lossy image compression that adapts compression to user behavior, optimizing reconstructions to be useful for downstream tasks instead of preserving visual appearance.

Abstract: Standard lossy image compression algorithms aim to preserve an image's appearance, while minimizing the number of bits needed to transmit it. However, the amount of information actually needed by the user for downstream tasks -- e.g., deciding which product to click on in a shopping website -- is likely much lower. To achieve this lower bitrate, we would ideally only transmit the visual features that drive user behavior, while discarding details irrelevant to the user's decisions. We approach this problem by training a compression model through human-in-the-loop learning as the user performs tasks with the compressed images. The key insight is to train the model to produce a compressed image that induces the user to take the same action that they would have taken had they seen the original image. To approximate the loss function for this model, we train a discriminator that tries to distinguish whether a user's action was taken in response to the compressed image or the original. We evaluate our method through experiments with human participants on four tasks: reading handwritten digits, verifying photos of faces, browsing an online shopping catalogue, and playing a car racing video game. The results show that our method learns to match the user's actions with and without compression at lower bitrates than baseline methods, and adapts the compression model to the user's behavior: it preserves the digit number and randomizes handwriting style in the digit reading task, preserves hats and eyeglasses while randomizing faces in the photo verification task, preserves the perceived price of an item while randomizing its color and background in the online shopping task, and preserves upcoming bends in the road in the car racing game.

Supplementary Material: zip

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

Code: https://github.com/rddy/pico

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/pragmatic-image-compression-for-human-in-the/code)

17 Replies

Loading