Exploring the Limits of Semantic Image Compression at Micro-bits per Pixel

Published: 19 Mar 2024, Last Modified: 08 Apr 2024
Venue: Tiny Papers @ ICLR 2024 Notable
License: CC BY 4.0
Keywords: Semantic Compression Image Textual Coding
Abstract: Traditional methods, such as JPEG, perform image compression by operating on structural information, such as pixel values or frequency content. These methods are effective at bitrates of around one bit per pixel (bpp) and higher at standard image sizes. To compress further, text-based semantic compression directly stores concepts and their relationships using natural language, which has evolved with humans to efficiently represent these salient concepts. These methods can operate at extremely low bitrates by disregarding structural information like location, size, and orientation. In this work, we use GPT-4V and DALL-E3 from OpenAI to explore the quality-compression frontier for image compression and identify the limitations of current technology. We push semantic compression as low as 100 μbpp (up to 10,000× smaller than JPEG) by introducing an iterative reflection process to improve the decoded image. We further hypothesize that this 100 μbpp level represents a soft limit on semantic compression at standard image resolutions.
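To make the bitrates in the abstract concrete, the arithmetic can be sketched as below. This is only an illustration: the 1024×1024 resolution, the example caption, and the helper names are assumptions introduced here, not values taken from the paper. The caption is counted as raw UTF-8 bytes with no further entropy coding.

```python
def bits_per_pixel(payload_bytes: float, width: int, height: int) -> float:
    """Bitrate of a compressed payload, in bits per pixel."""
    return payload_bytes * 8 / (width * height)


def payload_bytes_at(bpp: float, width: int, height: int) -> float:
    """Payload size in bytes implied by a given bitrate and resolution."""
    return bpp * width * height / 8


# Assumed "standard" resolution for this illustration (not stated in the abstract).
WIDTH, HEIGHT = 1024, 1024

JPEG_BPP = 1.0          # "around one bit per pixel", per the abstract
SEMANTIC_BPP = 100e-6   # 100 micro-bits per pixel, per the abstract

jpeg_bytes = payload_bytes_at(JPEG_BPP, WIDTH, HEIGHT)
semantic_bytes = payload_bytes_at(SEMANTIC_BPP, WIDTH, HEIGHT)
ratio = JPEG_BPP / SEMANTIC_BPP

# Hypothetical caption, stored as uncompressed UTF-8 text.
caption = "a dog on sand"
caption_bpp = bits_per_pixel(len(caption.encode("utf-8")), WIDTH, HEIGHT)

print(f"JPEG at 1 bpp:        {jpeg_bytes:.0f} bytes")
print(f"Semantic at 100 ubpp: {semantic_bytes:.1f} bytes")
print(f"Size ratio:           {ratio:.0f}x")
print(f"Caption bitrate:      {caption_bpp * 1e6:.1f} ubpp")
```

Under these assumptions, 100 μbpp leaves a budget of roughly 13 bytes for the whole image, which is about the length of a very short phrase. This is consistent with the abstract's "up to 10,000×" figure when JPEG is taken at 1 bpp.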
Supplementary Material: zip
Submission Number: 149