Adaptive Semantic Compression: Compatible Bitstream for Scalable Human-Machine Perception Sample Adaption

Shaokang Wang, Dingquan Li, Guoqing Xiang, Jinchang Xu, Shanghang Zhang, Xiaodong Xie

Published: 2025, Last Modified: 05 Mar 2026, ICME 2025, CC BY-SA 4.0

Abstract: With the development of visual analysis models, collaborative image compression for machine and human perception poses new challenges for algorithm optimization. Existing approaches achieve this goal through meticulously designed model structures and bitstream formats. However, these custom bitstream designs are incompatible with existing trained decoders, which hinders their practicality. In this paper, we propose the Adaptive Semantic Compression (ASC) framework, which fine-tunes a pre-trained codec on individual samples to obtain scalable bitstreams in an intuitive yet effective way. First, to improve efficiency for machine perception, we propose the Latent Semantic Contraction (LSC) method, which fine-tunes the latent code while preserving the machine task performance of the decoded image. Second, to further optimize human perception, we propose the Spatial-frequency Decoder Adaptation (SFDA) module. By compensating for distortion in the spatial and frequency domains, SFDA improves the human perceptual quality of the reconstructed image. The bitstreams produced by LSC and SFDA can be decoded by existing decoders to reconstruct images, thus fully exploiting the performance of the existing model. We implement our algorithm on different pre-trained compression models and verify its flexibility and compatibility on various test images. Experimental results show that the LSC module saves 24.97% to 29.10% of the bitrate while preserving machine perception performance. Furthermore, applying SFDA brings a 3.16% BD-Rate gain in terms of PSNR, and up to 15.69%, compared to LSC.
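
The idea of fine-tuning a latent code per sample while the decoder stays frozen can be illustrated with a toy sketch. This is not the ASC or LSC implementation: the linear decoder, the linear task head, the L2 rate proxy, and the names `DECODER`, `TASK_HEAD`, `rate_proxy`, and `refine_latent` are all hypothetical stand-ins chosen so the example runs in plain Python. It only shows the general pattern of lowering a rate term by gradient descent on the latent while penalizing drift in the task features of the decoded output.

```python
# Toy sketch: per-sample latent refinement with a frozen decoder.
# All components below are hypothetical stand-ins, not the paper's method.

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

DECODER = [[1.0, 0.0, 0.5],
           [0.0, 1.0, 0.5]]   # frozen "decoder": latent (3) -> image (2)
TASK_HEAD = [[1.0, -1.0]]     # frozen "task model": image (2) -> feature (1)

def task_features(latent):
    """Task features of the image decoded from this latent."""
    return matvec(TASK_HEAD, matvec(DECODER, latent))

def rate_proxy(latent):
    """Smooth stand-in for bitrate: half the squared L2 norm."""
    return 0.5 * sum(v * v for v in latent)

def refine_latent(latent, lam=10.0, lr=0.01, steps=500):
    """Minimize rate_proxy(y) + lam * ||task(y) - task(latent)||^2 over y."""
    target = task_features(latent)  # features to preserve
    y = list(latent)
    dec_t, head_t = transpose(DECODER), transpose(TASK_HEAD)
    for _ in range(steps):
        err = [f - t for f, t in zip(task_features(y), target)]
        # Gradient of the task penalty is 2 * lam * D^T H^T err;
        # gradient of the rate proxy is y itself.
        grad_task = matvec(dec_t, matvec(head_t, err))
        y = [v - lr * (v + 2.0 * lam * g) for v, g in zip(y, grad_task)]
    return y

original = [2.0, -1.0, 3.0]
refined = refine_latent(original)
print("rate proxy:", rate_proxy(original), "->", rate_proxy(refined))
print("task feature:", task_features(original), "->", task_features(refined))
```

Only the latent changes here, so anything that can decode the original latent can decode the refined one, which is the compatibility property the abstract emphasizes. The weight `lam` trades rate reduction against task-feature drift in this toy setting.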

External IDs: dblp:conf/icmcs/WangLXXZX25