High-performance Visual Semantics Compression for AI-Driven Science

Boyuan Zhang, Luanzheng Guo, Jiannan Tian, Jinyang Liu, Daoce Wang, Fanjiang Ye, Chengming Zhang, Jan Strube, Nathan R. Tallent, Dingwen Tao

Published: 01 Jan 2025, Last Modified: 05 Mar 2025, PPoPP 2025, CC BY-SA 4.0

Abstract: Scientific images play a crucial role in many experimental sciences; however, the large volumes of data generated present significant challenges. Effective image compression must be fast, achieve high compression ratios, and preserve critical domain-specific features. Existing compressors, such as JPEG and SZ, often distort important textures when operating at high compression ratios. Conversely, AI-based compressors offer superior image quality and higher compression ratios but are significantly slower than traditional methods. To address this trade-off, we developed ViSemZ, a high-performance AI-based compressor specifically designed to preserve visual semantics. Our approach enhances AI compression by integrating sparse encoding with variable-length integer truncation, optimized lossless encoding using bitshuffle and a decoupled lookback prefix-sum, and pipelining techniques to enable efficient data streaming and asynchronous processing. Evaluations on scientific datasets demonstrate that, at comparable compression ratios, ViSemZ achieves performance almost on par with existing AI-based compressors while delivering a 9.6× overall compression speedup. These results effectively bridge the performance gap between traditional and AI-based compression methods.
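To make two of the named building blocks concrete, the following is a minimal CPU sketch in Python, not the ViSemZ implementation. It assumes that "sparse encoding" means keeping only the nonzero integers plus a flag per position, that "variable-length integer truncation" means storing every kept value in the fewest whole bytes that fit the largest one, and that bitshuffle is a plain bit-plane transpose. The prefix sum here is an ordinary sequential exclusive scan; the decoupled-lookback variant named in the abstract is a single-pass GPU formulation of the same operation and is not reproduced. All function names are hypothetical.

```python
from itertools import accumulate

def exclusive_scan(flags):
    # out[i] = number of set flags strictly before position i
    return list(accumulate(flags, initial=0))[:-1]

def zigzag(v):
    # Map signed ints to unsigned so small magnitudes stay small.
    return 2 * v if v >= 0 else -2 * v - 1

def unzigzag(z):
    return z // 2 if z % 2 == 0 else -(z + 1) // 2

def sparse_encode(values):
    # Assumption: zeros are dropped and a 0/1 flag per position is kept.
    flags = [int(v != 0) for v in values]
    offsets = exclusive_scan(flags)
    packed = [0] * sum(flags)
    for i, v in enumerate(values):
        if flags[i]:
            packed[offsets[i]] = zigzag(v)  # scatter to its compacted slot
    # Assumption: truncate to the fewest whole bytes that hold the largest value.
    width = max(1, (max(packed, default=0).bit_length() + 7) // 8)
    return flags, width, packed

def sparse_decode(flags, packed):
    offsets = exclusive_scan(flags)
    return [unzigzag(packed[offsets[i]]) if f else 0
            for i, f in enumerate(flags)]

def bitshuffle(words, nbits):
    # Bit i of planes[b] is bit b of words[i] (a bit-plane transpose).
    planes = []
    for b in range(nbits):
        plane = 0
        for i, w in enumerate(words):
            plane |= ((w >> b) & 1) << i
        planes.append(plane)
    return planes

def bitunshuffle(planes, count):
    return [sum(((planes[b] >> i) & 1) << b for b in range(len(planes)))
            for i in range(count)]

# Round trip on a small, mostly-zero integer array (illustrative data only).
values = [0, 0, 5, 0, -3, 0, 0, 300, 0]
flags, width, packed = sparse_encode(values)
planes = bitshuffle(packed, 8 * width)
restored = sparse_decode(flags, bitunshuffle(planes, len(packed)))
assert restored == values
```

The exclusive scan gives each surviving value its output slot, which is why a fast parallel scan matters once the compaction runs on a GPU. After the transpose, the high-order bit planes of small values are all zero, which is the property that makes a later lossless stage more effective.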