End-to-End RGB-IR Joint Image Compression with Channel-wise Cross-modality Entropy Model

wang haofeng

Published: 30 Oct 2024, Last Modified: 31 Oct 2024, OpenReview Archive Direct Upload, Everyone, CC BY-NC-ND 4.0

Abstract: RGB-IR image pairs are frequently used together in applications such as intelligent surveillance. However, using two modalities instead of one doubles the required data storage and transmission costs. Therefore, efficient RGB-IR data compression is essential. This work proposes a joint compression framework for RGB-IR image pairs. Specifically, to fully utilize cross-modality prior information for accurate context probability modeling within and between modalities, we propose a Channel-wise Cross-modality Entropy Model (CCEM). Within CCEM, a Low-frequency Context Extraction Block (LCEB) and a Low-frequency Context Fusion Block (LCFB) are designed to extract and aggregate the global low-frequency information from both modalities, which helps the model predict entropy parameters more accurately. Experimental results demonstrate that our approach outperforms existing single-modality image compression methods on the LLVIP dataset. Compared to MLIC++, the best-performing image codec on the Kodak dataset, our proposed framework achieves a bit rate saving of 14.6% for RGB-IR pairs.
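
To illustrate why more accurate entropy parameters translate into fewer bits, the following toy sketch estimates the bit cost of quantized latents under a discretized Gaussian model, once with a fixed prior and once with a mean predicted from a correlated second signal. It is not the CCEM architecture: the discretized Gaussian likelihood is a common convention in learned image compression that is assumed here rather than taken from the abstract, and all names and values (`shared`, `ir_latent`, the noise levels) are hypothetical synthetic data.

```python
import math
import random


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def estimated_bits(symbols, means, scales, eps=1e-9):
    """Sum of -log2 P(y), with P(y) the Gaussian mass on [y - 0.5, y + 0.5]."""
    total = 0.0
    for y, mu, sigma in zip(symbols, means, scales):
        p = gaussian_cdf(y + 0.5, mu, sigma) - gaussian_cdf(y - 0.5, mu, sigma)
        total += -math.log2(max(p, eps))  # clamp to avoid log(0)
    return total


random.seed(0)
n = 1000

# Hypothetical structure shared by both modalities (stand-in for context).
shared = [random.gauss(0.0, 4.0) for _ in range(n)]
# Hypothetical quantized IR latent: shared structure plus modality noise.
ir_latent = [round(s + random.gauss(0.0, 1.0)) for s in shared]

# No cross-modality context: zero mean and a wide scale for every symbol.
bits_independent = estimated_bits(ir_latent, [0.0] * n, [math.sqrt(17.0)] * n)

# With context: the mean comes from the shared signal, so the scale is narrow.
bits_conditioned = estimated_bits(ir_latent, shared, [1.0] * n)

saving = 1.0 - bits_conditioned / bits_independent
print(f"independent: {bits_independent:.0f} bits")
print(f"conditioned: {bits_conditioned:.0f} bits")
print(f"relative saving: {saving:.1%}")
```

The saving printed by this sketch reflects only the synthetic correlation chosen above and has no relation to the 14.6% figure reported for the proposed framework.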