Neural Image Compression with Multi-Scale Depthwise Separable Dilated Convolution and Multi-Distribution Mixture Entropy Model

Published: 2025, Last Modified: 06 Nov 2025, Venue: DCC 2025, License: CC BY-SA 4.0
Abstract: Neural image compression (NIC) has made remarkable progress in recent years. Its two key components are the encoder-decoder and the entropy model. For the encoder-decoder, a larger effective receptive field (ERF) gives a stronger transformation ability, but existing methods usually enlarge the ERF at the cost of a prohibitive increase in complexity. To address this issue, we propose a multi-scale depthwise separable dilated convolution (MSDSDC) for building the encoder-decoder. Specifically, we first construct a depthwise separable dilated convolution (DSDC) by applying the depthwise separable strategy to dilated convolution, which reduces its complexity. We then fuse the multi-scale features extracted by three DSDCs with different dilation rates to expand the ERF of the encoder-decoder and thereby enhance its transformation capability. In addition, we design a multi-distribution mixture entropy model (MDMEM) to make probability modeling of the latent representation more flexible. Experimental results demonstrate that the proposed method achieves the best balance between rate-distortion performance and complexity.
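The complexity argument behind DSDC can be illustrated with a little arithmetic. The sketch below is not the authors' implementation. It only counts weights for three parallel dilated branches, once as standard convolutions and once as depthwise separable ones, and reports the spatial span each dilated kernel covers. The channel width (192), kernel size (3), and dilation rates (1, 2, 3) are hypothetical values chosen for illustration, since the abstract does not state them, and the cost of the fusion step is ignored.

```python
def dilated_kernel_span(kernel_size: int, dilation: int) -> int:
    """Spatial extent (in pixels, along one axis) covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def standard_conv_params(c_in: int, c_out: int, kernel_size: int) -> int:
    """Weights in a standard 2D convolution (bias excluded).

    Dilation spreads the taps apart but does not add weights,
    so the count is the same for any dilation rate.
    """
    return c_in * c_out * kernel_size * kernel_size


def depthwise_separable_params(c_in: int, c_out: int, kernel_size: int) -> int:
    """Weights in a depthwise k x k conv followed by a pointwise 1 x 1 conv."""
    depthwise = c_in * kernel_size * kernel_size  # one k x k filter per input channel
    pointwise = c_in * c_out                      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical settings, NOT taken from the paper.
channels, k = 192, 3
dilations = (1, 2, 3)

spans = [dilated_kernel_span(k, d) for d in dilations]
standard_total = len(dilations) * standard_conv_params(channels, channels, k)
separable_total = len(dilations) * depthwise_separable_params(channels, channels, k)

print("kernel spans per branch:", spans)
print("standard dilated branches:", standard_total, "weights")
print("depthwise separable branches:", separable_total, "weights")
print("reduction factor: %.2f" % (standard_total / separable_total))
```

Under these assumed settings, the three branches cover spans of 3, 5, and 7 pixels with the same 3 x 3 kernel, while the depthwise separable variant needs roughly 8.6 times fewer weights than standard dilated convolutions of the same width.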