Integrated Global-Local Gaussian Attention for Image Compression

Atefeh Khoshkhahtinat, Piyush M. Mehta

Published: 2025 · ICASSP 2025 · CC BY-SA 4.0
Abstract: Transformers have demonstrated outstanding performance in learned image compression (LIC) due to their high capacity for modeling complex dependencies. However, existing methods employ window-based attention mechanisms to reduce computational overhead, which limits the transformer's ability to model long-range dependencies. In this study, we present a novel transformer block designed to exploit global spatial information efficiently while keeping computational costs low. Our transformer also incorporates a local attention module that uses Gaussian positional encoding to enhance spatial awareness. Together, these improvements yield less redundant latent representations and improve compression efficiency. In addition to enhancing the decorrelating capability of the transformation component, we propose a novel channel-wise entropy model that effectively exploits channel dependencies, thereby providing a more accurate estimate of the latent distribution. Experimental results show that our framework surpasses both conventional and neural codecs.
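
The abstract does not specify how the Gaussian positional encoding enters the local attention module. As a rough illustration of the general idea, the PyTorch sketch below adds a learnable Gaussian bias, computed from the spatial distance between token positions, to window-attention logits. All names (GaussianWindowAttention, log_sigma, etc.) are hypothetical and this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class GaussianWindowAttention(nn.Module):
    """Hypothetical sketch: window attention whose logits receive an
    additive Gaussian bias based on inter-token spatial distance."""

    def __init__(self, dim, window_size, num_heads):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # Learnable per-head bandwidth of the Gaussian prior (assumption).
        self.log_sigma = nn.Parameter(torch.zeros(num_heads))

        # Precompute squared Euclidean distances between all positions
        # inside one window.
        coords = torch.stack(torch.meshgrid(
            torch.arange(window_size), torch.arange(window_size),
            indexing="ij"), dim=-1).reshape(-1, 2).float()
        self.register_buffer("dist2", torch.cdist(coords, coords) ** 2)

    def forward(self, x):
        # x: (B, N, C) tokens of one window, N = window_size ** 2
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)            # each (B, H, N, d)

        attn = (q @ k.transpose(-2, -1)) * self.scale   # (B, H, N, N)
        # Gaussian positional bias: nearby tokens get a smaller penalty,
        # which makes the attention pattern spatially aware.
        sigma = self.log_sigma.exp().view(1, -1, 1, 1)
        attn = (attn - self.dist2 / (2 * sigma ** 2)).softmax(dim=-1)

        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)
```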
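Similarly, the channel-wise entropy model is only described at a high level. A minimal sketch of the generic technique, in the spirit of Minnen & Singh's channel-autoregressive entropy models (2020), is shown below: the latent is split into channel slices, and the Gaussian parameters of each slice are predicted from hyperprior features plus all previously decoded slices. Again, all names and layer widths are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelwiseEntropyModel(nn.Module):
    """Hypothetical sketch of a channel-conditional entropy model: each
    channel slice is modeled as Gaussian with parameters conditioned on
    the hyperprior and on earlier slices."""

    def __init__(self, latent_channels=192, num_slices=4, hyper_channels=192):
        super().__init__()
        self.num_slices = num_slices
        slice_ch = latent_channels // num_slices
        self.param_nets = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(hyper_channels + i * slice_ch, 224, 3, padding=1),
                nn.GELU(),
                nn.Conv2d(224, 2 * slice_ch, 3, padding=1),  # mean, scale
            )
            for i in range(num_slices)
        ])

    def forward(self, y, hyper_feat):
        # y: (B, C, H, W) latent; hyper_feat: (B, hyper_channels, H, W)
        slices = y.chunk(self.num_slices, dim=1)
        decoded, means, scales = [], [], []
        for i, y_i in enumerate(slices):
            # Condition on hyperprior plus every already-processed slice.
            ctx = torch.cat([hyper_feat] + decoded, dim=1)
            mu, sigma = self.param_nets[i](ctx).chunk(2, dim=1)
            means.append(mu)
            scales.append(F.softplus(sigma))  # keep scales positive
            decoded.append(y_i)  # at train time the true slice is the context
        return torch.cat(means, dim=1), torch.cat(scales, dim=1)
```

The returned per-element means and scales would feed a conditional entropy coder; exploiting cross-channel dependencies this way is what allows a tighter estimate of the latent distribution and hence a lower bitrate.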