MEDO: Minimizing Effective Distortions Only for Machine-Oriented Visual Feature Compression

Published: 01 Jan 2023, Last Modified: 19 May 2025, VCIP 2023, CC BY-SA 4.0
Abstract: In search of efficient feature compression technologies for machine consumption, MPEG recently issued a call for proposals (CfP) on feature compression for video coding for machines (FCVCM). One issue in feature compression is that the input feature maps are generally highly redundant, and various studies have sought to reduce this redundancy. For example, a recent method called L-MSFC (learnable multi-scale feature compression), which combines multi-scale feature fusion and compression in an end-to-end learnable framework, showed up to 98% BD-rate gain over the anchor model defined in the FCVCM CfP. Despite these advances in FCVCM, the relationship between distortions in feature maps and the performance of vision tasks has remained relatively unexplored. In this paper, we propose a novel loss function called MEDO (minimizing effective distortions only), based on our hypothesis that reducing distortions that are already below some threshold does not improve task performance. Experimental results on an instance segmentation task show that our MEDO loss on top of L-MSFC improves the overall rate-mAP performance without increasing complexity. To make the approach more practical for real-world use, we also present an extension to L-MSFC that supports variable rates with a single model.
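The abstract does not give the form of the MEDO loss. Purely to illustrate the stated hypothesis, the sketch below shows one way a distortion term could ignore errors below a threshold. The function name `thresholded_distortion`, the parameter `tau`, and the soft-threshold squared-error form are assumptions made for this example, not the authors' definition.

```python
def thresholded_distortion(original, reconstructed, tau):
    """Mean squared error over the part of each error that exceeds tau.

    Hypothetical illustration only: element-wise errors with magnitude
    at or below `tau` contribute zero, so an optimizer using this term
    gains nothing by shrinking them further.
    """
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    if not original:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for x, y in zip(original, reconstructed):
        # Only the portion of the error above the threshold is penalized.
        excess = max(abs(x - y) - tau, 0.0)
        total += excess ** 2
    return total / len(original)


if __name__ == "__main__":
    features = [0.0, 0.0, 0.0, 0.0]        # stand-in for a flattened feature map
    decoded = [0.05, -0.1, 0.3, -0.5]      # stand-in for its reconstruction
    print(thresholded_distortion(features, decoded, tau=0.1))
    print(thresholded_distortion(features, decoded, tau=0.0))  # plain MSE
```

With `tau=0.0` the term reduces to ordinary mean squared error, so the threshold is the only thing that distinguishes this sketch from a conventional distortion loss.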