Composable Image Coding for Machine via Task-oriented Internal Adaptor and External Prior

Published: 2023, Last Modified: 04 Nov 2024, Venue: VCIP 2023, License: CC BY-SA 4.0
Abstract: Traditional image coding standards are typically optimized for human perception, which conflicts with the fact that most images are now analyzed by machines. To support a variety of downstream intelligent tasks, contemporary approaches either compress images with traditional codecs and then use the compressed images for task analysis, or develop a unified feature compression paradigm with deep learning techniques. However, these approaches may suffer from accumulated errors and poor compatibility and generalization, owing to the conflict between standardized codecs and diverse machine tasks. We argue that a favorable image coding for machine (ICM) framework should adapt efficiently to new tasks and take the ultimate task goals into account. To this end, we propose a composable ICM solution dubbed Com-ICM, which injects plug-and-play lightweight internal adaptors into the codec architecture for efficient task transfer, and leverages off-the-shelf (large) models to provide external prior information for further task-oriented semantics learning. The internal adaptors (from the architectural aspect) and external priors (from the precondition aspect) complement each other, resulting in a mutually beneficial effect. We evaluate Com-ICM on diverse vision benchmarks, including image classification, object detection, and semantic segmentation, demonstrating its effectiveness and superiority. We are also actively submitting Com-ICM as a technical proposal to the international organization for standardization.
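
The abstract does not specify how the internal adaptors are built, so the following is only an illustrative sketch of the general plug-and-play idea, not the Com-ICM design. It assumes a generic residual bottleneck adaptor placed after a frozen codec layer, with one adaptor per task. The names `BottleneckAdaptor`, `frozen_codec_layer`, and `adapted_layer`, the dimensions, and the zero initialization are all hypothetical choices made for illustration.

```python
import random


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


class BottleneckAdaptor:
    """Hypothetical residual adaptor: x + W_up @ relu(W_down @ x)."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        # Down-projection (hidden x dim) with small random weights.
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                       for _ in range(hidden)]
        # Up-projection (dim x hidden) starts at zero, so the adaptor is
        # initially an identity mapping (an assumption, not from the paper).
        self.w_up = [[0.0] * hidden for _ in range(dim)]

    def __call__(self, x):
        h = [max(0.0, v) for v in matvec(self.w_down, x)]
        return [xi + di for xi, di in zip(x, matvec(self.w_up, h))]

    def num_params(self):
        # hidden*dim weights going down plus dim*hidden weights going up.
        return 2 * len(self.w_down) * len(self.w_up)


def frozen_codec_layer(x):
    """Stand-in for a pretrained codec layer whose weights stay fixed."""
    return [2.0 * v + 1.0 for v in x]


def adapted_layer(x, adaptor):
    """Apply the frozen layer, then the task-specific adaptor."""
    return adaptor(frozen_codec_layer(x))


# One lightweight adaptor per downstream task; the codec layer is shared.
adaptors = {task: BottleneckAdaptor(dim=8, hidden=2)
            for task in ("classification", "detection", "segmentation")}
features = [0.1 * i for i in range(8)]
out = adapted_layer(features, adaptors["detection"])
```

In this sketch, switching tasks means swapping which adaptor is applied while the shared codec layer stays untouched, and only the small adaptor weights would be trained per task.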