CoTuning: A Large-Small Model Collaborating Distillation Framework for Better Model Generalization

Zimo Liu; Kangjun Liu; Mingyue Guo; Shiliang Zhang; Yaowei Wang

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, Venue: MM2024 Poster, License: CC BY 4.0

Abstract: Model compression and distillation techniques have become essential for deploying deep learning models efficiently. However, existing methods often struggle with model generalization and with scalability when harnessing the expertise of pre-trained large models. This paper introduces CoTuning, a novel framework designed to enhance the generalization ability of neural networks by leveraging collaborative learning between large and small models. CoTuning overcomes the limitations of traditional compression and distillation techniques by introducing strategies for knowledge exchange and simultaneous optimization. Our framework comprises an adapter-based co-tuning mechanism between cloud and edge models, a scale-shift projection for feature alignment, and a novel collaborative knowledge distillation mechanism for domain-agnostic tasks. Extensive experiments conducted on various benchmark datasets demonstrate the effectiveness of CoTuning in improving model generalization while maintaining computational efficiency and scalability. The proposed framework represents a significant advancement in model compression and distillation, with broad implications for research on the collaborative evolution of large and small models.
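The abstract names a scale-shift projection and a knowledge distillation objective without giving formulas. As a rough illustration only, the sketch below assumes the projection is a per-channel affine map (gamma * x + beta) and substitutes the standard temperature-softened KL distillation loss for the paper's collaborative mechanism, which is not specified here. The function names `scale_shift`, `log_softmax`, and `kd_loss`, and the default temperature, are hypothetical and not taken from the paper.

```python
import math


def scale_shift(features, gamma, beta):
    """Assumed form of a scale-shift projection: per-channel gamma * x + beta."""
    return [g * f + b for f, g, b in zip(features, gamma, beta)]


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard KL(teacher || student) on softened outputs, scaled by T^2.

    This is the generic distillation loss, used here as a stand-in;
    it is not the collaborative objective proposed in the paper.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


# Align a small-model feature vector, then compare softened predictions.
aligned = scale_shift([1.0, 2.0], gamma=[2.0, 0.5], beta=[1.0, -1.0])
loss = kd_loss(student_logits=[0.0, 0.0, 0.0], teacher_logits=[1.0, 0.0, -1.0])
```

Under these assumptions, the loss is zero when student and teacher logits coincide and positive otherwise. In a trained system gamma and beta would be learned parameters rather than fixed values.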

Primary Subject Area: [Experience] Multimedia Applications

Secondary Subject Area: [Experience] Multimedia Applications

Relevance To Conference: Multimedia applications frequently run on devices with restricted computational capabilities, such as smartphones, tablets, or embedded systems. Model compression techniques make it possible to deploy deep learning models with reduced memory usage and computational requirements, making them better suited to resource-limited environments. Our method introduces a model compression technique with strong generalization capabilities, promising significant advantages in such contexts.

Submission Number: 4095
