Collaboration of Experts: Achieving 80% Top-1 Accuracy on ImageNet with 100M FLOPs

Published: 28 Jan 2022, Last Modified: 13 Feb 2023 · ICLR 2022 Submitted · Readers: Everyone
Abstract: In this paper, we propose a Collaboration of Experts (CoE) framework to pool together the expertise of multiple networks towards a common aim. Each expert is an individual network with expertise on a unique portion of the dataset, which enhances the collective capacity. Given a sample, an expert is selected by the delegator, which simultaneously outputs a rough prediction to support early termination. To make each model in CoE play its role, we propose a novel training algorithm that consists of three components: a weight generation module (WGM), a label generation module (LGM), and a selection reweighting module (SRM). Our method achieves state-of-the-art performance on ImageNet: 80.7% top-1 accuracy with 194M FLOPs. Combined with the PWLU activation function and CondConv, CoE reaches 80.0% accuracy with only 100M FLOPs for the first time. More importantly, CoE is hardware-friendly, achieving a 3-6x speedup over some existing conditional computation approaches. Experimental results on a translation task also demonstrate the strong generalizability of CoE.
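To make the delegator-plus-expert flow concrete, below is a minimal PyTorch sketch of CoE-style inference. The module names, the confidence-based termination rule, and the threshold value are illustrative assumptions, not the authors' implementation; the paper's training components (WGM, LGM, SRM) are not shown.

```python
import torch
import torch.nn as nn

class CoEInference(nn.Module):
    """Sketch of CoE-style inference: a small delegator emits a rough
    prediction plus expert-selection scores; confident samples terminate
    early, the rest are routed to exactly one expert each.
    All names and the threshold are hypothetical, for illustration only."""

    def __init__(self, delegator: nn.Module, experts: nn.ModuleList,
                 threshold: float = 0.9):
        super().__init__()
        self.delegator = delegator  # returns (rough_logits, expert_scores)
        self.experts = experts      # one larger network per dataset portion
        self.threshold = threshold  # assumed early-termination confidence

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rough_logits, expert_scores = self.delegator(x)
        conf, _ = rough_logits.softmax(dim=-1).max(dim=-1)
        out = rough_logits.clone()
        # Only samples the delegator is unsure about reach an expert,
        # so the average per-image FLOPs stay low.
        needs_expert = conf < self.threshold
        if needs_expert.any():
            chosen = expert_scores[needs_expert].argmax(dim=-1)
            sub_x = x[needs_expert]
            sub_out = torch.empty_like(out[needs_expert])
            for i, expert in enumerate(self.experts):
                mask = chosen == i
                if mask.any():
                    sub_out[mask] = expert(sub_x[mask])
            out[needs_expert] = sub_out
        return out
```

Because each uncertain sample runs through a single dense expert rather than a weighted mixture, this routing pattern avoids the scatter/gather overhead of soft mixture-of-experts schemes, which is consistent with the hardware-friendliness claim in the abstract.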
One-sentence Summary: The paper presents a system called Collaboration of Experts (CoE) in which expert networks are encouraged to focus on unique portions of the dataset.