Keywords: Extreme multi-label classification, compression, label recovery
TL;DR: We train a neural network on compressed labels and recover the original labels during inference, reducing memory and computational costs.
Abstract: Training deep classifiers for Extreme Multi-Label Classification (XMC) is difficult due to the computational and memory costs caused by extremely large label sets. Traditionally, the final output layer of these deep classifiers scales linearly with the size of the label set, which is often in the millions for realistic settings. Reducing the size of deep classifiers for XMC is necessary to (i) train them more efficiently and (ii) deploy them within memory-constrained systems, such as mobile devices. We address the current limitations of deep classifiers by proposing a novel XMC method, DECLARE: Deep Extreme Compressed Labeling And Recovery Estimation. DECLARE compresses the labels into a lower-dimensional space, reducing both training time and model-storage size, while retaining enough information to recover the most likely predicted labels in the original label space at inference time. Empirically, DECLARE compresses labels by up to 99.975% while outperforming uncompressed baselines.
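The abstract does not specify how DECLARE compresses or recovers labels, so the following is only a minimal sketch of the general compress-train-recover pipeline it describes, assuming random-projection compression and correlation-based recovery (as in classic compressed-sensing multi-label work). All names, dimensions, and the recovery rule below are illustrative, not DECLARE's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

L = 50_000  # original label-space size (millions in real XMC settings)
m = 256     # compressed label dimension; here ~99.5% compression

# Random projection mapping L-dim binary label vectors to m dimensions.
# Columns have roughly unit norm due to the 1/sqrt(m) scaling.
P = rng.standard_normal((m, L)).astype(np.float32) / np.sqrt(m)

def compress(label_indices):
    """Compress a sparse binary label vector, given as its active indices."""
    # Equivalent to P @ y for a binary y: sum of the active columns of P.
    return P[:, label_indices].sum(axis=1)

def recover(z, k=5):
    """Recover the k most likely original labels from a compressed vector z
    by correlating z against every label's projection column."""
    scores = P.T @ z              # shape (L,): one score per original label
    return np.argsort(-scores)[:k]

# A classifier would be trained to predict z = compress(y) from features x;
# at inference, its m-dimensional output is decoded back to top-k labels.
true_labels = [3, 42, 49_999]
z = compress(true_labels)
print(sorted(recover(z, k=3)))    # with high probability: [3, 42, 49999]
```

The point of the sketch is the cost structure: the network's output layer is m-dimensional rather than L-dimensional, so its size no longer scales with the label set, and recovery happens only at inference.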
Submission Number: 22