DualCL: Principled Supervised Contrastive Learning as Mutual Information Maximization for Text Classification
Abstract: Text classification is a fundamental task in web content mining. Although the existing supervised contrastive learning (SCL) approach combined with pre-trained language models (PLMs) has achieved leading performance in text classification, it lacks a principled theoretical foundation. Motivated by a derived lower bound on mutual information maximization, we propose a dual contrastive learning framework, DualCL, that satisfies three properties: it is parameter-free, augmentation-easy, and label-aware. DualCL generates classifier parameters from the PLM and uses them simultaneously for classification and as augmented views of the input text for supervised contrastive learning. Extensive experiments demonstrate that DualCL learns superior text representations and consistently outperforms baseline models.
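To make the dual role of the generated classifier parameters concrete, here is a minimal PyTorch sketch of how the two supervised contrastive terms described in the abstract could be implemented. This is an illustration under assumptions, not the authors' released code: the function name `dual_cl_loss`, the temperature `tau`, and the tensor shapes (`z` as the per-text feature, `theta` as per-example classifier parameters produced by the PLM) are all hypothetical.

```python
import torch
import torch.nn.functional as F

def dual_cl_loss(z, theta, labels, tau=0.1):
    """Sketch of a dual supervised contrastive loss (assumed formulation).

    z:      (N, d)    text features, e.g. the [CLS] vector from the PLM
    theta:  (N, C, d) classifier parameters generated per example by the PLM
    labels: (N,)      gold class indices
    """
    N = z.size(0)
    z = F.normalize(z, dim=-1)
    theta = F.normalize(theta, dim=-1)

    idx = torch.arange(N, device=z.device)
    theta_y = theta[idx, labels]                       # (N, d): theta_i at the gold class y_i

    eye = torch.eye(N, dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye  # same-class pairs, self excluded

    def sup_nce(sim):
        # Supervised InfoNCE: average log-probability of same-class positives.
        sim = (sim / tau).masked_fill(eye, float('-inf'))      # drop self-pairs
        logp = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        n_pos = pos.sum(1)
        keep = n_pos > 0                                       # anchors with at least one positive
        return -((logp * pos).sum(1)[keep] / n_pos[keep]).mean()

    # L_z: anchor z_i, candidates theta_j^{y_i} (classifier rows act as label-aware views)
    sim_z = torch.einsum('id,jid->ij', z, theta[:, labels])
    # L_theta: anchor theta_i^{y_i}, candidates z_j
    sim_t = theta_y @ z.t()
    return sup_nce(sim_z) + sup_nce(sim_t)
```

In this sketch the same `theta` would also drive classification, e.g. `logits = torch.einsum('ncd,nd->nc', theta, z)` combined with a standard cross-entropy term, which is the sense in which the generated parameters serve both as a classifier and as augmented views.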