DualCL: Principled Supervised Contrastive Learning as Mutual Information Maximization for Text Classification
Abstract: Text classification is a fundamental task in web content mining. Although the existing supervised contrastive learning (SCL) approach combined with pre-trained language models (PLMs) has achieved leading performance in text classification, it lacks a principled theoretical foundation. Motivated by a derived lower bound on mutual information maximization, we propose a dual contrastive learning framework, DualCL, that satisfies three properties: it is parameter-free, augmentation-easy, and label-aware. DualCL generates classifier parameters from the PLM and uses them simultaneously for classification and as augmented views of the input text for supervised contrastive learning. Extensive experiments demonstrate that DualCL learns superior text representations and consistently outperforms baseline models.
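To make the dual role of the generated classifier parameters concrete, here is a minimal PyTorch sketch of how the two supervised contrastive terms described in the abstract could be implemented. This is an illustration under assumptions, not the authors' released code: the function name `dual_cl_loss`, the temperature `tau`, and the tensor shapes (`z` as the per-text feature, `theta` as per-example classifier parameters produced by the PLM) are all hypothetical.

```python
import torch
import torch.nn.functional as F

def dual_cl_loss(z, theta, labels, tau=0.1):
    """Sketch of a dual supervised contrastive loss (assumed formulation).

    z:      (N, d)    text features, e.g. the [CLS] vector from the PLM
    theta:  (N, C, d) classifier parameters generated per example by the PLM
    labels: (N,)      gold class indices
    """
    N = z.size(0)
    z = F.normalize(z, dim=-1)
    theta = F.normalize(theta, dim=-1)

    idx = torch.arange(N, device=z.device)
    theta_y = theta[idx, labels]                       # (N, d): theta_i at the gold class y_i

    eye = torch.eye(N, dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye  # same-class pairs, self excluded

    def sup_nce(sim):
        # Supervised InfoNCE: average log-probability of same-class positives.
        sim = (sim / tau).masked_fill(eye, float('-inf'))      # drop self-pairs
        logp = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        n_pos = pos.sum(1)
        keep = n_pos > 0                                       # anchors with at least one positive
        return -((logp * pos).sum(1)[keep] / n_pos[keep]).mean()

    # L_z: anchor z_i, candidates theta_j^{y_i} (classifier rows act as label-aware views)
    sim_z = torch.einsum('id,jid->ij', z, theta[:, labels])
    # L_theta: anchor theta_i^{y_i}, candidates z_j
    sim_t = theta_y @ z.t()
    return sup_nce(sim_z) + sup_nce(sim_t)
```

In this sketch the same `theta` would also drive classification, e.g. `logits = torch.einsum('ncd,nd->nc', theta, z)` combined with a standard cross-entropy term, which is the sense in which the generated parameters serve both as a classifier and as augmented views.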