Keywords: Graph-based loss, Language model fine-tuning, Label propagation (LPA), Semi-supervised learning (SSL), Text classification
Abstract: Traditional loss functions (cross-entropy, contrastive, triplet, and supervised contrastive) for fine-tuning pre-trained language models such as BERT operate on local neighborhoods, overlooking the holistic structure of semantic relationships. We therefore propose G-Loss, a novel graph-guided loss function that uses semi-supervised label propagation to leverage structural relationships in the embedding manifold. G-Loss constructs a document similarity graph to capture global semantic relationships, guiding the language model to learn more discriminative and robust embeddings. We evaluated G-Loss on five classification benchmark datasets: MR (sentiment), R8 and R52 (topic), Ohsumed (medical), and 20NG (news). G-Loss-tuned models closely match or outperform those trained with traditional losses, showing improvements of 0.02% to 1.06% in accuracy and 0.2% to 1.04% in macro F1-score across datasets. Additionally, G-Loss converges in fewer epochs than other losses, highlighting its training efficiency. These results demonstrate that G-Loss not only improves classification performance but also produces a semantically coherent embedding space for diverse text classification tasks.
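The abstract describes G-Loss as a graph-guided loss built on a document similarity graph and semi-supervised label propagation. The sketch below illustrates one plausible instantiation of that idea, not the paper's actual formulation: a cosine-similarity graph over batch embeddings, Zhou-style iterative label propagation from the labeled documents, and a cross-entropy term between model predictions and the propagated soft labels. The function name `g_loss_sketch`, the kernel choice, and hyperparameters `sigma`, `alpha`, and `steps` are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def g_loss_sketch(embeddings, logits, labels, labeled_mask,
                  num_classes, sigma=1.0, alpha=0.99, steps=10):
    """Hypothetical graph-guided loss: build a similarity graph over the
    batch embeddings, propagate labels from the labeled documents, and
    penalize divergence between propagated targets and model predictions.
    The kernel, propagation rule, and hyperparameters are assumptions,
    not the paper's definition of G-Loss."""
    # Cosine-similarity graph with a Gaussian-style kernel, no self-loops.
    z = F.normalize(embeddings, dim=1)
    W = torch.exp((z @ z.t()) / sigma)
    W.fill_diagonal_(0)

    # Symmetrically normalized adjacency S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = W.sum(1).clamp(min=1e-8).pow(-0.5)
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # Seed matrix Y: one-hot rows for labeled documents, zeros otherwise.
    Y = torch.zeros(z.size(0), num_classes, device=z.device)
    Y[labeled_mask] = F.one_hot(labels[labeled_mask], num_classes).float()

    # Iterative label propagation: F <- alpha * S F + (1 - alpha) * Y.
    Fprop = Y.clone()
    for _ in range(steps):
        Fprop = alpha * (S @ Fprop) + (1 - alpha) * Y
    targets = Fprop / Fprop.sum(1, keepdim=True).clamp(min=1e-8)

    # Cross-entropy between model predictions and propagated soft labels.
    return -(targets * F.log_softmax(logits, dim=1)).sum(1).mean()
```

In such a setup, the propagated targets inject global graph structure into the fine-tuning signal, whereas a plain cross-entropy loss would use only the per-example hard labels.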
Submission Type: Full paper proceedings track submission (max 9 main pages).
Submission Number: 146