Enhancing Transformer-based Semantic Matching for Few-shot Learning through Weakly Contrastive Pre-training

Published: 01 Jan 2024, Last Modified: 10 Feb 2025 · ACM Multimedia 2024 · CC BY-SA 4.0
Abstract: Text semantic matching measures the semantic similarity between two texts and is widely applied in search and ranking scenarios. In recent years, pre-trained foundation models based on the Transformer architecture have demonstrated powerful semantic representation capabilities, and the pipeline of fine-tuning them on downstream semantic matching tasks has achieved promising results and widespread adoption. However, practical downstream scenarios often face severe limitations in data quality and quantity, and obtaining large numbers of high-quality samples is difficult. Research on enhancing pre-trained models for few-shot text semantic matching remains limited. This paper therefore provides a general enhancement scheme for few-shot text semantic matching. Specifically, we propose SEMFormer, a method that enhances Transformer-based Semantic Matching for few-shot learning through weakly contrastive pre-training. First, from the token-level and structural-level perspectives, we design a simple, low-cost data augmentation method to construct weakly supervised samples. Then, we use global semantic representations to build a contrastive objective from the relation-aspect perspective. Next, we design a contrastive objective based on the alignment aspect, which achieves effective semantic matching by optimizing the bidirectional semantic awareness between texts. We conduct comprehensive experiments on five Chinese and English datasets. The results show that the proposed weakly contrastive pre-training augmentation significantly improves model performance, and further experiments confirm the effectiveness of each design choice. The source code is available at: https://github.com/llm-ml/SEMFormer.
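To make the described pipeline concrete, the sketch below illustrates one plausible form of weakly contrastive pre-training in PyTorch: a cheap token-level augmentation builds weak positives, and an InfoNCE-style objective over mean-pooled sentence embeddings plays the role of the relation-aspect contrast. The augmentation scheme, pooling choice, encoder checkpoint, and temperature are illustrative assumptions, not the authors' exact implementation (see the released code for that).

```python
# Hypothetical sketch of weakly contrastive pre-training (not the authors' code).
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")

def augment(text: str, drop_prob: float = 0.1) -> str:
    """Token-level augmentation: randomly drop characters to form a weak positive."""
    kept = [ch for ch in text if torch.rand(1).item() > drop_prob]
    return "".join(kept) or text

def encode(texts):
    """Mean-pooled sentence embeddings as global semantic representations."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # (B, L, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, L, 1)
    return (hidden * mask).sum(1) / mask.sum(1)           # (B, H)

def weak_contrastive_loss(texts, temperature: float = 0.05):
    """InfoNCE over (original, augmented) pairs with in-batch negatives."""
    z1 = F.normalize(encode(texts), dim=-1)
    z2 = F.normalize(encode([augment(t) for t in texts]), dim=-1)
    logits = z1 @ z2.T / temperature          # (B, B) similarity matrix
    labels = torch.arange(len(texts))         # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

loss = weak_contrastive_loss(["如何申请退款", "怎样修改收货地址"])
loss.backward()
```

In practice such a pre-training stage would run over unlabeled in-domain text before the few-shot fine-tuning on labeled matching pairs; an alignment-style term (e.g., contrasting token-level cross-attention between the two texts) would be added alongside this sentence-level loss.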