Abstract: In recent years, unsupervised sentence representation learning has advanced considerably through methods such as contrastive learning and ranking distillation. In contrastive learning, each sentence is treated as either a positive or a negative sample, a binary view that discards fine-grained similarity information. Ranking distillation instead assigns a fine-grained ranking to sentences, yielding smoother similarity estimates. However, when multiple teacher models are involved, existing methods tend to assign them equal weights during distillation, overlooking the differential impact each teacher may have. To address this issue, we propose a novel multi-teacher ranking distillation approach based on reinforcement learning, which dynamically adjusts the weights of the teacher models to further enhance the sentence representation capability of the student model. Experimental results show that the proposed approach surpasses current state-of-the-art methods on most Semantic Textual Similarity (STS) tasks, demonstrating its strong capability in unsupervised sentence representation learning. Our code is available at https://github.com/whstheny/RLRD.
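To make the core idea concrete, below is a minimal, hypothetical sketch (not the paper's RLRD implementation) of reinforcement-learning-based teacher weighting: a softmax policy over teachers is updated with REINFORCE so that teachers producing more useful rankings receive higher weight. The toy data, variable names, and the reward signal (agreement between a sampled teacher's ranking distribution and a stand-in target ranking) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def kl(p, q):
    # KL divergence between two discrete distributions.
    return float(np.sum(p * (np.log(p) - np.log(q))))

# Toy ranking distributions over 8 candidate sentences.
target_logits = rng.normal(size=8)
target = softmax(target_logits)                    # stand-in "good" ranking
teachers = [softmax(target_logits + rng.normal(scale=s, size=8))
            for s in (0.1, 2.0)]                   # teacher 0 is more reliable

logits = np.zeros(len(teachers))                   # policy parameters, one per teacher
lr, baseline = 0.5, 0.0

for step in range(500):
    weights = softmax(logits)                      # current teacher weights
    t = rng.choice(len(teachers), p=weights)       # sample a teacher from the policy
    reward = -kl(target, teachers[t])              # reward: ranking agreement (illustrative)
    baseline = 0.9 * baseline + 0.1 * reward       # moving-average baseline for variance reduction
    grad = -weights.copy()
    grad[t] += 1.0                                 # gradient of log pi(t) w.r.t. logits
    logits += lr * (reward - baseline) * grad      # REINFORCE update

print("learned teacher weights:", softmax(logits))  # converges toward teacher 0
```

In an actual distillation setup, the reward would instead be derived from the student model's performance after training against the weighted teacher rankings, so the policy learns which teachers benefit the student most.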