Abstract: By mining rich semantic information from large-scale unlabeled text and incorporating it into pre-trained models, BERT and RoBERTa have achieved impressive performance on many natural language processing tasks. However, these pre-trained models rely on task-specific fine-tuning, and native BERT or RoBERTa is difficult to apply directly to the task of Semantic Textual Similarity (STS).
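To make the difficulty concrete, the sketch below (not from the paper) shows the naive approach the abstract alludes to: taking off-the-shelf BERT embeddings and scoring sentence pairs with cosine similarity. The model name, mean-pooling strategy, and example sentences are illustrative assumptions; such raw similarity scores are known to correlate poorly with human STS judgments, which is the gap this line of work addresses.

```python
# A minimal sketch, assuming the Hugging Face transformers library and
# mean pooling as the sentence-embedding strategy (an illustrative choice,
# not the paper's method).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(sentence: str) -> torch.Tensor:
    """Mean-pool the last hidden states into a single sentence vector."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1)   # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # (1, 768)

# Hypothetical sentence pair for illustration.
a = embed("A man is playing a guitar.")
b = embed("Someone is playing an instrument.")
print(f"cosine similarity: {torch.cosine_similarity(a, b).item():.3f}")
```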