Abstract: Discovering new intents is crucial for expanding domains in dialogue systems or natural language understanding (NLU) systems. A typical approach leverages unsupervised and semi-supervised learning to train a neural encoder to produce utterance representations adequate for clustering, and then clusters those representations to detect unseen intents. Recently, instance-level contrastive learning has been proposed to improve representation quality for better clustering. However, this method suffers from semantic distortion introduced by text augmentation, and from representation inadequacy caused by the limitations of using representations from pre-trained language models, typically BERT. Neural encoders can be powerful representation learners, but the initial parameters of pre-trained language models do not reliably produce representations suitable for capturing semantic distances. To eliminate the need for data augmentation and reduce the negative impact of pre-trained language models as encoders, we propose UNICON, a novel contrastive learning method that utilizes auxiliary external representations to provide powerful guidance for the encoder.
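The abstract describes pulling encoder outputs toward auxiliary external representations via contrastive learning, rather than contrasting augmented views of the same utterance. A minimal sketch of that idea is an InfoNCE-style objective where each encoder representation is paired with an external representation of the same utterance; the function below is illustrative only (the paper's actual loss, encoder, and choice of external representations are not given here), using NumPy in place of a deep-learning framework.

```python
import numpy as np

def guided_info_nce(encoder_reps, external_reps, temperature=0.1):
    """InfoNCE-style loss in which each encoder representation is pulled
    toward its paired external representation (the positive) and pushed
    away from the external representations of all other utterances.

    encoder_reps:  (N, d) array of encoder outputs for N utterances.
    external_reps: (N, d) array of auxiliary representations, row-aligned
                   with encoder_reps (row i describes the same utterance).
    """
    # L2-normalize so the dot product is cosine similarity.
    a = encoder_reps / np.linalg.norm(encoder_reps, axis=1, keepdims=True)
    b = external_reps / np.linalg.norm(external_reps, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    # Subtract the row max for numerical stability before softmax.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal (utterance i paired with external rep i).
    return -np.mean(np.diag(log_probs))
```

Because the positives come from an external source rather than from augmented copies of the input, no text augmentation is required, which matches the motivation stated in the abstract.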