Abstract: Most existing weakly supervised aspect detection algorithms use pre-trained language models as their backbone networks, constructing discriminative tasks from seed words. When the number of seed words decreases, the performance of current models declines significantly. Recently, prompt tuning has been proposed to bridge the gap between the objective forms of pre-training and fine-tuning, which holds promise for alleviating this challenge. However, directly applying existing prompt-based methods to this task not only fails to exploit large amounts of unlabeled data effectively but may also cause severe overfitting. In this paper, we propose a lightweight prompt-based teacher-student network (PTS) to address these two problems. Concretely, the student network is a hybrid prompt-based classification model that detects aspects by compounding hand-crafted and auto-generated prompts. The teacher network jointly considers the representations of the sentence and of the masked aspect token in the template to guide classification. To exploit unlabeled data and seed words effectively, we train the teacher and student networks alternately. Furthermore, because uneven training-data quality noticeably degrades the iterative efficiency of PTS, we design a general dynamic data-selection strategy that feeds the most pertinent data to the current model. Experimental results show that, even given a minimal set of seed words, PTS significantly outperforms previous state-of-the-art methods on three widely used benchmarks.