Self-training with Modeling Ambiguous Data for Low-Resource Relation Extraction


16 Nov 2021 (modified: 05 May 2023) | ACL ARR 2021 November Blind Submission | Readers: Everyone
Abstract: We present a simple yet effective approach to improving self-training for relation extraction in low-resource scenarios. The approach first divides the auto-annotated instances into two groups, confident instances and uncertain instances, according to the probabilities predicted by a teacher model. In contrast to most previous studies, which use only the confident instances for self-training, we also make use of the uncertain instances. We propose a method to identify ambiguous but useful instances among the uncertain ones. We then apply negative training to the ambiguous instances and positive training to the confident instances. Finally, the two training objectives are combined in a joint-training manner to build a relation extraction system. Experimental results on two widely used datasets under low-resource settings demonstrate that the approach achieves significant and consistent improvements over several competitive self-training systems.
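The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the selection rule or the loss formulas, so everything specific here is an assumption made for illustration: the thresholds `high` and `low`, the use of the top-`k` teacher labels as a candidate set, the complementary-label form of negative training (pushing probability away from labels outside the candidate set), and the `weight` that balances the two terms. All function names are hypothetical.

```python
import math


def top_k_indices(probs, k):
    """Return the indices of the k largest probabilities, best first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


def split_instances(teacher_probs, high=0.9, low=0.5, k=2):
    """Split auto-annotated instances by teacher confidence.

    Assumed rule (not from the abstract): an instance is confident if its
    top probability is at least `high`, ambiguous if it lies in [low, high),
    and discarded otherwise.
    """
    confident, ambiguous = [], []
    for idx, probs in enumerate(teacher_probs):
        candidates = top_k_indices(probs, k)
        best = candidates[0]
        if probs[best] >= high:
            # Confident: keep the teacher's label for positive training.
            confident.append((idx, best))
        elif probs[best] >= low:
            # Ambiguous: labels outside the candidate set become negative labels.
            negatives = [c for c in range(len(probs)) if c not in candidates]
            ambiguous.append((idx, negatives))
    return confident, ambiguous


def joint_loss(student_probs, confident, ambiguous, weight=1.0):
    """Positive cross-entropy on confident instances plus a weighted
    negative-training term on ambiguous instances (assumed formulation)."""
    eps = 1e-12  # guards against log(0)
    positive = [-math.log(student_probs[i][y] + eps) for i, y in confident]
    negative = [
        -sum(math.log(1.0 - student_probs[i][c] + eps) for c in negs) / max(len(negs), 1)
        for i, negs in ambiguous
    ]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(positive) + weight * mean(negative)


# Toy teacher predictions over three relation labels.
teacher_probs = [
    [0.95, 0.03, 0.02],  # confident
    [0.55, 0.40, 0.05],  # ambiguous: candidates {0, 1}, negative label 2
    [0.36, 0.33, 0.31],  # too uncertain, discarded
]
confident, ambiguous = split_instances(teacher_probs)
# For illustration, the student's predictions are taken to equal the teacher's.
loss = joint_loss(teacher_probs, confident, ambiguous)
```

In a real system the student probabilities would come from a trainable model and the loss would be minimized by gradient descent; the sketch only shows how the two groups of instances could feed a single joint objective.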
