Enhancing Distantly Supervised Named Entity Recognition with Strong Label Guided Lottery Training

Published: 2024, Last Modified: 18 Jun 2024 · LREC/COLING 2024 · CC BY-SA 4.0
Abstract: In low-resource Named Entity Recognition (NER) scenarios, only a limited quantity of strongly labeled data is available, while a vast amount of weakly labeled data can be easily acquired through distant supervision. However, weakly labeled data may fail to improve model performance, or may even degrade it, due to the noise it inevitably contains. When training on noisy data, only certain parameters are essential for model learning, termed safe parameters, while the remaining parameters tend to fit the noise. In this paper, we propose a noise-robust learning framework in which safe parameters are identified with guidance from the small set of strongly labeled data, and non-safe parameters are suppressed during training on weakly labeled data for better generalization. Our method effectively mitigates the impact of noise in weakly labeled data, and it can be easily integrated with data-level noise-robust learning methods for NER. We conduct extensive experiments on multiple datasets, and the results show that our approach outperforms state-of-the-art methods.
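The abstract describes a lottery-style scheme: a subset of "safe" parameters is selected using the strongly labeled data, and updates to the remaining parameters are suppressed while training on the weakly labeled data. The sketch below illustrates this general idea, not the paper's exact procedure; the gradient-magnitude scoring, the `keep_ratio` parameter, and the helper names are assumptions introduced for illustration.

```python
# A minimal sketch of strong-label-guided "safe parameter" masking (PyTorch).
# Assumption: safe parameters are those with the largest accumulated gradient
# magnitude on strongly labeled batches; the paper's actual criterion may differ.
import torch


def estimate_safe_masks(model, strong_loader, loss_fn, keep_ratio=0.2, device="cpu"):
    """Score each parameter by accumulated |gradient| on strongly labeled data
    and keep the top `keep_ratio` fraction of each tensor as 'safe'."""
    model.to(device)
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters() if p.requires_grad}
    for inputs, labels in strong_loader:
        model.zero_grad()
        loss = loss_fn(model(inputs.to(device)), labels.to(device))
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                scores[n] += p.grad.detach().abs()
    masks = {}
    for n, s in scores.items():
        k = max(1, int(keep_ratio * s.numel()))
        threshold = torch.topk(s.flatten(), k).values.min()
        masks[n] = (s >= threshold).float()
    return masks


def train_step_on_weak_batch(model, batch, loss_fn, optimizer, masks, device="cpu"):
    """One update on a weakly (distantly) labeled batch: gradients of non-safe
    parameters are zeroed, so only the safe subnetwork is updated."""
    inputs, weak_labels = batch
    optimizer.zero_grad()
    loss = loss_fn(model(inputs.to(device)), weak_labels.to(device))
    loss.backward()
    for n, p in model.named_parameters():
        if p.grad is not None and n in masks:
            p.grad.mul_(masks[n])
    optimizer.step()
    return loss.item()
```

In use, one would first call `estimate_safe_masks` on the small strongly labeled set, then apply `train_step_on_weak_batch` over the distantly supervised data, so that parameters outside the safe set cannot drift toward fitting the label noise.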