Abstract: Entity Set Expansion (ESE) is an important task in natural language processing that aims to expand a small set of seed entities with new entities belonging to the same semantic class. Early ESE approaches used bootstrapping to generate entities iteratively, but this approach is prone to semantic drift. Some studies have introduced categories to guide entity generation, but the corpus does not participate in generating those categories. As a result, the categories can be too coarse-grained and inaccurate, which in turn directly affects entity generation. To address these challenges, we introduce a text summarization model to fully mine the semantic information of entities and use double filtering to further sharpen the semantic boundary of entities. In addition, we propose a two-stage framework to expand entities. We also provide a Chinese MOOC-ESE dataset consisting of 476 courses and 45,938 entity concepts. Experimental results show that our method outperforms the baseline models on the mean average precision (MAP) metric.