[Novel] XAlias: An Unsupervised Bilingual Entity Alias Discovery System with Multiple Sources

27 Jul 2023 (modified: 29 Aug 2023), Submitted to Wikidata Workshop 2023
TL;DR: We propose XAlias, a bilingual alias discovery system that extracts aliases from a corpus and generates aliases with pre-trained language models (PLMs).
Abstract: Aliases are the divergent expressions of an entity. Entity alias tables are widely used in Entity Linking and other NLP tasks, where they are either annotated manually or obtained from a knowledge base. We investigate the possibility of discovering entity aliases automatically and present the first multi-source Alias Discovery (AD) system for both English and Chinese that requires no training. Our system combines two AD branches: Alias Extraction, which extracts aliases from a corpus, and corpus-free Alias Generation. We propose a new unsupervised algorithm for alias generation that prompts a pre-trained language model (PLM). We release our bilingual alias discovery system, XAlias, which provides three easy-to-use APIs, an online website, and a demo video. Experiments demonstrate that XAlias achieves time and space consumption on the same level as an independent alias source, together with better hits performance. We hope the release of XAlias will benefit downstream tasks.
Submission Number: 10
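
The abstract describes combining several alias sources and evaluating with a hits metric, but it does not specify how the candidates are merged. The following is a minimal sketch under stated assumptions, not the XAlias implementation: the function names (`merge_alias_sources`, `hits_at_k`), the round-robin merging strategy, and the two stub sources are all hypothetical and stand in for the Alias Extraction and Alias Generation branches.

```python
from typing import Callable, List, Set

# Hypothetical type: an alias source maps an entity name to a ranked candidate list.
AliasSource = Callable[[str], List[str]]


def merge_alias_sources(entity: str, sources: List[AliasSource], top_k: int = 10) -> List[str]:
    """Interleave ranked candidates from each source, dropping duplicates.

    Round-robin merging is an assumption made for illustration only.
    """
    ranked_lists = [source(entity) for source in sources]
    max_len = max((len(r) for r in ranked_lists), default=0)
    seen: Set[str] = {entity.casefold()}  # never return the entity name itself
    merged: List[str] = []
    for rank in range(max_len):
        for ranked in ranked_lists:
            if rank >= len(ranked):
                continue
            candidate = ranked[rank].strip()
            key = candidate.casefold()
            if candidate and key not in seen:
                seen.add(key)
                merged.append(candidate)
    return merged[:top_k]


def hits_at_k(predicted: List[str], gold: Set[str], k: int) -> int:
    """Count how many of the top-k predictions appear in the gold alias set."""
    gold_keys = {g.casefold() for g in gold}
    return sum(1 for p in predicted[:k] if p.casefold() in gold_keys)


# Stub sources with made-up outputs, standing in for the two AD branches.
def extraction_stub(entity: str) -> List[str]:
    return ["USA", "U.S."]


def generation_stub(entity: str) -> List[str]:
    return ["America", "usa", "US"]


merged = merge_alias_sources("United States", [extraction_stub, generation_stub])
score = hits_at_k(merged, {"USA", "US", "America"}, k=3)
```

In this toy setup the merged list draws on both sources, which illustrates why a multi-source system can score more hits than either source alone; the actual ranking and filtering used by XAlias may differ.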