Keywords: entity linking, joint learning, multimodal learning, collaborative ranking
TL;DR: To tackle the multi-mention entity linking problem, we propose a novel method consisting of a context-entity joint feature extraction module, a multimodal learning framework, and a multi-mention collaborative ranking method with pairwise training.
Abstract: Entity linking, which bridges mentions in context with their corresponding entities in knowledge bases, has attracted wide attention due to its many potential applications. Recently, many multimodal entity linking approaches have been proposed to take full advantage of visual information rather than relying solely on the textual modality. Although feasible, these methods mainly focus on single-mention scenarios and neglect scenarios where multiple mentions exist simultaneously in the same context, which limits their performance. In fact, such multi-mention scenarios are quite common in public datasets and real-world applications. To address this challenge, we first propose a joint feature extraction module that learns representations of the context and the entity candidates while taking multimodal information into consideration. Then, we design a pairwise training scheme (for training) and a multi-mention collaborative ranking method (for testing) to model the potential connections between different mentions. We evaluate our method on a public dataset and a self-constructed dataset, NYTimes-MEL, under both text-only and multimodal settings. The experimental results demonstrate that our method largely outperforms state-of-the-art methods, especially in multi-mention scenarios. Our dataset and source code are publicly available at https://github.com/ycm094/MMEL-main.
Supplementary Material: pdf
Other Supplementary Material: zip