Abstract: Author name disambiguation (AND) is a core component of modern academic search systems for curating author profiles and bibliometrics. Recently, language models (LMs) and graph neural networks (GNNs) have significantly pushed the frontier of modeling textual and relational information. However, their representational power has not been fully exploited to improve the accuracy of AND. In this work, we propose a unified model, a graph-enhanced language model (GAND), that enables joint modeling of textual information and the relations between documents. In place of the traditional contrastive loss, we develop a multi-task fine-tuning objective, which not only mitigates potential distribution shifts in the test data but also improves the efficiency of fine-tuning language models for AND. Experiments on two real-world name disambiguation datasets demonstrate the superior performance of our approach over embedding-based approaches, fine-tuned LMs, and OpenAI's text embeddings.
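The abstract does not give implementation details, so the following minimal PyTorch sketch only illustrates the general idea it describes: fusing LM-derived document embeddings with aggregation over document-document relations and training with a multi-task objective instead of a single contrastive loss. All names (GraphEnhancedEncoder, multi_task_loss), dimensions, loss weights, and the specific auxiliary task below are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical sketch of a graph-enhanced LM for author name disambiguation (AND).
# Module names, dimensions, and the loss weighting are illustrative assumptions;
# the abstract does not specify the actual GAND architecture or objective.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphEnhancedEncoder(nn.Module):
    """Fuses document text embeddings (e.g., from a fine-tuned LM) with one round
    of mean aggregation over document-document relations (e.g., shared co-authors)."""

    def __init__(self, text_dim: int = 768, hidden_dim: int = 256):
        super().__init__()
        self.proj = nn.Linear(text_dim, hidden_dim)       # project LM embeddings
        self.msg = nn.Linear(hidden_dim, hidden_dim)      # transform neighbor messages
        self.out = nn.Linear(2 * hidden_dim, hidden_dim)  # fuse self + neighborhood

    def forward(self, text_emb: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # text_emb: [num_docs, text_dim]; adj: dense [num_docs, num_docs] relation matrix
        h = F.relu(self.proj(text_emb))
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        neigh = self.msg(adj @ h / deg)                   # mean over related documents
        return self.out(torch.cat([h, neigh], dim=-1))


def multi_task_loss(z, pair_idx, pair_label, aux_logits, aux_label, alpha=0.5):
    """Illustrative multi-task objective: a pairwise same-author term plus an
    auxiliary per-document classification term (the paper's exact tasks may differ)."""
    zi, zj = z[pair_idx[:, 0]], z[pair_idx[:, 1]]
    sim = F.cosine_similarity(zi, zj)
    pair_loss = F.binary_cross_entropy_with_logits(sim, pair_label.float())
    aux_loss = F.cross_entropy(aux_logits, aux_label)
    return alpha * pair_loss + (1 - alpha) * aux_loss


if __name__ == "__main__":
    # Toy example: 4 documents with random stand-in "LM" embeddings and a chain graph.
    torch.manual_seed(0)
    text_emb = torch.randn(4, 768)
    adj = torch.tensor([[0, 1, 0, 0],
                        [1, 0, 1, 0],
                        [0, 1, 0, 1],
                        [0, 0, 1, 0]], dtype=torch.float)
    model = GraphEnhancedEncoder()
    head = nn.Linear(256, 3)                              # hypothetical 3-way auxiliary task
    z = model(text_emb, adj)
    loss = multi_task_loss(
        z,
        pair_idx=torch.tensor([[0, 1], [2, 3]]),
        pair_label=torch.tensor([1, 0]),
        aux_logits=head(z),
        aux_label=torch.tensor([0, 0, 1, 2]),
    )
    print(float(loss))
```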
Paper Type: long
Research Area: NLP Applications
Contribution Types: Approaches to low-resource settings, Approaches low compute settings-efficiency
Languages Studied: English