Abstract: In information retrieval, resolving author name ambiguity in digital libraries is essential because many authors share identical names. Existing methods typically involve three steps: deriving publication embeddings, calculating publication similarities, and clustering for disambiguation. However, these traditional methods often fail to fully exploit textual and relational features, and they tend to optimize representation learning separately from the disambiguation process. To overcome these limitations, we propose a novel framework, ANDI (a joint disambiguation framework integrating Author Name DIsambiguation Goals). Specifically, we first employ a natural language model for textual feature extraction and a relational model for graph representation learning. We then add a similarity learning model to each of these embedding models, optimizing them in line with the goal of author disambiguation. Finally, a clustering and post-match module uses the publication similarities to complete the disambiguation process. Our evaluation on three public datasets shows that the proposed framework outperforms several state-of-the-art methods. Code is available at https://github.com/Alohalt/ANDI.
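To make the generic three-step pipeline mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It is not the ANDI implementation: ANDI uses a language model, a graph model, and learned similarity models, whereas this toy substitutes bag-of-words vectors, cosine similarity, and threshold-based single-linkage clustering. The function names `embed`, `cosine`, and `cluster`, the threshold of 0.3, and the sample titles are all assumptions made for illustration.

```python
import math
from collections import Counter


def embed(title):
    """Step 1 (toy stand-in): bag-of-words term counts as a publication embedding."""
    return Counter(title.lower().split())


def cosine(a, b):
    """Step 2: cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(titles, threshold=0.3):
    """Step 3: merge publications whose similarity reaches the threshold.

    Uses union-find, which amounts to single-linkage clustering.
    The threshold value is a hypothetical choice, not taken from the paper.
    """
    vecs = [embed(t) for t in titles]
    parent = list(range(len(titles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            if cosine(vecs[i], vecs[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(titles)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical publications that all list the same ambiguous author name.
papers = [
    "graph neural networks for citation analysis",
    "citation graph embedding with neural networks",
    "protein folding simulation with molecular dynamics",
    "molecular dynamics of protein membranes",
]
print(cluster(papers))  # → [[0, 1], [2, 3]]
```

Each resulting group is treated as one real-world author. In this sketch the embedding step is fixed and never sees the clustering objective, which is the separation between representation learning and disambiguation that the abstract identifies as a limitation.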