Keywords: name disambiguation, multi-task learning
TL;DR: We present BOND, an end-to-end from-scratch name disambiguation method that supersedes the conventional two-stage clustering paradigm by integrating it within a bootstrapping multi-task framework.
Abstract: From-scratch name disambiguation is an essential task for establishing a reliable foundation for academic platforms. It involves partitioning documents authored by identically named individuals into groups representing distinct real-life experts.
Canonically, the process is divided into two decoupled tasks: locally estimating the pairwise similarities between documents, and then globally grouping these documents into appropriate clusters.
However, such a decoupled approach often inhibits optimal information exchange between these intertwined tasks.
Therefore, we present BOND, which bootstraps the local and global informative signals to promote each other in an end-to-end regime.
Specifically, BOND harnesses local pairwise similarities to drive global clustering, subsequently generating pseudo-clustering labels. These global signals further refine local pairwise characterizations.
Experimental results show that BOND outperforms other advanced baselines by a substantial margin.
Moreover, an enhanced version, BOND+, incorporating ensemble and post-match techniques, rivals the top methods in the WhoIsWho competition.
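As a rough illustration of the conventional two-stage pipeline described in the abstract, the sketch below first computes local pairwise similarities and then groups documents globally. It is a minimal toy under stated assumptions, not BOND itself: the embeddings, the cosine measure, the threshold-based connected-components grouping, and all function names (`pairwise_cosine`, `cluster_by_threshold`, `pseudo_pair_targets`) are hypothetical choices made for this example. The last helper only hints at how cluster assignments could be turned into pairwise pseudo-labels of the kind the abstract says are fed back to the local task.

```python
import numpy as np


def pairwise_cosine(embeddings):
    """Local stage: cosine similarity between every pair of documents."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def cluster_by_threshold(similarity, threshold):
    """Global stage: link pairs with similarity >= threshold and
    return connected components as integer cluster labels."""
    n = similarity.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i, j] >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    labels = []
    for i in range(n):
        root = find(i)
        if root not in relabel:
            relabel[root] = len(relabel)
        labels.append(relabel[root])
    return labels


def pseudo_pair_targets(labels):
    """Turn cluster labels into a boolean same-author matrix that could
    serve as pseudo-labels for a pairwise similarity model (assumption)."""
    arr = np.asarray(labels)
    return arr[:, None] == arr[None, :]


# Toy document embeddings (made up for illustration): two authors, two papers each.
docs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = cluster_by_threshold(pairwise_cosine(docs), threshold=0.8)
targets = pseudo_pair_targets(labels)
```

In this decoupled form the similarity stage never sees the clustering outcome. The abstract's point is that BOND trains the two jointly so that each signal can refine the other.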
Track: Web Mining and Content Analysis
Submission Guidelines Scope: Yes
Submission Guidelines Blind: Yes
Submission Guidelines Format: Yes
Submission Guidelines Limit: Yes
Submission Guidelines Authorship: Yes
Student Author: Yes
Submission Number: 1585