Decision: conferencePoster
Abstract: Entity resolution, the task of automatically determining which
mentions refer to the same real-world entity, is a crucial aspect of
knowledge base construction and management. However, performing
entity resolution at large scales is challenging because (1) the
inference algorithms must cope with unavoidable system scalability
issues and (2) the search space grows exponentially in the number of
mentions. Conventional wisdom holds that performing
coreference at these scales requires decomposing the problem by
first solving the simpler task of entity-linking (matching a set of
mentions to a known set of KB entities), and then performing entity
discovery as a postprocessing step (to identify new entities not
present in the KB). However, we argue that this traditional approach
is harmful to both entity-linking and overall coreference
accuracy. Therefore, we embrace the challenge of jointly modeling
entity-linking and entity-discovery as a single entity resolution
problem. To achieve scalability, we (1) present a model that
reasons over compact hierarchical entity representations, and (2)
propose a novel distributed inference architecture that does not
suffer from the synchronicity bottleneck inherent in
map-reduce architectures. We demonstrate that more test-time data
actually improves the accuracy of coreference, and show that the
joint approach to coreference is substantially more accurate than
traditional entity-linking, reducing error by over 75%.