Robust BiPoly-Matching for Multi-Granular Entities

Published: 01 Jan 2021, Last Modified: 11 Jul 2023, ICDM 2021, Readers: Everyone
Abstract: Entity matching across two data sources is a prevalent need in many domains, including e-commerce. Of interest is the scenario where entities have varying granularity, e.g., a coarse product category may match multiple finer categories. Previous work on one-to-many matching generally presumes that the ‘one’ necessarily comes from a designated source and the ‘many’ from the other source. In contrast, we propose a novel formulation that allows concurrent bidirectional one-to-many matching. Beyond flexibility, we also seek matching that is more robust to noisy similarity values arising from diverse entity descriptions, by introducing the notions of receptivity and reclusivity. In addition to an optimal formulation, we propose an efficient heuristic that performs well. Experiments on multiple real-life datasets from e-commerce sources show that our proposed algorithms are effective and outperform the baselines.