Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching

ACL ARR 2024 June Submission4546 Authors

16 Jun 2024 (modified: 02 Aug 2024) | ACL ARR 2024 June Submission | CC BY 4.0

Abstract: Entity matching (EM) is a critical step in entity resolution. Recently, entity matching based on large language models (LLMs) has shown great promise. However, current LLM-based entity matching approaches typically follow a binary matching paradigm that ignores the global consistency among record relationships. In this paper, we investigate various methodologies for LLM-based entity matching that incorporate record interactions from different perspectives. Specifically, we comprehensively compare three representative strategies (matching, comparing, and selecting) and analyze their respective advantages and challenges in diverse scenarios. Based on our findings, we further design a compound entity matching framework (ComEM) that leverages the composition of multiple strategies and LLMs. ComEM combines the complementary advantages of these strategies and achieves improvements in both effectiveness and efficiency. Experimental results verify that ComEM not only achieves significant performance gains on various datasets, but also reduces the cost of LLM-based entity matching for practical applications.

Paper Type: Long

Research Area: Information Retrieval and Text Mining

Research Area Keywords: Entity Matching, Entity Resolution, Record Linkage, Deduplication, LLM

Contribution Types: NLP engineering experiment, Approaches to low-resource settings

Languages Studied: English

Submission Number: 4546
