LLM-augmented entity alignment: an unsupervised and training-free framework

Meixiu Long, Jiahai Wang, Junxiao Ma, Jianpeng Zhou, Siyuan Chen

Published: 2026, Last Modified: 13 Mar 2026. Venue: Neural Networks 2026. License: CC BY-SA 4.0

Abstract: Entity alignment (EA) is a fundamental task in knowledge graph (KG) integration, aiming to identify equivalent entities across different KGs for a unified and comprehensive representation. Recent advances have explored pre-trained language models (PLMs) to enhance the semantic understanding of entities, achieving notable improvements. However, existing methods face two major limitations. First, they rely heavily on human-annotated labels for training, leading to high computational costs and poor scalability. Second, some approaches use large language models (LLMs) to predict alignments in a multi-choice question format, but LLM outputs may deviate from expected formats, and predefined options may exclude correct matches, leading to suboptimal performance. To address these issues, we propose LEA, an LLM-augmented entity alignment framework that eliminates the need for labeled data and enhances robustness by mitigating information heterogeneity at both embedding and semantic levels. LEA first introduces an entity textualization module that transforms structural and textual information into a unified format, ensuring consistency and improving entity representations. It then leverages LLMs to enrich entity descriptions, enhancing semantic distinctiveness. Finally, these enriched descriptions are encoded into a shared embedding space, enabling efficient alignment through text retrieval techniques. To balance performance and computational cost, we further propose a selective augmentation strategy that prioritizes the most ambiguous entities for refinement. Experimental results on both homogeneous and heterogeneous KGs demonstrate that LEA outperforms existing models trained on 30% labeled data, achieving a 30% absolute improvement in Hit@1 score. As LLMs and text embedding models advance, LEA is expected to further enhance EA performance, providing a scalable and robust paradigm for practical applications.
The code and dataset can be found at https://github.com/Longmeix/LEA.
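
The pipeline outlined in the abstract (textualize entities, embed the text, align by retrieval, then refine only the most ambiguous entities) can be illustrated with a minimal sketch. This is not the authors' implementation: every function name here is hypothetical, the bag-of-words `embed` is a toy stand-in for a real text embedding model, the LLM enrichment step is omitted, and using the top-1 minus top-2 similarity margin as the ambiguity score is an assumption, since the abstract does not specify the selection criterion.

```python
import math
from collections import Counter

def textualize(name, triples):
    """Flatten an entity name and its (relation, neighbor) pairs into one string."""
    facts = "; ".join(f"{rel} {obj}" for rel, obj in triples)
    return f"{name}. {facts}" if facts else name

def embed(text):
    """Toy stand-in for a text embedding model: L2-normalized bag of words."""
    tokens = text.lower().replace(".", " ").replace(";", " ").split()
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}

def cosine(u, v):
    """Dot product of two normalized sparse vectors."""
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())

def retrieve(source, target):
    """Map each source id to (best target id, margin between top-1 and top-2 scores)."""
    target_vecs = {tid: embed(text) for tid, text in target.items()}
    result = {}
    for sid, text in source.items():
        vec = embed(text)
        ranked = sorted(
            ((cosine(vec, tv), tid) for tid, tv in target_vecs.items()),
            reverse=True,
        )
        runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
        result[sid] = (ranked[0][1], ranked[0][0] - runner_up)
    return result

def most_ambiguous(result, k):
    """Assumed selection rule: the k source entities with the smallest margin."""
    return sorted(result, key=lambda sid: result[sid][1])[:k]

def hit_at_1(result, gold):
    """Fraction of source entities whose top-ranked candidate is the gold match."""
    return sum(result[s][0] == t for s, t in gold.items()) / len(gold)

# Two tiny illustrative KGs (made-up data, not from the paper's datasets).
source = {
    "s1": textualize("Paris", [("capital of", "France")]),
    "s2": textualize("Berlin", [("capital of", "Germany")]),
}
target = {
    "t1": textualize("Paris", [("located in", "France")]),
    "t2": textualize("Berlin", [("located in", "Germany")]),
}
alignment = retrieve(source, target)
score = hit_at_1(alignment, {"s1": "t1", "s2": "t2"})
to_refine = most_ambiguous(alignment, 1)
```

In a fuller version of this sketch, only the entities returned by `most_ambiguous` would be sent to an LLM for description enrichment and then re-embedded, which is what keeps the LLM cost bounded.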

External IDs: dblp:journals/nn/LongWMZC26