From Morphemes to Knowledge Graphs: Enabling Abstractions in Large Language Models With Neurosymbolic AI

Published: 2025, Last Modified: 21 Jan 2026, IEEE Intell. Syst. 2025, CC BY-SA 4.0
Abstract: Recent advances in large language models (LLMs) have revolutionized natural language processing, achieving impressive performance across a wide range of linguistic tasks. However, these successes often mask a critical limitation: current evaluation paradigms provide little insight into how well LLMs handle linguistic abstraction, the cognitive capability that underlies generalization, analogy-making, and systematic reasoning. Without a principled framework for evaluating abstraction, it remains unclear whether LLMs truly engage in abstraction, what form such abstractions take, how consistently they occur, and to what extent these behaviors reflect genuine abstraction capabilities rather than surface-level pattern matching. We propose a structured taxonomy of linguistic abstractions in natural language processing, spanning levels from morphology to knowledge graphs (KGs) and organized along two key dimensions: linguistic granularity and contextual dependence. This taxonomy supports a more nuanced evaluation of LLMs’ abstraction capabilities and helps identify where current models fall short. In particular, we highlight the limitations of LLMs at higher levels of abstraction, such as the semantic, topical, taxonomic, and KG levels, where relational composition, context sensitivity, and symbolic structure are critical. To remedy these weaknesses, we advocate for the integration of neurosymbolic artificial intelligence (AI) systems that combine neural representations with symbolic reasoning.