From Semantics to Symbols: A Two-Stage Framework for Deconstructing LLM Reasoning into Concepts and Rules

Published: 29 Sept 2025, Last Modified: 12 Oct 2025 | NeurIPS 2025 - Reliable ML Workshop | CC BY 4.0
Keywords: Neuro-symbolic AI, Concept-based interpretability, Differentiable logic (DNF), Concept bottleneck, Rule learning
TL;DR: A two-stage neuro-symbolic framework that learns high-level concepts from text and then induces sparse DNF logic over them, yielding competitive accuracy with faithful rule-level explanations.
Abstract: Large Language Models (LLMs) achieve remarkable performance, but their opaque, black-box nature limits trust and hinders deployment in critical applications. This paper introduces CogN-Syn, a novel two-stage Cognitive Neuro-Symbolic framework designed to deconstruct the decision-making process of LLMs into human-understandable cognitive steps. Unlike methods that rely on post-hoc rationalizations or simple linear predictors, CogN-Syn first trains a Concept Encoder to map unstructured text to a well-defined, high-level conceptual vocabulary. A second stage then learns sparse, symbolic logic rules over these concepts using a Differentiable Logic Layer. This decoupled training strategy mimics a cognitive process: from semantic perception (concepts) to symbolic reasoning (rules). Our framework not only achieves performance competitive with black-box models but also provides a three-tiered explanation, enabling clear diagnostics of model failure modes and taking a step towards safer, more trustworthy AI.
Submission Number: 35
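
To make the second stage in the abstract more concrete, the following is a minimal sketch of how a disjunctive normal form (DNF) over concept activations can be relaxed into a differentiable function. It is not the paper's Differentiable Logic Layer, whose exact form the abstract does not give. It assumes a product-logic relaxation, concept activations in [0, 1] as a Concept Encoder might output, and fixed weights in place of learned ones. The names `soft_dnf`, `CONJ`, and `DISJ`, and the example rule, are hypothetical.

```python
from math import prod

def soft_dnf(concepts, conj_weights, disj_weights):
    """Product-logic relaxation of a DNF over concept activations in [0, 1].

    Assumed layout: each row of conj_weights holds one weight per literal,
    where the literals are all concepts followed by all negated concepts.
    """
    # Literals: each concept, then each negated concept.
    literals = list(concepts) + [1.0 - c for c in concepts]
    # Soft AND: weight 0 ignores a literal, weight 1 requires it to be true.
    conj = [
        prod(1.0 - w * (1.0 - lit) for w, lit in zip(row, literals))
        for row in conj_weights
    ]
    # Soft OR over the conjunctions selected by disj_weights.
    return 1.0 - prod(1.0 - s * k for s, k in zip(disj_weights, conj))

# Hypothetical rule over concepts (a, b, c): (a AND NOT b) OR c
# Literal order: a, b, c, NOT a, NOT b, NOT c
CONJ = [
    [1, 0, 0, 0, 1, 0],  # a AND NOT b
    [0, 0, 1, 0, 0, 0],  # c
]
DISJ = [1, 1]

if __name__ == "__main__":
    print(soft_dnf([1, 0, 0], CONJ, DISJ))    # crisp input: rule fires
    print(soft_dnf([1, 1, 0], CONJ, DISJ))    # crisp input: rule does not fire
    print(soft_dnf([0.5, 0, 0], CONJ, DISJ))  # soft input: graded output
```

With weights fixed at 0 or 1 and crisp inputs, the function reproduces the Boolean rule exactly. In a trainable version the weights would be continuous parameters, and a sparsity penalty would push most of them toward 0 so that the surviving literals read as a short rule.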