Deconstructing the Reasoning Process of a Neuro-Fuzzy Agent: From Learned Concepts to Natural Language Narratives

Published: 23 Sept 2025, Last Modified: 17 Feb 2026
Venue: CogInterp @ NeurIPS 2025 Poster
Readers: Everyone
License: CC BY 4.0
Keywords: Cognitive Interpretability, Neuro-Fuzzy Systems, Concept Formation, Rule-Based Reasoning, Glass-Box Models
Abstract: A key goal in AI is to understand the internal cognitive processes that drive model decisions by analyzing their underlying algorithms and representations. We present a neuro-fuzzy framework designed to instantiate and analyze a complete cognitive pipeline within a "glass-box" agent. Our framework provides a transparent, multi-level cognitive account by showing how an agent: (1) develops its own perceptual concepts from raw data via regularized end-to-end learning; (2) processes information using these concepts in an explicit, dynamic symbolic reasoning algorithm; and (3) organizes its low-level processing into high-level behavioral strategies, which we reveal by abstracting thousands of raw rules into a handful of core "mental models". By modeling this entire pipeline, we offer a concrete methodology for building and dissecting AI systems whose learned cognitive processes are transparent by design.
Submission Number: 8
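
The three stages named in the abstract (learned concepts, explicit rule-based reasoning, and a natural-language account of the decision) can be illustrated with a toy sketch. This is not the authors' implementation: the Gaussian membership functions, the product t-norm, the winner-take-all rule selection, and every name and number below (`CONCEPTS`, `RULES`, `decide`, `narrate`, the features and actions) are hypothetical choices made only for illustration. In the framework described above the concepts would be learned from raw data rather than written by hand.

```python
import math

# ASSUMPTION: concepts are Gaussian fuzzy sets given as (center, width).
# These values are hand-picked for the example, not learned.
CONCEPTS = {
    "distance": {"near": (0.0, 1.0), "far": (5.0, 1.5)},
    "speed": {"slow": (0.0, 1.0), "fast": (4.0, 1.0)},
}

# ASSUMPTION: each rule maps a conjunction of concepts to an action.
RULES = [
    ({"distance": "near", "speed": "fast"}, "brake"),
    ({"distance": "far", "speed": "slow"}, "accelerate"),
    ({"distance": "far", "speed": "fast"}, "coast"),
]


def membership(x, center, width):
    """Degree in [0, 1] to which x belongs to a Gaussian fuzzy set."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def firing_strength(antecedent, observation):
    """Combine the memberships of a rule's antecedent with a product t-norm."""
    strength = 1.0
    for feature, concept in antecedent.items():
        center, width = CONCEPTS[feature][concept]
        strength *= membership(observation[feature], center, width)
    return strength


def decide(observation):
    """Return (strength, antecedent, action) for the most strongly firing rule."""
    scored = [(firing_strength(a, observation), a, act) for a, act in RULES]
    return max(scored, key=lambda item: item[0])


def narrate(antecedent, action, strength):
    """Render the winning rule as a short natural-language explanation."""
    clauses = " and ".join(f"{f} is {c}" for f, c in antecedent.items())
    return f"Because {clauses}, I chose to {action} (rule strength {strength:.2f})."


if __name__ == "__main__":
    obs = {"distance": 0.5, "speed": 3.8}
    strength, antecedent, action = decide(obs)
    print(narrate(antecedent, action, strength))
```

Because every intermediate quantity (membership degrees, rule strengths, the selected rule) is an inspectable value, the explanation is read directly off the computation rather than produced by a separate post-hoc explainer, which is the sense of "glass-box" this sketch tries to convey.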