Ontology-Based Post-Hoc Neural Network Explanations Via Simultaneous Concept Extraction

Andrew Ponomarev, Anton Agafonov

Published: 2023, Last Modified: 26 Jul 2025 | IntelliSys (2) 2023 | CC BY-SA 4.0

Abstract: Ontology-based explanation techniques explain how a neural network reached a particular conclusion using human-understandable terms and their formal definitions, which are encoded in an ontology. One promising direction in ontology-based neural explanation is concept extraction, the process of establishing relationships between the internal representations of a network and ontology concepts. Existing concept extraction algorithms are search-based and require training multiple mapping networks for each concept, which can be time-consuming. This paper proposes a method for building post-hoc ontology-based explanations by training a single multi-label concept extraction network that maps activations of a given “black box” network to ontology concepts. Experiments on two public datasets show that the proposed method generates accurate ontology-based explanations of a given network and requires significantly less time for concept extraction than existing algorithms.
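
The idea of a single multi-label concept extractor can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes a plain linear layer with independent sigmoid outputs (one per concept), trained by gradient descent on synthetic data standing in for the activations of a "black box" network. All names (`train_concept_extractor`, `extract_concepts`), the data, and the hyperparameters are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_concept_extractor(activations, concept_labels, lr=0.5, epochs=500):
    """Fit one linear multi-label probe: activations -> per-concept probabilities."""
    n_samples, n_features = activations.shape
    n_concepts = concept_labels.shape[1]
    weights = np.zeros((n_features, n_concepts))
    bias = np.zeros(n_concepts)
    for _ in range(epochs):
        probs = sigmoid(activations @ weights + bias)
        # Gradient of the mean binary cross-entropy, one independent term per concept.
        error = (probs - concept_labels) / n_samples
        weights -= lr * (activations.T @ error)
        bias -= lr * error.sum(axis=0)
    return weights, bias


def extract_concepts(activations, weights, bias, threshold=0.5):
    """Return a boolean matrix: which concepts are predicted for each sample."""
    return sigmoid(activations @ weights + bias) >= threshold


# Synthetic stand-in for recorded activations (assumption: 400 samples, 16 units)
# and for ontology concept labels (assumption: 3 concepts, linearly decodable).
rng = np.random.default_rng(0)
activations = rng.normal(size=(400, 16))
true_directions = rng.normal(size=(16, 3))
concept_labels = (activations @ true_directions > 0).astype(float)

weights, bias = train_concept_extractor(activations, concept_labels)
predicted = extract_concepts(activations, weights, bias)
accuracy = (predicted == concept_labels.astype(bool)).mean()
print(f"training accuracy over all concepts: {accuracy:.3f}")
```

All concepts share one model and one training run here, in contrast to fitting a separate mapping network per concept. The extracted concept labels could then be passed to an ontology reasoner to form an explanation, a step this sketch leaves out.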