Abstract: Assessing and comparing the performance of explainable Artificial Intelligence (XAI) techniques for Deep Graph Networks (DGNs) is an actively researched open problem. In particular, XAI techniques that assign an importance score to graph nodes are evaluated with standard metrics such as accuracy, which in turn entails labeling a certain number of nodes as relevant according to a hard threshold. This protocol is unsuited to real-world applications, where the true number of relevant nodes cannot be known in the absence of a ground truth. We attempt to fill this gap by developing unsupervised methods able to assign relevance even when the ground truth is not accessible. Specifically, we empirically show that a simple clustering-based strategy recovers the correct number of relevant nodes in most of the analyzed cases. Furthermore, we show that the silhouette score, which measures the clustering fit, can be used as a proxy for the quality of the relevance attribution. This work is a preliminary step towards reliably assessing explanation quality in realistic settings.
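The abstract does not specify which clustering algorithm is used, so the following is only a minimal sketch of the idea it describes, assuming a two-cluster k-means split of the 1-D per-node importance scores and treating the higher-mean cluster as the relevant one; the function name and example scores are hypothetical.

```python
# Sketch of the clustering-based strategy described in the abstract.
# Assumptions (not stated by the authors): k-means with k=2 on the 1-D
# importance scores; the cluster with the larger centroid is "relevant".
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def estimate_relevant_nodes(importance: np.ndarray):
    """Cluster per-node importance scores; return the estimated
    relevant-node mask and the silhouette score of the clustering."""
    scores = importance.reshape(-1, 1)  # 1-D scores -> column vector
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scores)
    labels = km.labels_
    # Take the cluster with the larger centroid as the relevant one.
    relevant_cluster = int(np.argmax(km.cluster_centers_.ravel()))
    mask = labels == relevant_cluster
    # Per the abstract, the silhouette score (clustering fit) proxies
    # the quality of the relevance attribution.
    quality = silhouette_score(scores, labels)
    return mask, quality

# Hypothetical usage: importance scores from any node-level XAI method.
scores = np.array([0.92, 0.88, 0.05, 0.10, 0.07, 0.81, 0.03])
mask, quality = estimate_relevant_nodes(scores)
print(f"estimated relevant nodes: {mask.sum()}, silhouette: {quality:.3f}")
```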