FRUNI and FTREE synthetic knowledge graphs for evaluating explainability

Published: 27 Oct 2023, Last Modified: 09 Nov 2023, NeurIPS XAIA 2023
TL;DR: We introduce two synthetic knowledge graphs (KGs) to help in the evaluation of explainer methods for KG completion.
Abstract: Research on knowledge graph completion (KGC), i.e., link prediction within incomplete KGs, is witnessing significant growth in popularity. Recently, KGC approaches using KG embedding (KGE) models, primarily based on complex architectures (e.g., transformers), have achieved remarkable performance. Still, extracting the \emph{minimal and relevant} information employed by KGE models to make predictions, while constituting a major part of \emph{explaining the predictions}, remains a challenge. While there exists a growing literature on explainers for trained KGE models, systematically exposing and quantifying their failure cases poses even greater challenges. In this work, we introduce two synthetic datasets, FRUNI and FTREE, designed to demonstrate the (in)ability of explainer methods to spot link predictions that rely on indirectly connected links. Notably, practitioners can control various aspects of the datasets, such as noise levels and dataset size, which lets them assess the performance of explainability methods across diverse scenarios. Through our experiments, we assess the performance of four recent explainers in providing accurate explanations for predictions on the proposed datasets. We believe that these datasets are valuable resources for further validating explainability methods within the knowledge graph community.
Submission Track: Full Paper Track
Application Domain: Information Retrieval
Survey Question 1: Our research explores improving the evaluation of explanation methods for complex models used in knowledge graph completion (KGC). We introduce two synthetic datasets, FRUNI and FTREE, to assess how effectively explanation methods explain predictions that rely on indirectly connected links. We expect the introduced datasets to benefit the knowledge graph community, enabling researchers and practitioners to assess and improve the explainability of KGC models, ultimately advancing the field of knowledge graph completion.
Survey Question 2: The complexity of knowledge graph embedding models and knowledge graph datasets poses challenges for explainer methods and complicates the analysis of their strengths and weaknesses. Because their data generation process is controlled, synthetic datasets are well suited to identifying the failure cases of explainers. This control allows a careful assessment of explainers' performance across diverse scenarios, which is often unattainable with real-world datasets due to their inherent noise and complexity.
Survey Question 3: We compare the following explainers for KGC models: Kelpie, Data Poisoning, and Criage.
Submission Number: 42