Keywords: graph neural network, knowledge inconsistency
Abstract: Human knowledge can naturally be organized as multimodal graphs; prime examples include research papers and Wikipedia pages. However, identifying information inconsistencies within such knowledge-intensive documents remains challenging. These inconsistencies can be explicit, such as numerical discrepancies between tables and their textual descriptions, or implicit, such as differing conclusions presented at the beginning and end of an article. Large Language Models (LLMs) have shown great potential in detecting such inconsistencies, but their practical deployment is often hindered by restricted context windows and high inference costs. Moreover, standard Retrieval-Augmented Generation (RAG) approaches struggle to capture the intricate reference relationships within multimodal graphs. To address these challenges, we propose Knowledge Debugger, an efficient Graph Neural Network (GNN)-based framework that identifies diverse types of knowledge inconsistencies in multimodal data. To evaluate our method, we built the Multimodal Knowledge Debugging Benchmark (MKDB), comprising 3 modalities, 699 Wikipedia pages, more than 10,000 research papers, and more than 10,000 knowledge-debugging tasks with answers. In our approach, we leverage LLMs to generate high-quality labels for training multimodal GNNs. The trained GNNs demonstrate strong performance on consistency-checking tasks over multimodal graphs; specifically, they outperform the best RAG methods by 11\% on node-level bug detection. By employing GNNs, we significantly improve system efficiency and scalability, enabling effective and practical inconsistency detection in complex multimodal knowledge structures.
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 14737