Scalable Neural Theorem Proving on Knowledge Bases and Natural Language

Pasquale Minervini; Matko Bosnjak; Tim Rocktäschel; Edward Grefenstette; Sebastian Riedel

27 Sept 2018 (modified: 03 Apr 2024); ICLR 2019 Conference Blind Submission; Readers: Everyone

Abstract: Reasoning over text and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering. Transducing text to logical forms that can be operated on is a brittle and error-prone process. Operating directly on text, by jointly learning representations and their transformations with neural architectures that cannot learn and exploit general rules, can be very data-inefficient and may fail to generalise correctly. These issues are addressed by Neural Theorem Provers (NTPs) (Rocktäschel & Riedel, 2017), neuro-symbolic systems based on a continuous relaxation of Prolog’s backward chaining algorithm, where symbolic unification between atoms is replaced by a differentiable operator computing the similarity between their embedding representations. In this paper, we first propose Neighbourhood-approximated Neural Theorem Provers (NaNTPs), consisting of two extensions to NTPs: a) a method for drastically reducing the previously prohibitive time and space complexity of inference and learning, and b) an attention mechanism for improving the rule learning process. Together, these make NTPs usable on real-world datasets. Then, we propose a novel approach for jointly reasoning over KB facts and textual mentions by embedding them in a shared space. The proposed method is able to extract rules and provide explanations, involving both textual patterns and KB relations, from large KBs and text corpora. We show that NaNTPs perform on par with NTPs at a fraction of the cost, and can achieve competitive link prediction results on challenging large-scale datasets, including WN18, WN18RR, and FB15k-237 (with and without textual mentions), while being able to provide explanations for each prediction and extract interpretable rules.
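
As a rough illustration of the two ideas the abstract names, the sketch below replaces symbolic unification with a similarity score between embeddings and restricts proving to the facts nearest to the goal. Everything in it is an assumption made for illustration and is not taken from the paper: the exponential-of-distance kernel, the `min`/`max` aggregation, the toy two-dimensional embeddings, and the helper names `similarity`, `unify_atoms`, `atom_vector`, and `prove`. The brute-force sort stands in for whatever nearest-neighbour search a real system would use.

```python
import math

def similarity(u, v):
    """Soft unification score in (0, 1]; equals 1.0 when the embeddings coincide.
    Assumed kernel: exp(-Euclidean distance)."""
    return math.exp(-math.dist(u, v))

def unify_atoms(goal, fact, emb):
    """Score a goal atom against a fact atom as the weakest symbol-wise match."""
    return min(similarity(emb[g], emb[f]) for g, f in zip(goal, fact))

def atom_vector(atom, emb):
    """Concatenate the embeddings of an atom's symbols into one flat vector."""
    return tuple(x for symbol in atom for x in emb[symbol])

def prove(goal, facts, emb, k=None):
    """Return (best score, best fact). If k is given, only the k facts closest
    to the goal in embedding space are scored (hypothetical neighbourhood step)."""
    candidates = facts
    if k is not None:
        g = atom_vector(goal, emb)
        candidates = sorted(
            facts, key=lambda f: math.dist(g, atom_vector(f, emb))
        )[:k]
    scored = [(unify_atoms(goal, f, emb), f) for f in candidates]
    return max(scored, key=lambda pair: pair[0])

# Toy embeddings (made up): two near-synonymous relations lie close together.
emb = {
    "grandpaOf": (1.0, 0.0),
    "grandfatherOf": (1.0, 0.1),
    "livesIn": (-1.0, 2.0),
    "abe": (0.0, 1.0),
    "bart": (0.5, 0.5),
    "springfield": (2.0, 2.0),
}
facts = [
    ("grandfatherOf", "abe", "bart"),
    ("livesIn", "bart", "springfield"),
]
goal = ("grandpaOf", "abe", "bart")

score, fact = prove(goal, facts, emb, k=1)
print(fact, round(score, 3))
```

In this toy setting the goal `grandpaOf(abe, bart)` never appears in the fact list, yet it receives a high score because `grandpaOf` and `grandfatherOf` have nearby embeddings. Restricting the search with `k=1` selects the same fact as scoring every fact, which is the kind of saving the neighbourhood approximation is after.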

Keywords: Machine Reading, Natural Language Processing, Neural Theorem Proving, Representation Learning, First Order Logic

TL;DR: We scale Neural Theorem Provers to large datasets, improve the rule learning process, and extend it to jointly reason over text and Knowledge Bases.

Data: [FB15k](https://paperswithcode.com/dataset/fb15k), [FB15k-237](https://paperswithcode.com/dataset/fb15k-237), [Kinship](https://paperswithcode.com/dataset/kinship), [WN18](https://paperswithcode.com/dataset/wn18), [WN18RR](https://paperswithcode.com/dataset/wn18rr)