Abstract: Large computer-understandable proofs consist of millions of intermediate
logical steps. The vast majority of such steps originate from manually
selected and manually guided heuristics applied to intermediate goals.
So far, machine learning has generally not been used to filter or
generate these steps. In this paper, we introduce a new dataset based on
Higher-Order Logic (HOL) proofs, for the purpose of developing new
machine learning-based theorem-proving strategies. We make this dataset
publicly available under the BSD license. We propose various machine
learning tasks that can be performed on this dataset, and discuss their
significance for theorem proving. We also benchmark a set of simple baseline
machine learning models suited for the tasks (including logistic regression,
convolutional neural networks, and recurrent neural networks). The results of our
baseline models show the promise of applying machine learning to HOL
theorem proving.
Conflicts: uibk.ac.at, google.com