Multi-Level Knowledge Graph for Formal Mathematics

Published: 21 Apr 2026, Last Modified: 08 May 2026 | LOBACHEVSKII JOURNAL OF MATHEMATICS | Everyone | CC BY 4.0
Abstract: We present a multi-level knowledge graph for formal mathematics built on the SciLibRU ontology. The ontology is modular and is formalized in OWL and description logic. It strictly distinguishes three levels of description: the interpretation level, which captures what an object means; the representation level, which describes how the object is expressed; and the resource level, which specifies where the object is stored and in what form it exists. This separation yields an addressee-invariant knowledge space suitable for both human researchers and AI agents. The knowledge graph is materialized from a filtered version of Mathlib, a formalized mathematical library implemented in the Lean 4 proof assistant. It consists of approximately 190,000 statements with typed dependency edges and a domain taxonomy of 660 subclasses with provenance metadata. A multimodal embedding model trained on five types of mathematical objects correctly matches elements from different modalities 74% of the time on the first try. We demonstrate the practical utility of the approach through combined vector-graph structured lemma retrieval for automated theorem proving with the LLM DeepSeek-Prover-V2 7B. On the MiniF2F benchmark, which includes 488 tasks and 50,752 runs, our full system significantly outperforms three state-of-the-art search engines for Lean: LeanSearch, LeanFinder, and LeanExplore. All observed improvements are statistically significant. By contrast, our vector search on its own and LeanSearch produce statistically indistinguishable results, with a p-value close to 1. This confirms that the observed improvement is driven by the use of graph structure, specifically the symbolic layer, as well as ontology-based hint retrieval and categorization, rather than by embedding quality alone.
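The combined vector-graph retrieval mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Statement` fields, the `retrieve` function, the edge label `uses_lemma`, the lemma names, and the toy embeddings are all hypothetical, chosen only to show how a vector-ranking stage can be followed by expansion along typed dependency edges.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Statement:
    # All field names are illustrative assumptions, not the SciLibRU schema.
    name: str                      # identifier of the stored declaration
    embedding: List[float]         # toy vector standing in for a learned embedding
    domain: str                    # hypothetical taxonomy class
    depends_on: Dict[str, str] = field(default_factory=dict)  # neighbor -> edge type


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: List[float], graph: Dict[str, Statement],
             k: int = 1, hops: int = 1) -> List[str]:
    """Rank statements by cosine similarity, then follow dependency edges."""
    # Vector stage: take the k statements closest to the query.
    ranked = sorted(graph.values(),
                    key=lambda s: cosine(query, s.embedding), reverse=True)
    result = [s.name for s in ranked[:k]]
    # Graph stage: add statements reachable within `hops` dependency edges.
    frontier = list(result)
    for _ in range(hops):
        next_frontier = []
        for name in frontier:
            for dep in graph[name].depends_on:
                if dep in graph and dep not in result:
                    result.append(dep)
                    next_frontier.append(dep)
        frontier = next_frontier
    return result


# Toy graph: lemma_A depends on lemma_B, which is far from the query in vector space.
GRAPH = {
    "lemma_A": Statement("lemma_A", [1.0, 0.0], "Algebra", {"lemma_B": "uses_lemma"}),
    "lemma_B": Statement("lemma_B", [0.1, 1.0], "Algebra"),
    "lemma_C": Statement("lemma_C", [0.6, 0.8], "Topology"),
}

vector_only = retrieve([1.0, 0.1], GRAPH, k=2, hops=0)
vector_graph = retrieve([1.0, 0.1], GRAPH, k=1, hops=1)
```

In this toy setting, vector-only retrieval returns the two nearest embeddings, while the combined variant reaches `lemma_B` through the dependency edge even though its embedding is far from the query. This is the kind of effect the abstract attributes to graph structure rather than to embedding quality.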