Keywords: Lean, mathlib, code search, informalization
TL;DR: We present Lean Finder, a semantic search engine for Lean and mathlib that understands mathematicians’ intents.
Abstract: We present Lean Finder, a semantic search engine for Lean and mathlib that understands mathematicians’ intents. Because relevant theorems are hard to locate and the Lean 4 language has a steep learning curve, progress in formal theorem proving is slowed by tedious human effort. Recent Lean search engines, though helpful, only passively consider the informalization of statements and largely overlook the mismatch with real-world user queries. In contrast, we propose a user-centered semantic search tailored to the needs of working mathematicians. The key idea is to first analyze and cluster the semantics of public discussions on Lean, then fine-tune text embeddings on synthesized queries that simulate user intents. Lean Finder thus encodes a rich awareness of mathematicians’ intents from different perspectives. Evaluations on both real-world queries by mathematicians and informalized statements show that Lean Finder outperforms previous search engines by at least 19%. We will release the code, model checkpoints, datasets, and web service for Lean Finder upon acceptance.
Submission Number: 147