IMProofBench: Benchmarking AI on Research-Level Mathematical Proof Generation

Johannes Schmitt, Gergely Bérczi, Jasper Dekoninck, Jeremy Feusi, Tim Gehrunger, Raphael Appenzeller, Jim Bryan, Niklas Canova, Timo de Wolff, Filippo Gaia, Michel van Garrel, Baran Hashemi, David Holmes, Aitor Iribar Lopez, Victor Jaeck, Martina Jørgensen, Steven Kelk, Stefan Kuhlmann, Adam Kurpisz, Chiara Meroni et al. (13 additional authors not shown)

Published: 2025, Last Modified: 29 Mar 2026, Venue: CoRR 2025, License: CC BY-SA 4.0