SURFACEBENCH: A Geometry-Aware Benchmark for Symbolic Surface Discovery

Published: 02 Mar 2026, Last Modified: 02 Mar 2026, Accepted by TMLR, License: CC BY 4.0
Abstract: Equation discovery from data is a central challenge in machine learning for science, requiring the recovery of concise symbolic expressions that govern complex physical and geometric phenomena. Recent large language model (LLM) approaches have shown promise in symbolic regression, yet existing benchmarks predominantly evaluate low-dimensional scalar functions and rely on string-level or regression-based metrics that fail to capture structural and geometric equivalence. We introduce SURFACEBENCH, the first geometry-aware benchmark for symbolic discovery of three-dimensional surfaces. Unlike scalar curve-fitting tasks, SURFACEBENCH targets surface-level reasoning, where multi-variable coupling, coordinate transformations, and geometric structure must be inferred directly from data. The benchmark comprises 183 analytically constructed, science-inspired surface equations across 15 categories and three representation paradigms: explicit, implicit, and parametric forms. Each task includes variable semantics and synthetically sampled 3D data, and is designed to stress symbolic composition, structural ambiguity, and representational non-uniqueness while mitigating memorization. To evaluate discovery quality, SURFACEBENCH combines symbolic equivalence checks with object-space geometric metrics (Chamfer and Hausdorff distances) and regression-based error measures, allowing evaluation of functional fidelity beyond algebraic syntax. Empirical evaluation across evolutionary, neural, and LLM-driven frameworks reveals that no current method achieves consistent performance across representation types, with LLM-based approaches exhibiting strong structural priors but limited robustness in parameter calibration and multi-equation reasoning.
SURFACEBENCH provides a challenging and diagnostic testbed that bridges symbolic reasoning and geometric reconstruction, enabling principled benchmarking of compositional generalization and structure-aware scientific induction in high-dimensional equation discovery. The code and data are available at https://github.com/deep-symbolic-mathematics/surfacebench.
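To illustrate the object-space metrics named in the abstract, the following minimal sketch computes Chamfer and Hausdorff distances between two sampled 3D point clouds. It is not taken from the SURFACEBENCH code: the function names are hypothetical, and the Chamfer convention used here (unsquared Euclidean distances, sum of the two directional means) is an assumption, since several conventions are in common use.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distance between every point in a (n, 3) and b (m, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean nearest-neighbour distance in both
    directions, summed. Assumed convention; squared variants also exist."""
    d = pairwise_distances(a, b)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: the worst-case nearest-neighbour
    distance over both directions."""
    d = pairwise_distances(a, b)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


def sample_sphere(n, radius=1.0, seed=0):
    """Sample points from a parametric sphere (illustrative surface only)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 2.0 * np.pi, n)
    v = rng.uniform(0.0, np.pi, n)
    return radius * np.stack(
        [np.cos(u) * np.sin(v), np.sin(u) * np.sin(v), np.cos(v)], axis=1
    )


if __name__ == "__main__":
    # Compare a "ground-truth" surface with a slightly miscalibrated candidate.
    truth = sample_sphere(500, radius=1.0, seed=0)
    candidate = sample_sphere(500, radius=1.1, seed=1)
    print("Chamfer:  ", chamfer_distance(truth, candidate))
    print("Hausdorff:", hausdorff_distance(truth, candidate))
```

Because both metrics compare sampled geometry rather than expression strings, two algebraically different but equivalent surface equations would score as close, while a structurally plausible equation with a miscalibrated parameter (such as the radius above) would not.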
Submission Type: Regular submission (no more than 12 pages of main content)
Code: https://github.com/deep-symbolic-mathematics/surfacebench
Assigned Action Editor: ~Jeff_Phillips1
Submission Number: 6489