MLIP Arena: Advancing Fairness and Transparency in Machine Learning Interatomic Potentials through an Open and Accessible Benchmark Platform

Published: 03 Mar 2025, Last Modified: 09 Apr 2025
Venue: AI4MAT-ICLR-2025 Spotlight
License: CC BY 4.0
Submission Track: Paper Track (Tiny Paper)
Submission Category: Automated Synthesis
Keywords: machine-learning interatomic potentials, benchmark, simulations
Abstract: Machine learning interatomic potentials (MLIPs) have revolutionized molecular and materials modeling, but existing benchmarks suffer from data leakage, limited transferability, and an overreliance on error-based metrics tied to specific density functional theory (DFT) references. We introduce MLIP Arena, a benchmark platform that evaluates MLIPs based on physics awareness, chemical reactivity, stability under extreme conditions, and predictive capabilities for thermodynamic properties and physical phenomena. Our evaluation challenges previous assumptions about model architectures and performance. MLIP Arena provides a reproducible framework to guide MLIP development toward improved predictive accuracy and runtime efficiency while maintaining physical consistency. The Python package and online leaderboard are available at https://huggingface.co/spaces/atomind/mlip-arena
Submission Number: 57