Keywords: scientific artifact, automatic discovery, graph neural network
Abstract: Scientific artifacts, such as models and benchmarks, are the foundation of machine learning research. With the rapid growth of repositories like HuggingFace, researchers now have access to millions of high-quality artifacts contributed by the community, yet a challenge remains: how can we automatically discover the state-of-the-art (SOTA) model for a given benchmark by fully leveraging existing scientific artifacts? We address this task, which we call automatic SOTA discovery, by first modeling HuggingFace as an artifact graph, where nodes represent models or benchmarks and edges capture their relationships, labeled with evaluation results. Within this graph, we formulate automatic SOTA discovery as identifying unobserved links with high potential performance that could advance future research. To enable scalable and efficient discovery of such links, we propose ArtifactLinker, a two-stage framework: (1) prediction, which identifies promising links with Graph Neural Networks (GNNs) or graph-augmented LLMs, and (2) verification, which validates the promising predicted links through reproducible, automated coding experiments and agents. To evaluate ArtifactLinker, we further propose ArtifactBench, which collects 1,372 models and 308 benchmarks to systematically measure prediction and verification performance and to support the development of new SOTA discovery agents. Our key results indicate that the graph-based prediction module in ArtifactLinker is effective at identifying promising links. Moreover, the automatic verification pipeline in ArtifactLinker confirms, in a fully automatic way, that the identified links indeed achieve high performance on existing benchmarks.
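The artifact-graph formulation in the abstract can be illustrated with a minimal sketch. It is not ArtifactLinker itself: the toy scores, the artifact names, and the helper functions below are all hypothetical, and the scorer is a simple averaging baseline standing in for the GNN or graph-augmented LLM predictor. It only shows the shape of the prediction stage: store observed (model, benchmark) edges with their evaluation results, enumerate the unobserved links, and rank them for a target benchmark.

```python
from itertools import product

# Hypothetical toy data: (model, benchmark) -> reported score in [0, 1].
# Each key is an observed edge of the artifact graph, labeled with its result.
observed = {
    ("model_a", "bench_x"): 0.90,
    ("model_a", "bench_y"): 0.80,
    ("model_b", "bench_x"): 0.70,
    ("model_c", "bench_y"): 0.60,
    ("model_d", "bench_y"): 0.95,
}


def unobserved_links(observed):
    """Return every (model, benchmark) pair that has no evaluation edge yet."""
    models = sorted({m for m, _ in observed})
    benchmarks = sorted({b for _, b in observed})
    return [(m, b) for m, b in product(models, benchmarks) if (m, b) not in observed]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def predict_score(link, observed):
    """Assumed baseline scorer (NOT a GNN): average of the model's mean
    observed score and the benchmark's mean observed score."""
    model, bench = link
    model_avg = mean(s for (m, _), s in observed.items() if m == model)
    bench_avg = mean(s for (_, b), s in observed.items() if b == bench)
    return (model_avg + bench_avg) / 2


def rank_candidates(benchmark, observed):
    """Rank the unobserved links for one benchmark by predicted score, best first."""
    candidates = [link for link in unobserved_links(observed) if link[1] == benchmark]
    return sorted(candidates, key=lambda link: predict_score(link, observed), reverse=True)


if __name__ == "__main__":
    for link in rank_candidates("bench_x", observed):
        print(link, round(predict_score(link, observed), 3))
```

In the two-stage setup described above, the top-ranked links from such a predictor would then be handed to the verification stage, which actually runs the evaluation rather than trusting the predicted score.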
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 14660