We Should Chart an Atlas of All the World's Models

Published: 26 Sept 2025, Last Modified: 29 Oct 2025 · NeurIPS 2025 Position Paper Track · CC BY-NC 4.0
Keywords: Weight space learning, Model Populations, Analyzing Model Collections, Model Tree, Model Atlas, Model Repository, Model Lineage
TL;DR: In this position paper, we advocate for systematically studying entire model populations, and argue that this requires charting them in a unified structure, the "Model Atlas".
Abstract: Public model repositories now contain millions of models, yet most remain undocumented and effectively lost: their capabilities, provenance, and constraints cannot be reliably determined. As a result, the field wastes training time and compute, propagates hidden biases, faces intellectual-property risks, and misses opportunities for model reuse and transfer. In this position paper, we advocate charting the world's model population in a unified structure we call the Model Atlas: a graph that captures models, their attributes, and the weight transformations connecting them. The Model Atlas enables applications in model forensics, meta-ML research, and model discovery, all of which are challenging given today's unstructured model repositories. However, because most models lack documentation, large atlas regions remain uncharted. Addressing this gap motivates new machine learning methods that treat models themselves as data and infer properties such as functionality, performance, and lineage directly from their weights. We argue that a scalable path forward is to bypass the unique parameter symmetries that plague model weights. Charting all the world's models will require a community effort, and we hope its broad utility will rally researchers toward this goal.
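To make the atlas-as-graph idea above concrete, here is a minimal illustrative sketch (not the paper's implementation): models are nodes carrying attribute dictionaries, and weight transformations such as fine-tuning or quantization are directed edges. All model names, attributes, and the use of networkx are invented assumptions for illustration only.

```python
# Minimal sketch of a "model atlas" as a directed graph:
# nodes = models (with possibly partial documented attributes),
# edges = weight transformations that produced one model from another.
import networkx as nx

atlas = nx.DiGraph()

# Nodes: hypothetical models with whatever attributes are documented.
atlas.add_node("base/llm-7b", task="language-modeling", license="apache-2.0")
atlas.add_node("org/llm-7b-chat", task="chat", accuracy=0.71)
atlas.add_node("org/llm-7b-chat-4bit")  # undocumented: attributes simply missing

# Edges: the transformation connecting parent and child weights.
atlas.add_edge("base/llm-7b", "org/llm-7b-chat", transform="fine-tune")
atlas.add_edge("org/llm-7b-chat", "org/llm-7b-chat-4bit", transform="quantize")

# Example lineage query: recover all ancestors of a derived model.
print(nx.ancestors(atlas, "org/llm-7b-chat-4bit"))
# {'base/llm-7b', 'org/llm-7b-chat'}
```

In this toy view, tasks like model forensics or lineage recovery amount to graph queries over such a structure; the paper's point is that most of these edges and attributes are currently missing and must be inferred from the weights themselves.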
Lay Summary: Modern AI systems are built by combining and reusing many pretrained models shared online. Public repositories like Hugging Face already host millions of these models, but most lack basic information about how they were trained, what data they used, or even what they are good at. As a result, vast amounts of knowledge and computing effort are effectively lost: researchers keep retraining similar models, struggle to reproduce results, and risk unknowingly spreading biases or violating licenses. We take inspiration from fields where shifting from isolated cases to populations unlocked progress: Darwin uncovered evolution by studying species collectively, and machine learning advanced when it turned from hand-crafted methods to training on large datasets. Yet this population lens has rarely been applied to AI models themselves. In this position paper, we call for systematically studying entire model populations through the Model Atlas: a kind of "map of all models" that charts how models evolve, relate, and can be reused. We outline practical steps and weight-space learning methods needed to build this global map together as a community.
Submission Number: 240