Keywords: deep learning
Abstract: Benchmarks measure what models can do now, not what they will be able to do in the future. We find that the order in which capabilities are acquired is remarkably consistent across large populations of AI models, which raises the question of whether one can forecast which specific examples and capabilities future models will solve next. We formalize this problem as a new evaluation task called progress prediction: can we forecast which unsolved problems will be solved next as future models improve? We find that progress is, in fact, predictable. Through an empirical study of hundreds of millions of predictions made by 1,000+ vision models and 1,600+ language models, we show that this predictability stems from the consistent order in which models acquire capabilities across architectures, datasets, and modalities.
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 13922