TL;DR: We systematically explored scaling laws for primate vision models and discovered that neural alignment stops improving beyond a certain scale, even though behavior keeps aligning better.
Abstract: When trained on large-scale object classification datasets, certain artificial neural network models begin to approximate core object recognition behaviors and neural response patterns in the primate brain. While recent machine learning advances suggest that scaling compute, model size, and dataset size improves task performance, the impact of scaling on brain alignment remains unclear. In this study, we explore scaling laws for modeling the primate visual ventral stream by systematically evaluating over 600 models trained under controlled conditions on benchmarks spanning V1, V2, V4, IT, and behavior. We find that while behavioral alignment continues to scale with larger models, neural alignment saturates. This observation holds across model architectures and training datasets, even though models with stronger inductive biases and datasets with higher-quality images are more compute-efficient. Increased scaling is especially beneficial for higher-level visual areas, where small models trained on few samples exhibit particularly poor alignment. Our results suggest that while scaling current architectures and datasets might suffice for alignment with human core object recognition behavior, it will not yield improved models of the brain's visual ventral stream, highlighting the need for novel strategies in building brain models.
Lay Summary: How do we build artificial intelligence systems that see the world like humans do? In the brain, a network of regions called the ventral visual stream helps us recognize objects in our environment. Scientists use computer models called neural networks to mimic this brain system, hoping to better understand both artificial and biological vision. A common observation in machine learning is that scaling up (using larger models and more training data) leads to better performance. But does it also bring us closer to how the brain works?
We trained over 600 neural networks of varying sizes on different numbers of images, then compared their internal activity and decision patterns to recordings from primate brains and behavioral tests. We fitted simple “scaling laws” to see how brain-alignment scores change as we increase model parameters, dataset size, and compute.
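As a rough illustration of this kind of fit, the sketch below fits a saturating power law to alignment scores as a function of model size. The functional form (a score ceiling minus a power-law decay term), the synthetic data points, and the initial guesses are all hypothetical and chosen for illustration; they are not the paper's actual parameterization or results.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical saturating scaling law:
#   score(N) = s_max - a * N^(-b)
# s_max is the asymptotic alignment ceiling, a and b control
# how quickly scores approach it as parameter count N grows.
def scaling_law(n, s_max, a, b):
    return s_max - a * np.power(n, -b)

# Illustrative synthetic data: alignment score vs. parameter count
# (made-up numbers, not results from the paper).
n_params = np.array([1e6, 1e7, 1e8, 1e9])
scores = np.array([0.30, 0.42, 0.47, 0.49])

# Fit the three free parameters; p0 is a hand-picked starting guess.
popt, _ = curve_fit(scaling_law, n_params, scores,
                    p0=[0.5, 50.0, 0.4], maxfev=10000)
s_max, a, b = popt
# A fitted s_max above the best observed score indicates saturation:
# further scaling buys diminishing gains toward the ceiling s_max.
```

A flattening fitted curve (small residual gains as N grows) is what "neural alignment saturates" looks like under this form, whereas behavioral alignment would keep tracking the rising part of its curve.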
We found that although larger models and more data keep improving behavioral alignment (how well AI decisions match primate choices), the similarity of model activity to actual brain neurons levels off. This means that more data and compute alone won’t yield better brain models, and future work should explore new architectures and training methods to truly capture how our visual system works.
Link To Code: https://github.com/epflneuroailab/scaling-primate-vvs
Primary Area: Applications->Neuroscience, Cognitive Science
Keywords: scaling laws, neural alignment, behavioral alignment, computer vision, primate visual ventral stream
Submission Number: 15849