Does equivariance matter at scale?

Published: 23 Oct 2024, Last Modified: 24 Feb 2025 · NeurReps 2024 Oral · CC BY 4.0
Keywords: Geometric deep learning, equivariance, neural scaling laws
TL;DR: We study empirically how equivariant and non-equivariant networks scale with compute and training samples.
Abstract: Given large datasets and sufficient compute, is it beneficial to design neural architectures for the structure and symmetries of a problem, or is it more efficient to learn them from data? We study empirically how equivariant and non-equivariant networks scale with compute and training samples. Focusing on a benchmark problem of rigid-body interactions and general-purpose transformer architectures, we perform a series of experiments, varying the model size, training steps, and dataset size. We find evidence for three conclusions. First, equivariance improves data efficiency, but training non-equivariant models with data augmentation closes this gap. Second, scaling with compute follows a power law, with equivariant models outperforming non-equivariant ones at each tested compute budget. Finally, the optimal allocation of a compute budget across model size and training duration differs between equivariant and non-equivariant models.
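To make the "scaling with compute follows a power law" claim concrete, the sketch below fits a pure power law L(C) = a · C^(−b) to a set of (compute, loss) measurements via a straight-line fit in log-log space. The numbers and the functional form without an irreducible-loss term are illustrative assumptions for this sketch, not the paper's actual measurements or fitting procedure.

```python
import numpy as np

# Hypothetical (training FLOPs, eval loss) pairs from a scaling sweep;
# placeholders, not the paper's reported results.
compute = np.array([1e15, 1e16, 1e17, 1e18, 1e19])  # training compute C in FLOPs
loss = np.array([0.52, 0.31, 0.19, 0.12, 0.075])    # eval loss at each budget

# Assume L(C) = a * C^(-b). Then log L = log a - b * log C, so the scaling
# exponent b is the negated slope of a linear fit in log-log space.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), deg=1)
b, a = -slope, np.exp(intercept)
print(f"L(C) ~ {a:.3g} * C^(-{b:.3f})")
```

Comparing the fitted exponent b and prefactor a between equivariant and non-equivariant model families is one simple way to read off which architecture makes better use of a given compute budget.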
Submission Number: 23