Keywords: Geometric deep learning, equivariance, neural scaling laws
TL;DR: We study empirically how equivariant and non-equivariant networks scale with compute and training samples.
Abstract: Given large datasets and sufficient compute, is it beneficial to design neural architectures for the structure and symmetries of a problem, or is it more efficient to learn them from data? We study empirically how equivariant and non-equivariant networks scale with compute and training samples. Focusing on a benchmark problem of rigid-body interactions and general-purpose transformer architectures, we perform a series of experiments, varying the model size, training steps, and dataset size. We find evidence for three conclusions. First, equivariance improves data efficiency, but training non-equivariant models with data augmentation closes this gap. Second, scaling with compute follows a power law, with equivariant models outperforming non-equivariant ones at each tested compute budget. Finally, the optimal allocation of a compute budget between model size and training duration differs between equivariant and non-equivariant models.
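For illustration only: a compute scaling law of the form L(C) ≈ a · C^(−α) can be fit by linear regression in log-log space. The sketch below is a minimal, hypothetical example; the compute/loss values, the symbols a and α, and the fitting procedure are assumptions for illustration, not data or code from this submission.

import numpy as np

# Hypothetical (compute, loss) pairs -- illustrative placeholders, not results from the paper.
compute = np.array([1e15, 1e16, 1e17, 1e18])  # training compute, e.g. in FLOPs
loss = np.array([0.90, 0.52, 0.31, 0.18])     # test loss at each compute budget

# A power law L(C) = a * C**(-alpha) is linear in log-log space:
# log L = log a - alpha * log C, so a least-squares line fit recovers (a, alpha).
slope, intercept = np.polyfit(np.log(compute), np.log(loss), deg=1)
alpha, a = -slope, np.exp(intercept)

print(f"Fitted scaling law: L(C) ~ {a:.3g} * C^(-{alpha:.3f})")

Fitting such a curve separately for equivariant and non-equivariant models would allow comparing their exponents and offsets at matched compute budgets.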
Submission Number: 23