Training Dynamics of Learning 3D-Rotational Equivariance

TMLR Paper5696 Authors

21 Aug 2025 (modified: 30 Aug 2025) · Under review for TMLR · CC BY 4.0
Abstract: While data augmentation is widely used to train symmetry-agnostic models, it remains unclear how quickly and how well such models learn to respect symmetries. We investigate this by deriving a principled measure of equivariance error that, for convex losses, computes the percentage of total loss attributable to imperfections in learned symmetry. We focus our empirical investigation on 3D-rotational equivariance in high-dimensional molecular tasks (flow matching, force-field prediction, voxel denoising) and find that models rapidly become nearly equivariant, within 1k-10k training steps, a result robust to model and dataset size. This happens because learning 3D-rotational equivariance is an easier task, with a smoother and better-conditioned loss landscape, than the main prediction task. We then theoretically characterize the learning dynamics of nearly equivariant models as ``stochastic equivariant learning dynamics'', via analyses that also hold beyond 3D rotations. For 3D rotations, the loss penalty for non-equivariant models remains small throughout training, so they may achieve lower test loss per GPU-hour than equivariant models unless the equivariant ``efficiency gap'' is narrowed.
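A generic probe of the equivariance error discussed in the abstract can be sketched as follows. This is a minimal illustration, not the paper's derived measure: it simply averages the squared deviation between f(Rx) and R f(x) over random 3D rotations R, for a model f acting on point-cloud inputs (all function names here are hypothetical).

```python
import numpy as np

def random_rotation(rng):
    # Sample a random 3D rotation via QR decomposition of a Gaussian matrix.
    A = rng.standard_normal((3, 3))
    Q, R = np.linalg.qr(A)
    Q = Q * np.sign(np.diag(R))   # fix column signs for a uniform distribution
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1             # ensure a proper rotation (det = +1)
    return Q

def equivariance_error(f, x, n_samples=100, seed=0):
    """Mean squared deviation between f(Rx) and R f(x) over random rotations.

    A generic equivariance-error probe for a map f: (N, 3) -> (N, 3);
    not the loss-based measure derived in the paper.
    """
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(n_samples):
        R = random_rotation(rng)
        errs.append(np.mean((f(x @ R.T) - f(x) @ R.T) ** 2))
    return float(np.mean(errs))

# An exactly equivariant map (here, the identity) has zero error.
x = np.random.default_rng(1).standard_normal((10, 3))
print(equivariance_error(lambda z: z, x))  # 0.0
```

A trained but symmetry-agnostic network would yield a small positive value here, which is the quantity whose rapid decay within 1k-10k steps the paper studies.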
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Bamdev_Mishra1
Submission Number: 5696