What Makes a Machine Learning Task a Good Candidate for an Equivariant Network?

Published: 17 Jun 2024, Last Modified: 11 Jul 2024, ICML 2024 Workshop GRaM, CC BY 4.0
Track: Extended abstract
Keywords: Equivariant architectures, parameter and training data scaling
TL;DR: We investigate which properties of a machine learning problem suggest that an equivariant architecture is likely to be helpful.
Abstract: Because of the prevalence of symmetry in real-world data, the development of deep learning architectures that incorporate this structure into network design has become an important area of research. Empirically, however, equivariant architectures tend to provide more benefit in some settings than in others. Since the development of new equivariant layers is a substantial research task, and since existing equivariant architectures tend to be more complex and harder for non-experts to work with, identifying the situations where architectural equivariance is likely to bring the most benefit is an important question for the practitioner. In this short paper we begin to explore this question. Our preliminary studies suggest that (i) equivariant architectures are more useful when the symmetry group is more complex and the data are higher-dimensional, (ii) aligning the type of equivariance with the symmetries in the task brings the most benefit, (iii) equivariant architectures tend to be beneficial across data regimes, and (iv) equivariant architectures display scaling behavior (as a function of training set size) similar to that of non-equivariant architectures.
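
As a brief illustration of what "architectural equivariance" means here (a minimal sketch, not code from the paper), the snippet below builds a DeepSets-style linear layer that commutes with permutations of its input rows: applying a permutation before the layer gives the same result as applying it after. The layer form and all names are illustrative assumptions.

# Minimal sketch of a permutation-equivariant layer:
# layer(P @ X) == P @ layer(X) for any permutation matrix P.
import numpy as np

rng = np.random.default_rng(0)

def perm_equivariant_linear(X, W, V):
    """DeepSets-style layer: a per-row linear map plus a shared mean term."""
    return X @ W + X.mean(axis=0, keepdims=True) @ V

n, d_in, d_out = 5, 3, 4
X = rng.normal(size=(n, d_in))
W = rng.normal(size=(d_in, d_out))
V = rng.normal(size=(d_in, d_out))

P = np.eye(n)[rng.permutation(n)]           # random permutation matrix
lhs = perm_equivariant_linear(P @ X, W, V)  # transform input, then apply layer
rhs = P @ perm_equivariant_linear(X, W, V)  # apply layer, then transform output

assert np.allclose(lhs, rhs)  # equivariance: both orders agree

The check passes because the mean over rows is unchanged by a row permutation, so the shared term is the same in both orders; this is the kind of symmetry constraint, baked into the layer itself, whose costs and benefits the paper investigates.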
Submission Number: 39