Almost-universal invariants for machine learning

Published: 25 Mar 2025, Last Modified: 20 May 2025 · SampTA 2025 Invited Talk · CC BY 4.0
Session: Invariant theory for machine learning (Dustin Mixon, Soledad Villar)
Keywords: point clouds; invariants; Galois theory; generic separation; equivariant learning
TL;DR: By mildly relaxing universality, we can learn almost all O(d)- and S_n-invariant functions on point clouds with an architecture not much bigger than the point cloud.
Abstract:

A standard way to assert that a machine learning (ML) architecture is expressive is to prove that it has a universal approximation guarantee, i.e., that it can approximate functions in a given target class to arbitrary precision on compact subsets of the data space. In invariant and equivariant ML, the function class in question respects some built-in symmetry. Such function classes tend to have a more complicated structure than unrestricted function classes; as a result, architectures may need to be large (relative to the data) in order to achieve universal approximation guarantees.
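To make the guarantee above concrete, here is one standard way to state it (the symbols F, H, X, K, and epsilon are my notation, not the talk's):

```latex
% Universal approximation: every function f in the (symmetry-respecting) target
% class F can be approximated to arbitrary precision, uniformly on any compact
% subset K of the data space X, by some function g realized by the architecture H.
\forall f \in \mathcal{F},\ \forall K \subset \mathcal{X} \ \text{compact},\ \forall \varepsilon > 0:\quad
\exists\, g \in \mathcal{H} \ \text{with}\ \sup_{x \in K} \bigl| f(x) - g(x) \bigr| < \varepsilon .
```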

In this talk, I discuss a workaround: jettisoning a small (measure-zero) subset of the data space in order to lower the computational cost of achieving universal approximation on the rest.
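One natural way to formalize this relaxation (my reading, in the same notation as above): a single measure-zero "bad" set N is fixed in advance, and the uniform guarantee is required only away from it.

```latex
% Almost-universal approximation: a fixed measure-zero set N is discarded, and
% the compact-uniform guarantee is required only on K \ N.
\exists\, N \subset \mathcal{X} \ \text{with}\ \mu(N) = 0 \ \text{such that}\
\forall f \in \mathcal{F},\ \forall K \subset \mathcal{X} \ \text{compact},\ \forall \varepsilon > 0:\quad
\exists\, g \in \mathcal{H} \ \text{with}\ \sup_{x \in K \setminus N} \bigl| f(x) - g(x) \bigr| < \varepsilon .
```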

The technique is illustrated on a type of symmetry relevant to point cloud data. Physical systems represented by point clouds tend to be simultaneously symmetric with respect to Euclidean isometries of space and to relabelings of the points. The known universal architectures that respect both symmetries at once are prohibitively large when the point cloud is large; but by giving up a measure-zero set of "bad" point clouds, we can achieve universality on the remaining ones with an architecture not much bigger than the input data. Joint work with Ningyuan (Teresa) Huang, Marco Cuturi, and Soledad Villar.
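As an illustration of the kind of invariant features involved (a minimal sketch, not the construction from the talk; the function name and the choice of sorted Gram-matrix features are mine), the snippet below maps a point cloud to a vector that is unchanged by rotations, reflections, translations, and relabelings of the points. Invariance holds by construction; whether such features separate all point clouds outside a measure-zero set is the kind of question the talk addresses.

```python
# A minimal sketch, assuming point clouds are stored as numpy arrays of shape (n, d).
# This is NOT the architecture from the talk; it only illustrates an O(d)- and
# S_n-invariant feature map (also translation-invariant, via centering).
import numpy as np

def invariant_features(X: np.ndarray) -> np.ndarray:
    """Return a vector invariant to Euclidean isometries and relabelings of the n points."""
    Xc = X - X.mean(axis=0)                # centering removes translations
    G = Xc @ Xc.T                          # Gram matrix: unchanged under X -> X Q, Q orthogonal
    rows = np.sort(G, axis=1)              # sort within each row: forgets column labels
    rows = rows[np.lexsort(rows.T[::-1])]  # sort rows lexicographically: forgets row labels
    return rows.ravel()

# Sanity check: a random rotation, relabeling, and translation leave the features unchanged.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
perm = rng.permutation(6)
assert np.allclose(invariant_features(X), invariant_features(X[perm] @ Q + 1.0))
```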

Submission Number: 25