Abstract: The performance of a deep network grows with the size of the network
and the training data in a predictable fashion. This has led to very large
networks that require ever-increasing memory and power. Several
studies have reported that learning generates redundant nodes that, in
principle, can be removed to produce more compact networks. Using the
concept of fibration symmetry from category theory, we propose an
exact algorithm to identify the neurons that execute redundant
computations, based on the weights of the network alone. We report
here that such fibration symmetries emerge in many of the major
network architectures.
By pruning these redundant nodes, we achieve nearly lossless
compression at scale: 31$\times$ compression of over-parameterized
Transformers while improving their parameter scaling law;
MLPs and CNNs reduced to 17--20\% of their original size; and LSTMs
in reinforcement learning reduced to 20\% of their parameters with
no loss in return.
Fibration compression is complementary to
existing quantization methods. Together, these methods may allow for
the deployment of powerful models on edge devices.
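
To make the idea of weight-based redundancy concrete, the following is a minimal sketch of its simplest special case, not the fibration algorithm proposed here. It assumes a two-layer ReLU network in which some hidden units have exactly identical incoming weights and biases. Such units compute the same activation on every input, so they can be merged exactly by keeping one representative and summing their outgoing weights. The function names `merge_duplicate_neurons` and `forward`, and the list-of-lists weight layout, are hypothetical choices made for illustration.

```python
from collections import defaultdict


def merge_duplicate_neurons(W1, b1, W2):
    """Merge hidden units whose incoming weights and bias are identical.

    W1: one weight row per hidden unit; b1: one bias per hidden unit;
    W2: one row per output unit, holding one weight per hidden unit.
    Illustrative simplification: only exact duplicates within a single
    layer are detected.
    """
    groups = defaultdict(list)
    for j, (row, bias) in enumerate(zip(W1, b1)):
        # Units sharing this key compute the same function of the input.
        groups[(tuple(row), bias)].append(j)
    members = list(groups.values())  # dicts preserve insertion order

    # Keep one representative per group ...
    W1_c = [list(W1[g[0]]) for g in members]
    b1_c = [b1[g[0]] for g in members]
    # ... and sum the outgoing weights of all units in the group.
    W2_c = [[sum(out_row[j] for j in g) for g in members] for out_row in W2]
    return W1_c, b1_c, W2_c


def forward(W1, b1, W2, x):
    """Two-layer network with ReLU hidden units and linear outputs."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return [sum(w * h for w, h in zip(out_row, hidden)) for out_row in W2]


# Hidden units 0 and 2 are duplicates, as are units 1 and 3.
W1 = [[1.0, -2.0], [0.5, 0.5], [1.0, -2.0], [0.5, 0.5]]
b1 = [0.1, 0.0, 0.1, 0.0]
W2 = [[1.0, 2.0, 3.0, 4.0]]

W1_c, b1_c, W2_c = merge_duplicate_neurons(W1, b1, W2)
x = [2.0, 0.5]
print(len(W1), "->", len(W1_c), "hidden units")
print(forward(W1, b1, W2, x), forward(W1_c, b1_c, W2_c, x))
```

The merged network has fewer hidden units but produces the same output up to floating-point rounding. This sketch only catches exact duplicates in one layer, so it should be read as a toy instance of lossless, weight-based merging rather than as the general symmetry detection described above.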