Abstract: Parameterized hypercomplex layers have recently emerged as useful alternatives to standard neural network layers. They allow for the construction of extremely lightweight architectures with little to no sacrifice of accuracy. We propose networks of Shared-Operation Parameterized Hypercomplex layers, where the operation parameterization is jointly learned by all layers. In this manner, we mitigate the computational burden of operation parameterization, which grows cubically with the hypercomplex dimension. We attain competitive word and character error rates at only a small fraction of the memory footprint of non-hypercomplex models, as well as of previous non-shared-operation hypercomplex ones (up to a \(96.8\%\) size reduction).
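Below is a minimal sketch, assuming a PyTorch-style parameterized hypercomplex multiplication (PHM) layer \(W=\sum_{i=1}^{n} A_i \otimes S_i\), of how a single learned set of algebra matrices \(A\) (the \(\mathcal{O}(n^3)\) operation parameterization) could be shared across layers; the class and parameter names are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SharedPHMLinear(nn.Module):
    """PHM linear layer whose algebra matrices A (shape n x n x n) are passed
    in from outside, so that every layer reuses one shared parameterization."""
    def __init__(self, shared_A: nn.Parameter, in_features: int, out_features: int):
        super().__init__()
        n = shared_A.shape[0]                      # hypercomplex dimension n
        assert in_features % n == 0 and out_features % n == 0
        self.A = shared_A                          # shared across all layers
        # Layer-specific low-parameter factors S_i of shape (out/n, in/n)
        self.S = nn.Parameter(
            0.02 * torch.randn(n, out_features // n, in_features // n))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Build W = sum_i kron(A_i, S_i), then apply it as a dense weight.
        W = torch.stack([torch.kron(self.A[i], self.S[i])
                         for i in range(self.A.shape[0])]).sum(dim=0)
        return x @ W.T

# Hypothetical usage: one shared A, registered once, reused by every layer.
n = 4
shared_A = nn.Parameter(torch.randn(n, n, n))
layer1 = SharedPHMLinear(shared_A, in_features=256, out_features=256)
layer2 = SharedPHMLinear(shared_A, in_features=256, out_features=128)
y = layer2(layer1(torch.randn(8, 256)))
```

Because the \(n \times n \times n\) tensor \(A\) is instantiated only once, the cubic cost of the operation parameterization is paid a single time rather than per layer, while each layer keeps only its small \(S\) factors.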