Abstract: Recent research has demonstrated the significance of incorporating invariance into neural networks. However, existing methods require direct sampling over the entire transformation set, which is computationally taxing for large groups such as the affine group. In this study, we propose a more efficient approach that addresses the invariances of subgroups within a larger group. To tackle affine invariance, we split the affine group into the Euclidean group $E(n)$ and the uni-axial scaling group $US(n)$ and handle each invariance separately: we employ an $E(n)$-invariant model for $E(n)$-invariance, and we average model outputs over data augmented from a $US(n)$ distribution for $US(n)$-invariance. Our method maintains a favorable computational complexity of $\mathcal{O}(N^2)$ in 2D and $\mathcal{O}(N^4)$ in 3D, in contrast to the $\mathcal{O}(N^6)$ (2D) and $\mathcal{O}(N^{12})$ (3D) complexities of averaged models. Crucially, the scale range for augmentation adapts during training to avoid excessive scale invariance. This is the first time nearly exact affine invariance has been incorporated into neural networks without directly sampling the entire group. Extensive experiments confirm its superiority, achieving new state-of-the-art results on affNIST and SIM2MNIST classification while consuming less than 15% of the inference time and fewer computational resources and model parameters than averaged models.
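To make the averaging step concrete, the following is a minimal sketch of approximate $US(n)$-invariance by output averaging, assuming a hypothetical `model` that is already $E(n)$-invariant and point-cloud inputs of shape (B, N, n); the function name `us_invariant_forward`, the sampling distribution, and the fixed `log_scale_range` are illustrative assumptions (the paper adapts this range during training), not the authors' implementation.

```python
import torch

def us_invariant_forward(model, x, log_scale_range=(-0.5, 0.5), num_samples=8):
    """Approximate US(n)-invariance by averaging the outputs of an
    E(n)-invariant model over sampled uni-axial scalings.

    model:           hypothetical E(n)-invariant network
    x:               input points of shape (B, N, n)
    log_scale_range: bounds on the log-scale; the paper adapts this
                     range during training to avoid excessive invariance
    """
    lo, hi = log_scale_range
    outputs = []
    for _ in range(num_samples):
        # Sample one uni-axial scaling: scale a single axis, keep the rest.
        s = torch.exp(torch.empty(1).uniform_(lo, hi))
        axis = torch.randint(x.shape[-1], (1,)).item()
        x_aug = x.clone()
        x_aug[..., axis] = x_aug[..., axis] * s
        outputs.append(model(x_aug))
    # Averaging over sampled group elements yields approximate invariance.
    return torch.stack(outputs, dim=0).mean(dim=0)
```

Because the $E(n)$ part is handled exactly by the backbone, the Monte Carlo average only has to cover the low-dimensional scaling subgroup, which is what keeps the cost far below averaging over the full affine group.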