Abstract: We introduce ADAACT, a novel optimization algorithm that adjusts learning rates according to activation variance. Our method enhances the stability of neuron outputs by incorporating neuron-wise adaptivity during training, which in turn leads to better generalization; this complements conventional activation regularization methods. We evaluate ADAACT on the CIFAR and ImageNet image classification benchmarks, where it performs competitively with other state-of-the-art methods. Importantly, ADAACT bridges the gap between the convergence speed of Adam and the strong generalization of SGD while maintaining competitive execution times.
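To make the core idea concrete, the following is a minimal sketch of a neuron-wise, activation-variance-based learning-rate adjustment. It is an illustration of the general principle only, not the paper's exact update rule: the function name `adaptive_step_sketch`, its signature, and the specific inverse-standard-deviation scaling are all assumptions introduced here for exposition.

```python
import numpy as np

def adaptive_step_sketch(w, grad, activations, lr=1e-3, eps=1e-8):
    """Illustrative sketch (assumed form, not the paper's exact rule):
    scale each output neuron's step size inversely with the variance of
    its activations over the mini-batch, so neurons with volatile outputs
    take smaller steps.

    w           : (in_features, out_features) weight matrix of one layer
    grad        : gradient of the loss w.r.t. w, same shape as w
    activations : (batch, out_features) outputs of the layer's neurons
    """
    # Neuron-wise activation variance over the mini-batch.
    act_var = activations.var(axis=0)              # shape: (out_features,)
    # Per-neuron scaling: higher variance -> smaller effective learning rate.
    scale = 1.0 / (np.sqrt(act_var) + eps)         # shape: (out_features,)
    # Broadcast each neuron's scale across its incoming weights and update.
    return w - lr * grad * scale[np.newaxis, :]

# Example usage with random data.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 10))
grad = rng.normal(size=(64, 10))
acts = rng.normal(size=(32, 10))
w_new = adaptive_step_sketch(w, grad, acts)
```

The design choice illustrated here is that adaptivity is applied per neuron (per output column of the weight matrix) rather than per individual parameter as in Adam; the actual ADAACT update is specified in the method section of the paper.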