Channel-Max, Channel-Drop and Stochastic Max-pooling

CVPR Workshops 2015 (modified: 10 Nov 2022)
Abstract: We propose three regularization techniques that overcome drawbacks of the local winner-take-all methods used in deep convolutional networks. Channel-Max inherits the max activation unit from Maxout networks, but pairs the max function with complementary subsets of the input and with filters of different kernel sizes, which make better companions to the max operation. To balance training across the different pathways, Channel-Drop randomly discards half of the pathways before their inputs are convolved. Stochastic Max-pooling reduces the overfitting caused by conventional max-pooling: during training, half of the activations in each pooling region are randomly dropped, and during testing the largest activations are probabilistically averaged. Using Channel-Max, Channel-Drop and Stochastic Max-pooling, we demonstrate state-of-the-art performance on four benchmark datasets: CIFAR-10, CIFAR-100, STL-10 and SVHN.
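
The abstract alone does not specify the implementation, but a minimal sketch of Stochastic Max-pooling consistent with the description above might look as follows. This is an illustration, not the authors' code: the function name stochastic_max_pool2d, the non-overlapping k x k pooling regions, and the activation-proportional weighting of the top activations at test time are all assumptions.

```python
import torch
import torch.nn.functional as F

def stochastic_max_pool2d(x, k=2, top=2, training=True):
    """Sketch of Stochastic Max-pooling over non-overlapping k x k regions.

    Training: randomly drop half of the activations in each pooling region,
    then take the max of the survivors.
    Testing: average the `top` largest activations in each region, with
    weights proportional to their magnitudes (an assumed weighting).
    """
    n, c, h, w = x.shape
    # Collect each k x k pooling region as a column: (N, C*k*k, L)
    cols = F.unfold(x, kernel_size=k, stride=k)
    L = cols.shape[-1]
    cols = cols.view(n, c, k * k, L)

    if training:
        # Randomly mark half of the k*k positions in every region; setting
        # them to -inf removes them from the subsequent max.
        drop = torch.rand(n, c, k * k, L, device=x.device).argsort(dim=2) < (k * k) // 2
        out = cols.masked_fill(drop, float('-inf')).max(dim=2).values
    else:
        # Keep the `top` largest activations per region and average them,
        # weighted by their (rectified) magnitudes.
        vals, _ = cols.topk(top, dim=2)
        wts = vals.clamp(min=0) + 1e-12
        out = (vals * wts / wts.sum(dim=2, keepdim=True)).sum(dim=2)

    return out.view(n, c, h // k, w // k)

# Example: a 4-channel feature map pooled 2x2 in training and test modes.
x = torch.randn(8, 4, 32, 32)
print(stochastic_max_pool2d(x, training=True).shape)   # torch.Size([8, 4, 16, 16])
print(stochastic_max_pool2d(x, training=False).shape)  # torch.Size([8, 4, 16, 16])
```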