Hyperparameter search for deep convolutional neural network using effect factors

Zhenzhen Li, Lianwen Jin, Chunlin Yang, Zhuoyao Zhong

2015 (modified: 04 Nov 2022), ChinaSIP 2015

Abstract: Training a deep architecture requires hyperparameter search, a difficult problem that is especially acute for convolutional neural networks because of their large number of hyperparameters. To address it, we propose a tensor completion method that predicts the best architecture configurations for convolutional neural networks. The method rests on the hypothesis that the generalization performance of a deep architecture is controlled by several effect factors, each of which is a function of the hyperparameters of the architecture. The predicted generalization accuracy of the best configurations is then checked by actually carrying out the deep learning computation. Because generalization performance on a practical recognition task is always data- and code-dependent, we evaluated the method on the open deep learning platform Caffe, where it raised generalization accuracy on MNIST from 98.97% to around 99.25% by replacing only five numbers.
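The abstract does not give the form of the effect factors or the completion algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes one effect factor per hyperparameter, that accuracy on a grid of configurations is approximately the product of those factors (a rank-1 tensor), and that the factors can be fitted by ordinary least squares in the log domain. The function names `fit_effect_factors`, `complete_tensor`, and `best_configuration` are hypothetical.

```python
import numpy as np

def fit_effect_factors(shape, observed):
    """Fit log(accuracy) = sum over hyperparameters d of g_d[index_d].

    shape    : number of candidate values per hyperparameter, e.g. (3, 2, 4)
    observed : list of (index_tuple, accuracy) pairs from configurations
               that were actually trained; accuracies must be positive
    Assumption (not from the paper): one multiplicative effect factor
    per hyperparameter, fitted by least squares on log-accuracy.
    """
    # Column offset of each hyperparameter's one-hot block in the design matrix.
    offsets = np.concatenate(([0], np.cumsum(shape)[:-1]))
    design = np.zeros((len(observed), sum(shape)))
    target = np.empty(len(observed))
    for row, (index, accuracy) in enumerate(observed):
        for d, i in enumerate(index):
            design[row, offsets[d] + i] = 1.0
        target[row] = np.log(accuracy)
    # The system is rank-deficient (factors are defined up to a shift),
    # so lstsq returns the minimum-norm solution.
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return [coef[o:o + n] for o, n in zip(offsets, shape)]

def complete_tensor(log_factors):
    """Predict accuracy for every configuration on the grid."""
    total = log_factors[0]
    for g in log_factors[1:]:
        total = np.add.outer(total, g)
    return np.exp(total)

def best_configuration(tensor):
    """Index tuple of the configuration with the highest predicted accuracy."""
    return tuple(int(i) for i in np.unravel_index(np.argmax(tensor), tensor.shape))
```

In this sketch, only a small subset of configurations needs to be trained, for instance by varying one hyperparameter at a time around a baseline; the completed tensor then ranks the untrained configurations, and the top candidates would still have to be trained to confirm the predicted accuracy, as the abstract describes.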
