Abstract: Traditional deep model optimization methods discard the training weights, which contain information about the loss landscape that could guide further model optimization. In this paper, we show that a supervisor neural network can be used to predict the validation performance of another target neural network (the student) from its training weights. Based on this behavior, we propose a weight-loss pair-based training framework called REVAL, which uses a supervisor model to learn the trajectories of the student model's weight updates in order to reduce overfitting and improve the student's performance. We conduct experiments on the MNIST, CIFAR-10, and CIFAR-100 datasets with naive neural networks and show that classification performance can be improved even with such simple network structures and training trajectories. Our code and models are available at https://github.com/zhangenzhi/rebyval.
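The core idea, predicting validation performance from a student's training weights, can be illustrated with a minimal sketch. This is not the REVAL implementation: the toy two-parameter student, the quadratic losses, the `features` helper, and the `Supervisor` class (a linear regressor over quadratic features rather than a neural network) are all assumptions chosen to keep the example small and dependency-free.

```python
import random

def train_loss_grad(w):
    # Gradient of a toy training loss (w0 - 1)^2 + (w1 + 0.5)^2 (assumed).
    return [2 * (w[0] - 1.0), 2 * (w[1] + 0.5)]

def val_loss(w):
    # Toy validation loss whose minimum differs slightly from the training one.
    return (w[0] - 0.8) ** 2 + (w[1] + 0.3) ** 2

def features(w):
    # Hypothetical supervisor input: raw weights plus their squares.
    return list(w) + [x * x for x in w]

def collect_pairs(rng, runs=30, steps=5, lr=0.1):
    # Record (weights, validation loss) pairs along each student trajectory
    # instead of discarding the intermediate weights.
    pairs = []
    for _ in range(runs):
        w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        for _ in range(steps + 1):
            pairs.append((features(w), val_loss(w)))
            g = train_loss_grad(w)
            w = [wi - lr * gi for wi, gi in zip(w, g)]
    return pairs

class Supervisor:
    """Stand-in supervisor: linear regression trained by per-sample SGD."""

    def __init__(self, dim, lr=0.05):
        self.coef = [0.0] * dim
        self.bias = 0.0
        self.lr = lr

    def predict(self, x):
        return sum(c * xi for c, xi in zip(self.coef, x)) + self.bias

    def fit(self, pairs, epochs=200):
        for _ in range(epochs):
            for x, y in pairs:
                err = self.predict(x) - y  # squared-error residual
                self.coef = [c - self.lr * err * xi
                             for c, xi in zip(self.coef, x)]
                self.bias -= self.lr * err

rng = random.Random(0)
pairs = collect_pairs(rng)
supervisor = Supervisor(dim=4)
supervisor.fit(pairs)

probe = [0.2, 0.1]  # weights not taken from any recorded trajectory
print(supervisor.predict(features(probe)), val_loss(probe))
```

The two printed numbers are the supervisor's predicted validation loss and the true one for the probe weights. In the setting the abstract describes, the supervisor would be a neural network and the student a classifier trained on MNIST or CIFAR, but the data flow (weights in, predicted validation performance out) is the same.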