Constructing Multiple High-Quality Deep Neural Networks: A TRUST-TECH Based Approach

28 Sept 2020 (modified: 05 May 2023), ICLR 2021 Conference Blind Submission, Readers: Everyone
Keywords: Nonlinear Dynamical Systems, Global Optimization, Deep Neural Networks, Ensemble.
Abstract: The success of deep neural networks relies heavily on efficient stochastic gradient descent-like training methods. However, these methods are sensitive to initialization and hyper-parameters. In this paper, we introduce a systematic method for finding multiple high-quality local optimal deep neural networks from a single training session, using the TRUST-TECH (TRansformation Under Stability-reTaining Equilibria Characterization) method. To realize effective TRUST-TECH searches when training deep neural networks on large datasets, a dynamic search paths (DSP) method is proposed to provide improved search guidance for the TRUST-TECH method. The proposed DSP-TT method is implemented such that the computation graph remains constant during the search process; it incurs only minor GPU memory overhead and requires just one training session to obtain multiple local optimal solutions (LOSs). To take advantage of these LOSs, we also propose an improved ensemble method. Experiments on image classification datasets show that our method improves the testing performance by a substantial margin. Specifically, our fully-trained DSP-TT ResNet ensemble improves the SGD baseline by 20% (CIFAR10) and 15% (CIFAR100). Furthermore, our method shows several advantages over other ensembling methods.
One-sentence Summary: We propose a novel method of obtaining multiple diverse networks systematically in one training run.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=Ioqjv_GRM7