Evolving Neural Networks' Weights at ImageNet Scale

16 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Supplementary Material: zip
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: optimization, evolution
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Building upon evolutionary theory, this work proposes an evolutionary optimization framework for deep neural networks (DNNs) that enhances existing pre-trained models, which are usually trained by backpropagation (BP). Specifically, we use a pre-trained model to generate an initial population of DNNs via BP with distinct hyper-parameters, and subsequently simulate the evolutionary process over this population. Moreover, we enhance the evolutionary process by developing an adaptive differential evolution (DE) algorithm, SA-SHADE-tri-ensin, which integrates the strengths of two DE algorithms, SADE and SHADE, with trigonometric mutation and a sinusoidal schedule for the mutation rate. Compared to existing techniques (e.g., ensembling, weight averaging, and evolution-inspired methods), the proposed method yields larger improvements for pre-trained models (e.g., ResNet variants) on large-scale ImageNet. Our analysis reveals that DE with an adaptive trigonometric mutation strategy yields improved offspring at higher success rates, and highlights the importance of diversity in the parent population. Hence, the underlying mechanism merits further investigation and has implications for developing advanced neuro-evolutionary optimizers.
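For readers unfamiliar with the DE components named in the abstract, the sketch below illustrates the general shape of a DE loop that combines trigonometric mutation (in the style of Fan & Lampinen's trigonometric DE) with a sinusoidal schedule for the scale factor, applied to flattened weight vectors. This is a minimal illustration under stated assumptions, not the paper's SA-SHADE-tri-ensin implementation: the population size, schedule constants, crossover rate, and toy sphere fitness are placeholders chosen only for demonstration.

```python
import numpy as np

def trig_mutation(x1, x2, x3, f1, f2, f3):
    """Trigonometric mutation: donor is the centroid of three parents,
    perturbed toward the fitter ones (weights from their fitness values)."""
    total = abs(f1) + abs(f2) + abs(f3) + 1e-12
    p1, p2, p3 = abs(f1) / total, abs(f2) / total, abs(f3) / total
    centroid = (x1 + x2 + x3) / 3.0
    return (centroid
            + (p2 - p1) * (x1 - x2)
            + (p3 - p2) * (x2 - x3)
            + (p1 - p3) * (x3 - x1))

def sinusoidal_F(gen, max_gen, freq=0.25, base=0.5):
    """Illustrative sinusoidal schedule for the DE scale factor F."""
    return base * (np.sin(2.0 * np.pi * freq * gen) * (gen / max_gen) + 1.0)

def evolve(population, fitness_fn, max_gen=50, cr=0.9, rng=None):
    """Minimal DE loop (minimisation): mutate, binomial crossover, greedy selection."""
    if rng is None:
        rng = np.random.default_rng(0)
    pop = np.asarray(population, dtype=float)
    fit = np.array([fitness_fn(x) for x in pop])
    n, dim = pop.shape
    for gen in range(max_gen):
        F = sinusoidal_F(gen, max_gen)
        for i in range(n):
            r1, r2, r3 = rng.choice([j for j in range(n) if j != i], 3, replace=False)
            if rng.random() < 0.5:  # mix trigonometric and rand/1 mutation
                donor = trig_mutation(pop[r1], pop[r2], pop[r3], fit[r1], fit[r2], fit[r3])
            else:
                donor = pop[r1] + F * (pop[r2] - pop[r3])
            mask = rng.random(dim) < cr          # binomial crossover
            mask[rng.integers(dim)] = True       # inherit at least one donor gene
            trial = np.where(mask, donor, pop[i])
            f_trial = fitness_fn(trial)
            if f_trial <= fit[i]:                # greedy replacement
                pop[i], fit[i] = trial, f_trial
    return pop[np.argmin(fit)], fit.min()

# Toy usage: each individual stands in for a flattened weight vector
# (in the paper's setting these would come from BP-trained models).
if __name__ == "__main__":
    sphere = lambda x: float(np.sum(x ** 2))
    init_pop = np.random.default_rng(1).normal(size=(10, 5))
    best, best_f = evolve(init_pop, sphere)
    print(best_f)
```

In the paper's setting, the fitness function would be a validation metric of the model whose weights the individual encodes, and the adaptive control of F and the crossover rate would follow the SADE/SHADE-style success-history mechanisms rather than the fixed values used in this sketch.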
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 690