Teacher Guided Architecture Search

Pouya Bashivan; Mark Tensen; James J DiCarlo

Teacher Guided Architecture Search

Pouya Bashivan, Mark Tensen, James J DiCarlo

27 Sept 2018 (modified: 22 Jun 2025)ICLR 2019 Conference Blind SubmissionReaders: Everyone

Abstract: Strong improvements in neural network performance in vision tasks have resulted from the search of alternative network architectures, and prior work has shown that this search process can be automated and guided by evaluating candidate network performance following limited training (“Performance Guided Architecture Search” or PGAS). However, because of the large architecture search spaces and the high computational cost associated with evaluating each candidate model, further gains in computational efficiency are needed. Here we present a method termed Teacher Guided Search for Architectures by Generation and Evaluation (TG-SAGE) that produces up to an order of magnitude in search efficiency over PGAS methods. Specifically, TG-SAGE guides each step of the architecture search by evaluating the similarity of internal representations of the candidate networks with those of the (fixed) teacher network. We show that this procedure leads to significant reduction in required per-sample training and that, this advantage holds for two different search spaces of architectures, and two different search algorithms. We further show that in the space of convolutional cells for visual categorization, TG-SAGE finds a cell structure with similar performance as was previously found using other methods but at a total computational cost that is two orders of magnitude lower than Neural Architecture Search (NAS) and more than four times lower than progressive neural architecture search (PNAS). These results suggest that TG-SAGE can be used to accelerate network architecture search in cases where one has access to some or all of the internal representations of a teacher network of interest, such as the brain.

Keywords: hyperparameter search, architecture search, convolutional neural networks

TL;DR: Faster architecture search by maximizing representational similarity with a teacher network

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/teacher-guided-architecture-search/code)

12 Replies

Loading