Beyond Fine Tuning: A Modular Approach to Learning on Small Data

Aryk Anderson, Kyle Shaffer, Artem Yankov, Court Corley, Nathan Hodas

Nov 04, 2016 (modified: Nov 04, 2016). ICLR 2017 conference submission.
  • Abstract: In this paper we present a technique for training neural network models on small amounts of data. Current methods for training neural networks on small amounts of rich data typically rely on strategies such as fine-tuning a pre-trained neural network or using domain-specific hand-engineered features. Here we treat network layers, or entire networks, as modules and combine pre-trained modules with untrained modules to learn the shift in distributions between data sets. The central benefit of the modular approach comes from adding new representations to a network, as opposed to replacing existing representations via fine-tuning. Using this technique, we are able to surpass the results of standard fine-tuning transfer learning approaches, and the performance gain over such approaches is significant when smaller amounts of data are used.
  • TL;DR: A better way to do deep learning with small amounts of training data
  • Keywords: Deep learning, Supervised Learning, Transfer Learning
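
The idea in the abstract of adding representations rather than replacing them can be illustrated with a minimal sketch. It is not the authors' architecture: the weights in `w_frozen` are made-up stand-ins for a pre-trained module, the toy data set is invented, and every function name is hypothetical. A frozen module and a new trainable module each produce features from the same input, the two feature vectors are concatenated, and only the new module and the output head are updated during training.

```python
import math
import random

def linear(weights, x):
    """Apply a weight matrix (a list of rows) to an input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def features(w_frozen, w_new, x):
    """Return (frozen, new) feature vectors computed from the same input."""
    frozen = [math.tanh(z) for z in linear(w_frozen, x)]
    new = [math.tanh(z) for z in linear(w_new, x)]
    return frozen, new

def predict(w_frozen, w_new, head, x):
    """Linear head over the concatenation of frozen and new features."""
    frozen, new = features(w_frozen, w_new, x)
    return sum(h * f for h, f in zip(head, frozen + new))

def train_step(w_frozen, w_new, head, x, y, lr=0.05):
    """One SGD step on squared error; w_frozen is read but never modified."""
    frozen, new = features(w_frozen, w_new, x)
    feats = frozen + new
    err = sum(h * f for h, f in zip(head, feats)) - y
    offset = len(frozen)
    # Update the new module first, using the head weights from before this step.
    for i, row in enumerate(w_new):
        grad = err * head[offset + i] * (1.0 - new[i] ** 2)
        for j in range(len(row)):
            row[j] -= lr * grad * x[j]
    # Then update the head, which sees both frozen and new features.
    for k in range(len(head)):
        head[k] -= lr * err * feats[k]

def total_loss(w_frozen, w_new, head, data):
    """Sum of half squared errors over the data set."""
    return sum(0.5 * (predict(w_frozen, w_new, head, x) - y) ** 2 for x, y in data)

random.seed(0)
# Assumption: these fixed values stand in for weights learned on a larger data set.
w_frozen = [[0.5, -0.3], [0.1, 0.8]]
w_frozen_before = [row[:] for row in w_frozen]
# The new module starts untrained, with small random weights.
w_new = [[random.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(2)]
head = [0.0] * 4
# Invented toy data with target y = x0 - x1.
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 0.0), ([-1.0, 0.5], -1.5)]

initial_loss = total_loss(w_frozen, w_new, head, data)
for _ in range(200):
    for x, y in data:
        train_step(w_frozen, w_new, head, x, y)
final_loss = total_loss(w_frozen, w_new, head, data)
print(initial_loss, final_loss)
```

Standard fine-tuning would instead update `w_frozen` itself, overwriting the pre-trained representation. Here it stays fixed, and the new module only has to capture whatever the existing features miss.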