Underfitting and Regularization: Finding the Right Balance
Keywords: underfitting, weight sharing, neural architecture search, knowledge distillation
Abstract: In this blog post, we go over the ICLR 2022 paper titled "Network Augmentation for Tiny Deep Learning". The paper introduces a new training method for improving the performance of tiny neural networks. Network Augmentation (NetAug) augments the network rather than the data: in a reversal of dropout, it embeds the tiny model into larger models and encourages it to work as a sub-model of those larger models, so that it receives extra supervision during training. NetAug targets small architectures such as MobileNetV2-Tiny, aiming for the highest achievable accuracy. The paper argues that training small neural networks differs from training large ones because the former are prone to underfitting rather than overfitting. NetAug therefore stands in contrast to traditional regularization techniques such as dropout and data augmentation, as well as to compression methods like network pruning and quantization. It can be viewed as a reversed form of dropout, since the target model is enlarged during training instead of shrunk. In this blog post, we identify some pitfalls of NetAug and propose potential workarounds based on knowledge distillation and neural architecture search.
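To make the "reverse dropout" idea concrete, below is a minimal PyTorch-style sketch of weight sharing between a tiny model and a wider augmented model, where the tiny model is a slice of the larger one and both losses supervise the shared weights. The widths, the alpha weighting, and the single training step are illustrative assumptions, not the paper's exact configuration or code.

```python
# Sketch of the NetAug idea: the tiny model's weights are a slice of a wider
# "augmented" model, and both the tiny loss and the augmented loss provide
# gradients to the shared slice. All sizes below are hypothetical.
import torch
import torch.nn.functional as F

d_in, d_tiny, d_aug, n_cls = 32, 16, 64, 10  # illustrative widths

# One shared parameter bank; the tiny model uses only the first d_tiny hidden units.
w1 = torch.nn.Parameter(torch.randn(d_aug, d_in) * 0.01)
w2 = torch.nn.Parameter(torch.randn(n_cls, d_aug) * 0.01)
opt = torch.optim.SGD([w1, w2], lr=0.1)

def forward(x, width):
    """Run the model restricted to the first `width` hidden units."""
    h = F.relu(F.linear(x, w1[:width]))   # (batch, width)
    return F.linear(h, w2[:, :width])     # (batch, n_cls)

alpha = 1.0  # weight of the auxiliary (augmented) loss, an assumed value
x = torch.randn(8, d_in)
y = torch.randint(0, n_cls, (8,))

opt.zero_grad()
loss_tiny = F.cross_entropy(forward(x, d_tiny), y)  # base supervision for the tiny model
loss_aug = F.cross_entropy(forward(x, d_aug), y)    # extra supervision from the enlarged model
(loss_tiny + alpha * loss_aug).backward()
opt.step()  # the shared slice w1[:d_tiny], w2[:, :d_tiny] gets gradients from both losses
```

At deployment only the tiny slice would be kept, so the augmented widths add training-time supervision without increasing inference cost.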
Blogpost Url: https://iclr-blogposts.github.io/staging/blog/2023/Underfitting-and-Regularization-Finding-the-Right-Balance/
ICLR Papers: https://openreview.net/pdf?id=TYw3-OlrRm-
ID Of The Authors Of The ICLR Paper: ~Han_Cai1, ~Chuang_Gan1, ~Ji_Lin1, ~song_han1
Conflict Of Interest: No