Normalisation is dead, long live normalisation!

Anonymous

Published: 28 Mar 2022, Last Modified: 05 May 2023, BT@ICLR2022, Readers: Everyone
Keywords: normalization, initialization, propagation, skip connections, residual networks
Abstract: Since the advent of Batch Normalisation (BN), almost every state-of-the-art (SOTA) method has used some form of normalisation. After all, normalisation generally speeds up learning and leads to models that generalise better than their unnormalised counterparts. It turns out to be especially useful when combined with skip connections, which are prominent in Residual Networks (ResNets), for example. However, Brock et al. (2021a) suggest that SOTA performance can also be achieved using ResNets without normalisation!
ICLR Paper: https://openreview.net/forum?id=IX3Nnir2omJ