Keywords: Grokking, training dynamics, high dimensional structure, partition, spline
TL;DR: We show that deep neural networks grok adversarial examples in a wide range of practical settings where grokking had not previously been observed, and we explain this via the training dynamics of deep network linear regions.
Abstract: Grokking, or {\em delayed generalization}, is a phenomenon where generalization in a deep neural network (DNN) occurs long after the network achieves near-zero training error. Previous studies have reported grokking in specific controlled settings, such as DNNs initialized with large-norm parameters or transformers trained on algorithmic datasets. We demonstrate that grokking is much more widespread and materializes in a wide range of practical settings, such as training a convolutional neural network (CNN) on CIFAR10 or a ResNet on Imagenette. We introduce the new concept of {\em delayed robustness}, whereby a DNN groks adversarial examples and becomes robust long after interpolation and/or generalization. We develop an analytical explanation for the emergence of both delayed generalization and delayed robustness based on the {\em local complexity} of a DNN's input-output mapping. Our {\em local complexity} measures the density of so-called ``linear regions'' (a.k.a. spline partition regions) that tile the DNN input space.
Student Paper: Yes
Submission Number: 44
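To make the abstract's notion of local complexity concrete, the following is a minimal sketch of one possible proxy, not the authors' estimator: for a ReLU multilayer perceptron, it counts how many distinct hidden-unit activation patterns appear among random points sampled around an input. Each distinct pattern corresponds to a different linear region, so the count is a sampled lower bound on the number of regions meeting that neighborhood. The function names, the L-infinity sampling box, and the parameters `radius`, `n_samples`, and `seed` are all assumptions introduced for illustration.

```python
import numpy as np


def relu_mlp_patterns(weights, biases, points):
    """Return the on/off pattern of every hidden ReLU unit for each input point.

    weights/biases describe the hidden layers only (hypothetical layout:
    weights[i] has shape (in_dim, out_dim), biases[i] has shape (out_dim,)).
    """
    h = points
    patterns = []
    for W, b in zip(weights, biases):
        pre = h @ W + b                      # pre-activations of this layer
        patterns.append(pre > 0)             # which units are active
        h = np.maximum(pre, 0.0)             # ReLU
    return np.concatenate(patterns, axis=1).astype(np.uint8)


def local_complexity_proxy(weights, biases, x, radius=0.1, n_samples=256, seed=0):
    """Count distinct activation patterns among points sampled near x.

    Points are drawn uniformly from the box [x - radius, x + radius]; this
    sampling scheme is an assumption, chosen for simplicity.
    """
    rng = np.random.default_rng(seed)
    offsets = rng.uniform(-radius, radius, size=(n_samples, x.shape[0]))
    patterns = relu_mlp_patterns(weights, biases, x + offsets)
    return len(np.unique(patterns, axis=0))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # A small random network with two hidden layers, for illustration only.
    weights = [rng.normal(size=(2, 16)), rng.normal(size=(16, 16))]
    biases = [rng.normal(size=16), rng.normal(size=16)]
    x = np.zeros(2)
    for r in (0.01, 0.1, 1.0):
        print(r, local_complexity_proxy(weights, biases, x, radius=r))
```

Tracking such a count around training points, test points, and adversarially perturbed points over the course of training is one way to observe how the density of linear regions evolves; the measure used in the paper itself may be defined differently.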