Using the Training History to Detect and Prevent Overfitting in Deep Learning Models

Hao Li; Gopi Krishnan Rajbahadur; Dayi Lin; Cor-Paul Bezemer; Zhen Jiang

Published: 01 Feb 2023, Last Modified: 13 Feb 2023. Submitted to ICLR 2023. Readers: Everyone

Keywords: overfitting, early stopping, deep learning

TL;DR: We propose a time series based method to: (1) detect overfitting in a trained model, and (2) prevent overfitting from happening in training.

Abstract: Overfitting of deep learning models on training data leads to poor generalizability on unseen data. Overfitting can be (1) prevented (e.g., using dropout or early stopping) or (2) detected in a trained model (e.g., using correlation-based methods). We propose a method that can both detect and prevent overfitting based on the training history (i.e., validation losses). Our method first trains a time series classifier on the training histories of overfit models. This classifier is then used to detect whether a trained model is overfit. In addition, the trained classifier can be used to prevent overfitting by identifying the optimal point at which to stop a model's training. We evaluate our method on its ability to identify and prevent overfitting in real-world samples (collected from papers published in the last 5 years at top AI venues). We compare our method against correlation-based detection methods and against the most commonly used prevention method (i.e., early stopping). Our method achieves an F1 score of 0.91, which is at least 5% higher than the current best-performing non-intrusive overfitting detection method. In addition, our method finds the optimal stopping point and avoids overfitting at least 32% earlier than early stopping, while achieving accuracy that is at least equal to, and often better than, that of early stopping.
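To make the two uses of the classifier concrete, here is a minimal Python sketch of the general idea. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which time series classifier is used, so a 1-nearest-neighbour classifier over resampled, min-max-normalized validation-loss curves stands in for it. The names `featurize`, `NearestNeighborHistoryClassifier`, and `find_stopping_epoch`, the window size of 10, and all loss values are hypothetical and synthetic.

```python
from math import sqrt


def resample(history, length=20):
    """Linearly interpolate a loss curve to a fixed number of points."""
    if len(history) == 1:
        return [history[0]] * length
    out = []
    for i in range(length):
        pos = i * (len(history) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(history) - 1)
        frac = pos - lo
        out.append(history[lo] * (1 - frac) + history[hi] * frac)
    return out


def normalize(curve):
    """Min-max scale a curve to [0, 1] so that only its shape matters."""
    low, high = min(curve), max(curve)
    if high == low:
        return [0.0] * len(curve)
    return [(v - low) / (high - low) for v in curve]


def featurize(history, length=20):
    return normalize(resample(history, length))


class NearestNeighborHistoryClassifier:
    """Stand-in classifier (an assumption): label of the closest training curve."""

    def fit(self, histories, labels):
        self.examples = [(featurize(h), y) for h, y in zip(histories, labels)]
        return self

    def predict(self, history):
        x = featurize(history)

        def distance(example):
            return sqrt(sum((a - b) ** 2 for a, b in zip(x, example[0])))

        return min(self.examples, key=distance)[1]


def find_stopping_epoch(classifier, val_losses, window=10):
    """Return the first epoch whose trailing window is classified as overfit."""
    for epoch in range(window, len(val_losses) + 1):
        if classifier.predict(val_losses[epoch - window:epoch]):
            return epoch
    return None


# Synthetic validation-loss histories; True marks an overfit model.
train_histories = [
    [1.0, 0.8, 0.6, 0.5, 0.45, 0.5, 0.6, 0.7, 0.8, 0.9],
    [0.9, 0.6, 0.4, 0.3, 0.35, 0.45, 0.55, 0.65, 0.7, 0.75],
    [1.0, 0.8, 0.65, 0.55, 0.48, 0.43, 0.40, 0.38, 0.37, 0.36],
    [0.9, 0.7, 0.55, 0.45, 0.4, 0.36, 0.33, 0.31, 0.30, 0.29],
]
train_labels = [True, True, False, False]
clf = NearestNeighborHistoryClassifier().fit(train_histories, train_labels)

# Detection: classify the full history of an already trained model.
u_shaped = [2.0, 1.5, 1.1, 0.9, 0.8, 0.85, 1.0, 1.2, 1.4, 1.6]
decreasing = [3.0, 2.0, 1.5, 1.2, 1.0, 0.9, 0.85, 0.82, 0.8, 0.79]
detected_u = clf.predict(u_shaped)
detected_dec = clf.predict(decreasing)

# Prevention: scan a running history and stop once a window looks overfit.
running = [0.8 ** i for i in range(15)] + [0.05 + 0.05 * i for i in range(1, 11)]
stop_epoch = find_stopping_epoch(clf, running, window=10)
```

In this sketch, detection classifies a completed validation-loss history, while prevention applies the same classifier to a sliding window over the losses observed so far. A real setup would train on many labeled histories and would likely use a stronger time series classifier than nearest-neighbour matching.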

Submission Area: General Machine Learning
