Position: Not All Explanations for Deep Learning Phenomena Are Equally Valuable

Alan Jeffares; Mihaela van der Schaar

Published: 01 May 2025, Last Modified: 23 Jul 2025. Venue: ICML 2025 Position Paper Track (oral). License: CC BY 4.0

TL;DR: Deep learning phenomena do offer practical value, but we should carefully consider where exactly that value lies.

Abstract:

Developing a better understanding of surprising or counterintuitive phenomena has constituted a significant portion of deep learning research in recent years. These include double descent, grokking, and the lottery ticket hypothesis -- among many others. Works in this area often develop ad hoc hypotheses attempting to explain these observed phenomena on an isolated, case-by-case basis. This position paper asserts that, in many prominent cases, there is little evidence to suggest that these phenomena appear in real-world applications, and that these efforts may be inefficient in driving progress in the broader field. Consequently, we argue against viewing them as isolated puzzles that require bespoke resolutions or explanations. Nevertheless, we suggest that deep learning phenomena do still offer research value by providing unique settings in which we can refine our broad explanatory theories of more general deep learning principles. This position is reinforced by analyzing the research outcomes of several prominent examples of these phenomena from the recent literature. We revisit the current norms in the research community in approaching these problems and propose practical recommendations for future research, aiming to ensure that progress on deep learning phenomena is well aligned with the ultimate pragmatic goal of progress in the broader field of deep learning.

Lay Summary:

In this paper, we examine the methodology being used to study a particular sub-area of deep learning research that focuses on so-called deep learning phenomena. This area addresses interesting and unusual behavior observed in neural networks that can be isolated and analyzed. Although these phenomena (such as double descent, grokking, and the lottery ticket hypothesis) are heavily studied, we argue that many of them are unlikely to occur in practical applications. As such, treating them as puzzles to be solved on their own may not be the most productive research strategy. Instead, we suggest that their main value lies in how they can help us test and refine broader theories about how deep learning works. We provide examples of how this approach has led to useful insights and propose practical recommendations for making research in this area more aligned with the goals of the wider field.

Primary Area: Research Priorities, Methodology, and Evaluation

Keywords: Deep learning phenomena, double descent, grokking, lottery ticket hypothesis

Submission Number: 347