Keywords: Algorithmic Reasoning, Inductive Inference, Spurious Correlation, Out-of-Distribution Generalization, Optimization
TL;DR: Introduce gradient frequency balancing to facilitate inductive inference in Algorithmic Reasoning.
Abstract: Inductive inference, the extrapolation of general rules from finite instances, is widely regarded as a foundation of human intelligence. Deep Neural Networks (DNNs), however, struggle with inductive inference and fail to learn even the simplest algorithms in Algorithmic Reasoning (AR). Existing research on AR with DNNs has been limited to architectural design. In this study, we investigate how optimization techniques influence AR performance. Through toy experiments designed to measure an optimizer's susceptibility to shortcuts in AR, we show that Adam, the default choice of optimizer, is easily fooled by spurious correlations. To overcome this shortcoming, we propose a novel optimizer that avoids spurious correlations by balancing gradients of low and high frequencies (BGF). Extensive experiments and analyses demonstrate the advantages of BGF across various architectures and AR tasks. In particular, BGF expands the AR capability of all explored DNN models and even shows the potential to enable learning of tasks at which they previously failed. The observed success of BGF in climbing the Chomsky hierarchy underscores the importance of optimization for developing advanced artificial intelligence with DNNs.
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6081