Abstract: Optimization-induced deep equilibrium models (OptEqs) expose the connection between a network's structure and an underlying hidden optimization problem (the problem the network's forward procedure implicitly solves). However, we find that the linear kernels used in their hidden optimization problems limit performance, since linear kernels cannot capture non-linear feature dependencies in the inputs. Inspired by classical kernel methods in machine learning, we construct the hidden optimization problem with the widely used Gaussian kernel and propose a new deep equilibrium model, KerDEQ. With the Gaussian kernel, KerDEQ extracts non-linear information from the input features more effectively than the original OptEqs. Moreover, KerDEQ can be regarded as a weight-tied neural network of infinite width and depth, which further improves its performance. KerDEQ also exhibits better uncertainty calibration and behaves more stably under various corruptions, particularly noise, which we attribute to the Gaussian kernel hidden optimization problem and the structure it induces. We conduct extensive experiments to demonstrate the effectiveness and reliability of KerDEQ.
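Below is a minimal, hypothetical sketch of the kind of forward pass the abstract describes: a weight-tied equilibrium layer iterated to a fixed point, where the input enters through a Gaussian (RBF) kernel map rather than a linear one. The abstract does not specify KerDEQ's actual update rule, so every name here (deq_forward, anchors, V, gamma) is an illustrative assumption, not the paper's method.

# Hypothetical sketch of a Gaussian-kernel equilibrium layer; not the
# paper's actual formulation, which the abstract does not spell out.
import numpy as np

def gaussian_kernel(x, anchors, gamma=1.0):
    """k(x, a_j) = exp(-gamma * ||x - a_j||^2) for each anchor point a_j."""
    sq_dists = np.sum((anchors - x) ** 2, axis=1)
    return np.exp(-gamma * sq_dists)

def deq_forward(x, W, V, b, anchors, gamma=1.0, tol=1e-6, max_iter=500):
    """Iterate z <- relu(W z + V^T k(x, anchors) + b) to an equilibrium z*."""
    # Kernelized input injection replaces the linear term U x of OptEqs.
    injection = V.T @ gaussian_kernel(x, anchors, gamma)
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_next = np.maximum(W @ z + injection + b, 0.0)  # relu update
        if np.linalg.norm(z_next - z) < tol:             # fixed point reached
            return z_next
        z = z_next
    return z

# Toy usage with assumed shapes: input dim d, m anchors, hidden dim h.
rng = np.random.default_rng(0)
d, m, h = 8, 16, 32
W = 0.3 * rng.standard_normal((h, h)) / np.sqrt(h)  # small norm => contraction
V = rng.standard_normal((m, h))
b = rng.standard_normal(h)
anchors = rng.standard_normal((m, d))
z_star = deq_forward(rng.standard_normal(d), W, V, b, anchors)

Scaling W to a small spectral norm keeps the iteration contractive, so the fixed point exists and the loop converges; this mirrors the well-posedness conditions typically imposed on equilibrium models.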
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
Supplementary Material: zip