Decay No More

Published: 02 May 2023, Last Modified: 02 May 2023 · Blogposts @ ICLR 2023
Keywords: weight decay, adam, optimization
Abstract: Weight decay is among the most important hyperparameters for reaching high accuracy with large-scale machine learning models. In this blog post, we revisit AdamW, the decoupled-weight-decay variant of Adam, summarizing empirical findings as well as theoretical motivations from an optimization perspective.
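The abstract contrasts AdamW with plain Adam. As a minimal sketch of the distinction (parameter names and the scalar single-parameter setup are illustrative, not from the post), AdamW applies the weight-decay term directly to the weights, decoupled from the adaptive gradient step:

```python
import math

def adamw_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, wd=1e-2):
    """One AdamW step on a single scalar parameter.

    The decay term wd * w is applied directly to the weights,
    decoupled from the moment estimates.
    """
    m = beta1 * m + (1 - beta1) * g       # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * g * g   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)          # bias corrections
    v_hat = v / (1 - beta2 ** t)
    # Decoupled decay: lr * wd * w is subtracted outside the adaptive scaling.
    w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * w)
    return w, m, v

# In Adam with L2 regularization, wd * w would instead be added to the
# gradient g *before* the moment updates, so the decay gets rescaled by
# the adaptive denominator — the coupling AdamW removes.
```

Under this sketch, a step with a large gradient history (large `v_hat`) shrinks the gradient term but leaves the decay term untouched, which is the core of the decoupling argument.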
Blogpost Url: https://iclr-blogposts.github.io/2023/blog/2023/adamw/
ICLR Papers: https://openreview.net/forum?id=Bkg6RiCqY7, https://openreview.net/forum?id=B1lz-3Rct7
ID Of The Authors Of The ICLR Paper: ~Ilya_Loshchilov1, ~Guodong_Zhang1
Conflict Of Interest: No
