OpenReview.net
Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent
William Merrill, Vivek Ramanujan, Yoav Goldberg, Roy Schwartz, Noah A. Smith
2021 (modified: 11 Nov 2021)
EMNLP (1) 2021
Readers: Everyone
Abstract:
William Merrill, Vivek Ramanujan, Yoav Goldberg, Roy Schwartz, Noah A. Smith. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021.