Unified Scaling Laws for Routed Language Models
Aidan Clark, Diego de Las Casas, Aurelia Guy, Arthur Mensch, Michela Paganini, Jordan Hoffmann, Bogdan Damoc, Blake A. Hechtman, Trevor Cai, Sebastian Borgeaud, George van den Driessche, Eliza Rutherford, Tom Hennigan, Matthew J. Johnson, Albin Cassirer, Chris Jones, Elena Buchatskaya, David Budden, Laurent Sifre, Simon Osindero, et al. (6 additional authors not shown)
2022 (modified: 24 Apr 2023)
ICML 2022
Readers: Everyone
Abstract:
The performance of a language model has been shown to be effectively modeled as a power-law in its parameter count. Here we study the scaling behaviors of Routing Networks: architectures that condi...