Keywords: conditional computation, inference efficiency, parameter efficiency, large models
TL;DR: a parameter-efficient transfer learning method that also improves inference efficiency
Abstract: We propose Conditional Adapter (CoDA), a parameter-efficient transfer learning method that also improves inference efficiency. CoDA generalizes beyond standard adapter approaches to enable a new way of balancing speed and accuracy using conditional computation.
Starting with an existing dense pretrained model, CoDA adds sparse activation together with a small number of new parameters and a light-weight training phase.
Our experiments demonstrate that the CoDA approach provides an unexpectedly efficient way to transfer knowledge.
Across a variety of language, vision, and speech tasks, CoDA achieves a 2x to 8x inference speed-up compared to the state-of-the-art Adapter approaches with moderate to no accuracy loss and the same parameter efficiency.
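To make the idea concrete, below is a minimal PyTorch sketch of one way conditional computation can be layered on a frozen pretrained block, in the spirit of the abstract: every token passes through a small bottleneck adapter (the new parameters), while a learned router sends only the top-k tokens through the heavy frozen computation (the sparse activation). The class name, the router, and the top-k routing rule are illustrative assumptions, not the paper's exact method.

```python
import torch
import torch.nn as nn


class ConditionalAdapterLayer(nn.Module):
    """Illustrative conditional-adapter block (names and routing are assumptions).

    All tokens go through a small bottleneck adapter; only the top-k tokens
    selected by a learned router also go through the frozen pretrained
    feed-forward block, so the expensive computation is sparsely activated.
    """

    def __init__(self, frozen_ffn: nn.Module, d_model: int,
                 bottleneck: int = 64, k: int = 128):
        super().__init__()
        self.frozen_ffn = frozen_ffn              # pretrained block, kept frozen
        for p in self.frozen_ffn.parameters():
            p.requires_grad = False
        self.router = nn.Linear(d_model, 1)       # scores each token for selection
        self.adapter = nn.Sequential(              # small number of new parameters
            nn.Linear(d_model, bottleneck), nn.ReLU(), nn.Linear(bottleneck, d_model)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        out = x + self.adapter(x)                  # cheap path, applied to every token
        scores = self.router(x).squeeze(-1)        # (batch, seq_len)
        k = min(self.k, x.size(1))
        top_idx = scores.topk(k, dim=1).indices    # tokens routed to the heavy path
        idx = top_idx.unsqueeze(-1).expand(-1, -1, x.size(-1))
        selected = torch.gather(x, 1, idx)         # (batch, k, d_model)
        heavy = self.frozen_ffn(selected)          # frozen pretrained computation
        return out.scatter_add(1, idx, heavy)      # merge heavy results back in
```

Hard top-k selection is not differentiable with respect to the router scores, so a training recipe would typically use a soft or relaxed routing scheme; the sketch only illustrates the inference-time saving, where the frozen block processes k tokens instead of the full sequence.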
Submission Number: 14632