Parameter Efficient Multimodal Transformers for Video Representation Learning

Sangho Lee, Youngjae Yu, Gunhee Kim, Thomas M. Breuel, Jan Kautz, Yale Song

2021 (modified: 16 Nov 2021)ICLR 2021Readers: Everyone

Abstract: The recent success of Transformers in the language domain has motivated adapting it to a multimodal setting, where a new visual model is trained in tandem with an already pretrained language model....

0 Replies