Charformer: Fast Character Transformers via Gradient-based Subword Tokenization

Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Prakash Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler

2022 (modified: 24 Apr 2023) | ICLR 2022 | Readers: Everyone

Abstract: State-of-the-art models in natural language processing rely on separate rigid subword tokenization algorithms, which limit their generalization ability and adaptation to new settings. In this paper...
