UserBERT: Self-supervised User Representation Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: user representations, representation learning, self-supervised learning, pretraining, transfer learning
Abstract: This paper extends the BERT model to user data for pretraining user representations in a self-supervised way. By viewing actions (e.g., purchases and clicks) in behavior sequences (i.e., usage history) in an analogous way to words in sentences, we propose methods for the tokenization, the generation of input representation vectors and a novel pretext task to enable the pretraining model to learn from its own input, omitting the burden of collecting additional data. Further, our model adopts a unified structure to simultaneously learn from long-term and short-term user behavior as well as user profiles. Extensive experiments demonstrate that the learned representations result in significant improvements when transferred to three different real-world tasks, particularly in comparison with task-specific modeling and representations obtained from multi-task learning.
One-sentence Summary: On pretraining user representations via self-supervision
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=zlDTTdFxqy
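
The abstract does not spell out the tokenization scheme or the novel pretext task, so the following is only a minimal, hypothetical sketch of the general idea it describes: treating discrete user actions as tokens and pretraining a Transformer encoder with a BERT-style masked-action objective. All names (ActionTokenizer, TinyUserEncoder), the 15% mask rate, and the model sizes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the paper's method): actions are tokenized
# like words and a small Transformer is pretrained to recover masked actions.
import random
import torch
import torch.nn as nn

class ActionTokenizer:
    """Maps discrete user actions, e.g. ('click', 'item_7'), to integer token ids."""
    def __init__(self, actions):
        vocab = ["[PAD]", "[MASK]"] + sorted({f"{a}:{i}" for a, i in actions})
        self.token_to_id = {tok: idx for idx, tok in enumerate(vocab)}

    def encode(self, behavior_sequence):
        return [self.token_to_id[f"{a}:{i}"] for a, i in behavior_sequence]

class TinyUserEncoder(nn.Module):
    """A small Transformer encoder over a tokenized behavior sequence."""
    def __init__(self, vocab_size, d_model=32, n_heads=2, n_layers=1, max_len=64):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=64,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)  # predicts the action at each position

    def forward(self, token_ids):
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.tok_emb(token_ids) + self.pos_emb(positions)
        return self.head(self.encoder(x))

# Toy usage history: (action type, item) pairs.
history = [("click", "item_1"), ("click", "item_7"), ("purchase", "item_7"),
           ("click", "item_3"), ("click", "item_9")]
tokenizer = ActionTokenizer(history)
ids = torch.tensor([tokenizer.encode(history)])

# One masked-action pretraining step: hide some actions, predict them from context.
mask_id = tokenizer.token_to_id["[MASK]"]
masked_positions = random.sample(range(ids.size(1)), k=max(1, int(0.15 * ids.size(1))))
labels = torch.full_like(ids, -100)          # -100 = ignored by the loss
inputs = ids.clone()
for pos in masked_positions:
    labels[0, pos] = ids[0, pos]
    inputs[0, pos] = mask_id

model = TinyUserEncoder(vocab_size=len(tokenizer.token_to_id))
logits = model(inputs)
loss = nn.functional.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1),
                                   ignore_index=-100)
loss.backward()
```

Per the abstract, the actual model additionally combines long-term and short-term behavior with user profiles in a unified structure; this sketch only illustrates the actions-as-tokens analogy and self-supervision from the model's own input.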