MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction

Published: 01 Jan 2023, Last Modified: 17 Sept 2023, Venue: KDD 2023, Readers: Everyone
Abstract: With the widespread application of online advertising systems, click-through rate (CTR) prediction has attracted growing research attention. The most prominent characteristics of CTR prediction are its multi-field categorical data format and its vast, daily-growing data volume (e.g., billions of user click logs). The large capacity of neural models helps digest such massive amounts of data under the supervised learning paradigm, yet these models fail to use the data to its full potential, since click signals alone are not sufficient for a model to learn capable representations of features and instances. The self-supervised learning paradigm provides a more promising pretrain-finetune solution that better exploits the large amount of user click logs and learns more robust and effective representations. However, existing work along this line is still preliminary, leaving self-supervised learning for CTR prediction an open question. To this end, we propose a Model-agnostic Pretraining (MAP) framework that applies feature corruption and recovery to multi-field categorical data. More specifically, we derive two practical algorithms: masked feature prediction (MFP) and replaced feature detection (RFD). MFP digs into feature interactions within each instance by masking and predicting a small portion of the input features, and we introduce Noise Contrastive Estimation (NCE) to handle large feature spaces. RFD further turns MFP into a binary classification task by replacing input features and detecting the changes, making it even simpler and more effective for CTR pretraining. Our extensive experiments on two real-world million-level datasets (i.e., Avazu and Criteo) demonstrate the advantages of these two methods over several strong baselines; both achieve new state-of-the-art results in terms of performance and efficiency for CTR prediction.
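
The two corruption schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `<MASK>` token, the corruption ratio, and the uniform sampling of replacement values are all assumptions made for illustration, and the NCE-based loss and the encoder itself are omitted. The sketch only shows how a multi-field categorical instance might be corrupted and what the recovery targets would look like in each case.

```python
import random

MASK = "<MASK>"  # hypothetical placeholder token, not taken from the paper


def num_to_corrupt(num_fields, ratio):
    """Number of fields to corrupt: at least one, at most all of them."""
    return min(num_fields, max(1, round(num_fields * ratio)))


def mask_features(instance, ratio, rng):
    """MFP-style corruption (assumed form): hide a portion of the fields.

    Returns the corrupted instance and a dict mapping each masked field
    index to the original value the model would have to predict.
    """
    chosen = rng.sample(range(len(instance)), num_to_corrupt(len(instance), ratio))
    corrupted = list(instance)
    targets = {}
    for i in chosen:
        targets[i] = instance[i]
        corrupted[i] = MASK
    return corrupted, targets


def replace_features(instance, field_vocab, ratio, rng):
    """RFD-style corruption (assumed form): swap a portion of the fields.

    Each chosen field gets a different value drawn uniformly from that
    field's own vocabulary. Returns the corrupted instance and one binary
    label per field: 1 if the field was replaced, 0 otherwise.
    """
    chosen = set(rng.sample(range(len(instance)), num_to_corrupt(len(instance), ratio)))
    corrupted, labels = [], []
    for i, value in enumerate(instance):
        candidates = [v for v in field_vocab[i] if v != value]
        if i in chosen and candidates:
            corrupted.append(rng.choice(candidates))
            labels.append(1)
        else:
            # Field left untouched (not chosen, or no alternative value exists).
            corrupted.append(value)
            labels.append(0)
    return corrupted, labels


# Toy instance with four categorical fields (values are made up).
instance = ["site_a", "app_x", "device_phone", "hour_14"]
field_vocab = [
    ["site_a", "site_b", "site_c"],
    ["app_x", "app_y"],
    ["device_phone", "device_tablet"],
    ["hour_13", "hour_14", "hour_15"],
]
rng = random.Random(0)
masked, mfp_targets = mask_features(instance, ratio=0.5, rng=rng)
replaced, rfd_labels = replace_features(instance, field_vocab, ratio=0.5, rng=rng)
```

In this sketch the MFP targets are values from a potentially very large feature space, whereas the RFD labels are a single bit per field, which is consistent with the abstract's description of RFD as a binary classification variant of MFP.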