MMCTR: A Multi-Task Model for Short Video CTR Prediction with Multi-Modal Video Content Features

Jinshan Wang, Qianfang Xu, Qiang Wang, Zhongjian Lyu, Jiaxin Chen, Wenchao Xu

Published: 2019, Last Modified: 16 May 2025, Venue: ICME Workshops 2019, License: CC BY-SA 4.0

Abstract: This paper presents MMCTR, a multi-task model for short video CTR prediction. The model takes four groups of input features: category features and long-term preference features derived from user behavior interaction data, multi-modal short video content features, and other manually extracted features. The model employs a weighted loss function, and its parameters are updated by joint training. In this way, each task can use information from the other tasks to further improve accuracy. In addition, a new hash mapping function is defined to compress the number of category ids and reduce memory usage. Experimental results on the dataset from the ICME 2019 Short Video Understanding Challenge show that MMCTR outperforms the tested state-of-the-art models.
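
The abstract does not give the form of the weighted loss or of the hash mapping function, so the following is only a minimal sketch of the two general ideas, not the authors' method. It assumes a per-task binary cross-entropy combined with fixed weights, and a simple CRC32-modulo bucket mapping for category ids. The names `hash_bucket`, `weighted_multitask_loss`, the task keys `task_a` and `task_b`, and all weights and bucket counts are hypothetical.

```python
import math
import zlib


def hash_bucket(category_id: str, num_buckets: int) -> int:
    """Map an arbitrary category id to one of num_buckets embedding rows.

    Assumption: a generic CRC32-modulo mapping, not the paper's function.
    Distinct ids may collide; that is the price of a smaller embedding table.
    """
    return zlib.crc32(category_id.encode("utf-8")) % num_buckets


def binary_cross_entropy(y_true: float, y_prob: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one example, with clipping for stability."""
    p = min(max(y_prob, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))


def weighted_multitask_loss(labels: dict, probs: dict, weights: dict) -> float:
    """Weighted sum of per-task losses; all tasks share one training signal."""
    return sum(
        weights[task] * binary_cross_entropy(labels[task], probs[task])
        for task in labels
    )


# Hypothetical usage: two prediction tasks trained jointly.
labels = {"task_a": 1.0, "task_b": 0.0}
probs = {"task_a": 0.5, "task_b": 0.5}
weights = {"task_a": 0.7, "task_b": 0.3}  # assumed values, not from the paper
loss = weighted_multitask_loss(labels, probs, weights)

# A raw id from a very large id space is mapped into a 1000-row table.
row = hash_bucket("user_123456789", 1000)
```

In a joint-training setup of this kind, the single scalar `loss` would be backpropagated through the shared layers, which is how one task can benefit from the gradients of another. The bucket count trades memory against collision rate and would normally be tuned per feature.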