Abstract: Multi-task learning (MTL) is a promising research direction in recommender systems, whose prediction accuracy greatly depends on the quality of the modeling of the relationships among tasks. Much prior research focuses on three tasks: predicting the click-through rate (CTR), the post-view click-through & conversion rate (CTCVR), and the post-click conversion rate (CVR), which rely on the inherent user action pattern of impression \(\rightarrow \) click \(\rightarrow \) conversion. The information cascade pattern, represented by the Adaptive Information Transfer Multi-task (AITM) framework, was the first attempt to model such sequential dependencies in the feature space close to the output. However, we observe that the first task in an information cascade model usually tends to be the victim, which is not in line with expectations. To this end, we propose a novel architecture: the Multi-task Balanced Information Cascade Network (MT-BICN). We set up both shared experts and task-specific experts for each task to provide a bottom-line guarantee on each task's performance, which largely reduces the risk of any task falling victim to the seesaw phenomenon. An information transfer unit (ITU) is designed and placed at the output layer of the top tower to explicitly model the sequential dependencies among tasks. In addition, to further improve the feature extraction capability of the bottom shared experts, the task-specific experts, and the task towers, we design individual optimization objectives for the BASE model without ITUs, as well as a balanced marginal constraint that encourages the introduction of ITUs to benefit the later tasks without harming the earlier ones. We conducted extensive experiments on open-source large-scale recommendation datasets from AliExpress. The experimental results show that our approach significantly outperforms mainstream MTL approaches for recommender systems. In addition, an ablation study demonstrates the necessity of the core modules in MT-BICN.
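The sequential dependency underlying the three tasks, and the role of an information transfer unit in the cascade, can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual architecture: the gated-fusion form of `itu`, the parameter matrices `W_t` and `W_c`, and the toy tower representations are all assumptions for illustration; the only relation taken from the abstract is the impression \(\rightarrow \) click \(\rightarrow \) conversion chain, under which CTCVR factorizes as CTR \(\times\) CVR.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def itu(prev_info, cur_tower, W_t, W_c):
    """Hypothetical information transfer unit: gated fusion of the
    representation transferred from the previous task with the
    current task's tower output (one plausible form, not the paper's)."""
    gate = sigmoid(cur_tower @ W_c)          # element-wise gate from current tower
    transferred = prev_info @ W_t            # project the earlier task's info
    return gate * cur_tower + (1.0 - gate) * transferred

rng = np.random.default_rng(0)
d = 8
x_click = rng.normal(size=d)   # toy click-task tower representation
x_conv = rng.normal(size=d)    # toy conversion-task tower representation
W_t = rng.normal(size=(d, d))
W_c = rng.normal(size=(d, d))

# Transfer click-task information into the conversion task at the output layer.
fused = itu(x_click, x_conv, W_t, W_c)

# Sequential dependency of the user action chain: a conversion requires a click,
# so the joint probability CTCVR = CTR * CVR can never exceed CTR.
p_ctr = sigmoid(x_click.sum())
p_cvr = sigmoid(fused.sum())
p_ctcvr = p_ctr * p_cvr
```

The gated form lets the later task decide how much of the earlier task's information to absorb, which is the kind of explicit sequential modeling the ITU is introduced for.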