Abstract: Micro-video popularity prediction (MVPP) plays a crucial role in various downstream applications. Recently, multimodal methods that integrate multiple modalities to predict popularity have exhibited impressive performance. However, these methods face several unresolved issues: (1) limited contextual information and (2) incomplete modal semantics. Incorporating relevant videos and fully fine-tuning pre-trained models can, in principle, address these issues, but this paradigm is suboptimal due to its weak transferability and the scarcity of downstream data. Inspired by prompt learning, we propose ICPF, a novel In-Context Prompt-augmented Framework to enhance popularity prediction. ICPF maintains a model-agnostic design, facilitating seamless integration with various multimodal fusion models. Specifically, a multi-branch retriever first retrieves similar modal content based on within-modality similarities. Next, an in-context prompt generator extracts semantic prior features from the retrieved videos and generates in-context prompts, enriching pre-trained models with valuable contextual knowledge. Finally, a knowledge-augmented predictor captures complementary features covering both modal semantics and popularity information. Extensive experiments on three real-world datasets demonstrate the superiority of ICPF over 14 competitive baselines.
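To make the three-stage pipeline described above concrete, the following is a minimal PyTorch sketch of how a multi-branch retriever, in-context prompt generator, and knowledge-augmented predictor could fit together. All module names, tensor shapes, the top-k retrieval, mean-pooling of retrieved features, and the attention-based fusion are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of the ICPF pipeline; component designs are assumed, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiBranchRetriever(nn.Module):
    """Retrieve the top-k most similar candidate videos for one modality (within-modality similarity)."""

    def __init__(self, k: int = 5):
        super().__init__()
        self.k = k

    def forward(self, query: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # query: (B, D) target-video features; memory: (N, D) candidate features of the same modality.
        sim = F.normalize(query, dim=-1) @ F.normalize(memory, dim=-1).T  # cosine similarity, (B, N)
        idx = sim.topk(self.k, dim=-1).indices                            # (B, k)
        return memory[idx]                                                # (B, k, D)


class InContextPromptGenerator(nn.Module):
    """Compress retrieved neighbors into prompt tokens that carry semantic prior knowledge."""

    def __init__(self, dim: int, n_prompts: int = 4):
        super().__init__()
        self.proj = nn.Linear(dim, dim * n_prompts)
        self.n_prompts = n_prompts

    def forward(self, retrieved: torch.Tensor) -> torch.Tensor:
        # retrieved: (B, k, D) -> pooled prior (B, D) -> in-context prompts (B, n_prompts, D)
        prior = retrieved.mean(dim=1)
        return self.proj(prior).view(prior.size(0), self.n_prompts, -1)


class KnowledgeAugmentedPredictor(nn.Module):
    """Fuse target-video features with in-context prompts and regress a popularity score."""

    def __init__(self, dim: int):
        super().__init__()
        self.fuse = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, 1)

    def forward(self, video_feat: torch.Tensor, prompts: torch.Tensor) -> torch.Tensor:
        # video_feat: (B, D) as query; prompts: (B, n_prompts, D) as key/value.
        q = video_feat.unsqueeze(1)
        fused, _ = self.fuse(q, prompts, prompts)
        return self.head(fused.squeeze(1))  # (B, 1) predicted popularity
```

In this sketch the retriever and prompt generator would be instantiated once per modality (e.g., visual, acoustic, textual), and their prompts concatenated before the predictor, which is one plausible reading of the multi-branch, model-agnostic design.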