From Feature Interaction to Feature Generation: A Generative Paradigm of CTR Prediction Models

Published: 01 May 2025, Last Modified: 18 Jun 2025, Venue: ICML 2025 poster, License: CC BY 4.0
Abstract: Click-Through Rate (CTR) prediction models estimate the probability that a user clicks on an item based on feature interactions, and so inherently follow a discriminative paradigm. However, this paradigm is prone to embedding dimensional collapse and information redundancy because of the limitations of vanilla feature embeddings. This motivates us to reformulate it as a generative paradigm that generates new feature embeddings. Unlike sequential recommendation, which naturally fits a generative "next-item prediction" paradigm, CTR models are hard to cast in this form because their features have no explicit order. We therefore propose a novel Supervised Feature Generation framework for CTR models, shifting from the discriminative "feature interaction" paradigm to the generative "feature generation" paradigm. Specifically, we predict each feature embedding from the concatenation of all feature embeddings. In addition, this paradigm naturally accommodates a supervised binary cross-entropy loss that indicates whether the sample is positive or negative. The framework can reformulate nearly every existing CTR model and yields significant performance gains. Moreover, it produces less-collapsed and less redundant feature embeddings, thereby mitigating the inherent limitations of the discriminative paradigm. The code can be found at https://github.com/USTC-StarTeam/GE4Rec.
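The generation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it is one possible reading of the abstract, written in plain Python. The names `generate_embeddings`, `predict_click_probability`, and `generators` are hypothetical, the per-field linear map with ReLU is an assumed choice of generator, and scoring by the inner product between generated and original embeddings is likewise an assumption made only so that the binary cross-entropy loss has a logit to act on.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (hypothetical initialisation)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def generate_embeddings(field_embeddings, generators):
    """Predict one new embedding per field from the concatenation of all
    field embeddings. The linear map + ReLU generator is an assumption."""
    concatenated = [x for emb in field_embeddings for x in emb]
    return [[max(0.0, v) for v in matvec(weights, concatenated)]
            for weights in generators]


def predict_click_probability(field_embeddings, generators):
    """Assumed scoring rule: sum of inner products between each generated
    embedding and the original embedding of the same field, then sigmoid."""
    generated = generate_embeddings(field_embeddings, generators)
    logit = sum(g * e
                for gen, emb in zip(generated, field_embeddings)
                for g, e in zip(gen, emb))
    return 1.0 / (1.0 + math.exp(-logit))


def binary_cross_entropy(probability, label, eps=1e-12):
    """Supervised loss: label is 1 for a click (positive), 0 otherwise."""
    p = min(max(probability, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Toy sample with 3 feature fields and 4-dimensional embeddings.
rng = random.Random(0)
num_fields, dim = 3, 4
field_embeddings = [[rng.uniform(-1, 1) for _ in range(dim)]
                    for _ in range(num_fields)]
generators = [make_matrix(dim, num_fields * dim, rng)
              for _ in range(num_fields)]

generated = generate_embeddings(field_embeddings, generators)
probability = predict_click_probability(field_embeddings, generators)
loss = binary_cross_entropy(probability, label=1)
```

Because every generated embedding is computed from the full concatenation, no feature order is needed, which is the property the abstract contrasts with "next-item prediction". In a trained model the generator weights and the embeddings would be learned jointly by minimising this loss over many samples.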
Lay Summary: CTR prediction models predict the probability that a user clicks on an item based on direct interactions among a set of features. However, we found that these direct interactions limit the information abundance of the learned features. We hypothesized that this issue could be addressed by avoiding direct interactions. Instead of letting the features interact directly, we propose to generate them under a new paradigm. Our results show that this simple yet effective paradigm can significantly enhance the information abundance of the learned features, because the feature generation process bypasses direct feature interactions. Our findings reveal inherent limitations of traditional CTR models, and show that these models can be easily reformulated with our feature generation paradigm to increase the information abundance of the learned features.
Link To Code: https://github.com/USTC-StarTeam/GE4Rec
Primary Area: Applications
Keywords: Generative model, Recommender system, Feature interaction
Submission Number: 1748