Keywords: recommender systems, calibration
Abstract: Accurate click-through and conversion-rate estimates are pivotal for bid optimization in large-scale advertising, yet modern deep CTR/CVR models are often miscalibrated. Classical global calibrators (Platt scaling, isotonic regression) and feature-based binning struggle to capture latent user–item heterogeneity. We approach calibration through the lens of latent, calibration-aware groupings and propose Variance-Reduced Semantic-Aware Grouping (VR-SAG), a lightweight post-hoc layer over a frozen backbone that (i) forms semantically coherent partitions in embedding space, (ii) fits per-group temperature+bias calibrators, and (iii) explicitly penalizes intra-group variance to tighten probability spreads.
Our design is grounded in a group-wise decomposition of proper scoring rules (e.g., Brier), which isolates intra-group variance as a key driver of residual miscalibration and motivates variance control for genuine loss reduction. To decouple evaluation from training, we introduce Logit-Cluster Calibration Error (LCCE), an unsupervised fixed-partition metric obtained via $K$-means in logit space; LCCE aligns with the reliability term of proper scores while avoiding pitfalls of trainable grouping heads used as metrics.
Across large-scale offline logs and AdAuction (an ad-auction dataset with oracle CTRs generated by an internal ad-auction simulator), VR-SAG consistently improves calibration (ECE/LCCE and Brier variants) over strong baselines, with negligible latency and memory overhead.
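To make the fixed-partition idea behind LCCE concrete, here is a minimal sketch under stated assumptions. The abstract only says that LCCE is obtained via $K$-means in logit space; the exact formula is not given here. The sketch assumes a size-weighted average of per-cluster gaps between mean predicted probability and empirical positive rate, in the style of ECE. The function names (`logit`, `kmeans_1d`, `cluster_calibration_error`) and the defaults for `k`, `iters`, and `seed` are hypothetical and not taken from the paper.

```python
import math
import random


def logit(p, eps=1e-7):
    """Map a probability to logit space, clipping to avoid infinities."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def kmeans_1d(values, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on scalars; returns one cluster index per value."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)  # requires k <= len(values)
    assign = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest center for each value.
        assign = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: mean of members; an empty cluster keeps its old center.
        new_centers = []
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assign


def cluster_calibration_error(probs, labels, k=10, seed=0):
    """Assumed LCCE-style metric: size-weighted |mean prediction - positive rate|
    over a fixed K-means partition of the predicted logits."""
    logits = [logit(p) for p in probs]
    assign = kmeans_1d(logits, k, seed=seed)
    n = len(probs)
    total = 0.0
    for j in range(k):
        idx = [i for i, a in enumerate(assign) if a == j]
        if not idx:
            continue  # empty clusters contribute nothing
        mean_p = sum(probs[i] for i in idx) / len(idx)
        mean_y = sum(labels[i] for i in idx) / len(idx)
        total += (len(idx) / n) * abs(mean_p - mean_y)
    return total
```

Because the partition depends only on the predicted logits and a fixed seed, it involves no trainable parameters, which is the property the abstract contrasts with trainable grouping heads used as metrics. Calling `cluster_calibration_error(probs, labels, k=2)` on a list of predicted probabilities and binary labels returns a value in $[0, 1]$, with lower values indicating better calibration on that partition.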
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 12969