Learning from Complaints: Adversarial Disentanglement for Robust Scalper Detection in E-Commerce Promotions

23 Apr 2026 (modified: 25 Apr 2026) | Under review for TMLR | Everyone | CC BY 4.0

Abstract: Identifying scalpers in e-commerce promotions is a critical challenge in which instance-dependent label noise is pervasive: legitimate users with ambiguous patterns (e.g., frequent on-the-hour purchases of high-subsidy items and orders shipped to non-habitual addresses) are often misclassified as scalpers, leading to user complaints and operational costs. This issue is further amplified in real-time risk control, where model iteration largely relies on historical review/penalty labels, forming a closed supervision loop that reinforces false positives as positives over time. Existing noise-handling methods (e.g., reweighting or filtering) largely treat such errors as random noise and fail to address the root cause: intrinsic feature overlap between scalpers and certain normal users. We propose GUARD (Grounded User-feedback Adversarial Representation Disentanglement), a complaint-aware framework that learns risk-predictive representations while being insensitive to complaint-triggering superficial cues. Here, grounded means that the adversarial supervision is anchored in complaint-verified false positives rather than raw complaints. GUARD defines a Confusion Domain from these verified cases and uses it as direct supervision for an adversarial objective based on a gradient reversal layer (GRL), encouraging the encoder to be invariant to Confusion-Domain membership while remaining predictive of scalper risk. The model is trained in a multi-task manner with a primary risk head (supervised by reliable enforcement labels) and an adversarial confusion head. To mitigate the scarcity and bias of verified complaints, we expand the Confusion Domain via MC Dropout uncertainty sampling, mining potential false-positive candidates from a large pool of processed candidate orders while filtering out high-confidence scalpers using existing high-precision blacklist rules to reduce contamination. We evaluate GUARD on a large-scale e-commerce promotion platform. In a 14-day online A/B test with thresholds calibrated to match recall, GUARD improves precision by +8.9 points and reduces the complaint rate by 13.5%, while keeping subsidy loss statistically unchanged. GUARD is now deployed in production.
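
The Confusion-Domain expansion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy one-layer logistic scorer, the use of dropout on input features, the choice of predictive entropy as the uncertainty score, and all names (`mc_dropout_probs`, `mine_confusion_candidates`, the `blacklist` set of user IDs) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def mc_dropout_probs(features, weights, bias, n_passes=50, drop_rate=0.2, rng=None):
    """Run stochastic forward passes of a toy logistic scorer.

    Assumption: dropout is applied to the input features (inverted dropout),
    standing in for dropout layers kept active at inference time.
    """
    if rng is None:
        rng = random.Random(0)
    keep = 1.0 - drop_rate
    probs = []
    for _ in range(n_passes):
        logit = bias
        for x, w in zip(features, weights):
            if rng.random() < keep:  # feature survives this pass
                logit += w * x / keep
        probs.append(sigmoid(logit))
    return probs


def binary_entropy(p):
    """Entropy of a Bernoulli(p) prediction, in nats."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def mine_confusion_candidates(orders, weights, bias, blacklist, top_k, seed=0):
    """Return the IDs of the top_k most uncertain orders.

    orders: iterable of (order_id, user_id, features) tuples (hypothetical schema).
    blacklist: user IDs matched by high-precision rules; these are skipped so
    that likely scalpers do not contaminate the candidate set.
    """
    rng = random.Random(seed)
    scored = []
    for order_id, user_id, features in orders:
        if user_id in blacklist:
            continue
        probs = mc_dropout_probs(features, weights, bias, rng=rng)
        mean_p = sum(probs) / len(probs)
        # Uncertainty score (an assumption): entropy of the mean prediction.
        scored.append((binary_entropy(mean_p), order_id))
    scored.sort(reverse=True)
    return [order_id for _, order_id in scored[:top_k]]


if __name__ == "__main__":
    demo_orders = [
        ("o1", "u1", [3.0, 0.0]),  # looks clearly risky
        ("o2", "u2", [1.0, 1.0]),  # ambiguous: signals cancel out
        ("o3", "u3", [1.0, 1.0]),  # ambiguous, but the user is blacklisted
    ]
    print(mine_confusion_candidates(demo_orders, [2.0, -2.0], 0.0, {"u3"}, top_k=1))
```

Orders whose averaged prediction sits near 0.5 receive the highest entropy and are proposed as potential false positives, while blacklisted users are excluded before scoring. In a real system the candidates would presumably still need verification before joining the Confusion Domain; the abstract does not specify that step.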

Submission Type: Long submission (more than 12 pages of main content)

Assigned Action Editor: ~Emanuele_Sansone1

Submission Number: 8574
