Personalization for web-based services using offline reinforcement learning

Pavlos Athanasios Apostolopoulos, Zehui Wang, Hanson Wang, Tenghyu Xu, Chad Zhou, Kittipat Virochsiri, Norm Zhou, Igor L. Markov

Published: 2024, Last Modified: 12 Aug 2025 | Mach. Learn. 2024 | CC BY-SA 4.0

Abstract: Large-scale Web-based services present opportunities for improving UI policies based on observed user interactions. We address challenges of learning such policies through offline reinforcement learning (RL). Deployed in a production system for user authentication in a major social network, our approach significantly improves long-term objectives. We articulate practical challenges, provide insights on training and evaluation of offline RL, and discuss generalizations toward offline RL's deployment in industry-scale applications.

External IDs: dblp:journals/ml/ApostolopoulosWWXZVZM24