Abstract: Large-scale Web-based services present opportunities for improving UI policies based on observed user interactions. We address the challenges of learning such policies through offline reinforcement learning (RL). Deployed in a production system for user authentication in a major social network, our approach significantly improves long-term objectives. We articulate practical challenges, provide insights on training and evaluating offline RL, and discuss generalizations toward deploying offline RL in industry-scale applications.
External IDs: dblp:journals/ml/ApostolopoulosWWXZVZM24