Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching

Anonymous

Sep 25, 2019 ICLR 2020 Conference Blind Submission readers: everyone Show Bibtex
  • TL;DR: Neural-linear bandits combine linear contextual bandits with deep neural networks to solve problems where both exploration and representation learning play an important role.
  • Abstract: We study neural-linear bandits for solving problems where both exploration and representation learning play an important role. Neural-linear bandits leverage the representation power of deep neural networks and combine it with efficient exploration mechanisms, designed for linear contextual bandits, on top of the last hidden layer. Since the representation is being optimized during learning, information regarding exploration with "old" features is lost. Here, we propose the first limited memory neural-linear bandit that is resilient to this catastrophic forgetting phenomenon. We perform simulations on a variety of real-world problems, including regression, classification, and sentiment analysis, and observe that our algorithm achieves superior performance and shows resilience to catastrophic forgetting.
0 Replies

Loading