Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching

Tom Zahavy; Shie Mannor

Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching

Tom Zahavy, Shie Mannor

25 Sept 2019 (modified: 05 May 2023)ICLR 2020 Conference Blind SubmissionReaders: Everyone

TL;DR: Neural-linear bandits combine linear contextual bandits with deep neural networks to solve problems where both exploration and representation learning play an important role.

Abstract: We study neural-linear bandits for solving problems where both exploration and representation learning play an important role. Neural-linear bandits leverage the representation power of deep neural networks and combine it with efficient exploration mechanisms, designed for linear contextual bandits, on top of the last hidden layer. Since the representation is being optimized during learning, information regarding exploration with "old" features is lost. Here, we propose the first limited memory neural-linear bandit that is resilient to this catastrophic forgetting phenomenon. We perform simulations on a variety of real-world problems, including regression, classification, and sentiment analysis, and observe that our algorithm achieves superior performance and shows resilience to catastrophic forgetting.

Original Pdf: pdf

8 Replies

Loading