Online Semi-Supervised Learning with Bandit Feedback

Mikhail Yurochkin; Sohini Upadhyay; Djallel Bouneffouf; Mayank Agarwal; Yasaman Khazaeni

Published: 17 Apr 2019, Last Modified: 05 May 2023 | LLD 2019 | Readers: Everyone

Keywords: online learning, graph convolutional networks, contextual bandits

TL;DR: Synthesis of GCN and LinUCB algorithms for online learning with missing feedback

Abstract: We formulate a new problem at the intersection of semi-supervised learning and contextual bandits, motivated by several applications including clinical trials and dialog systems. We demonstrate how contextual bandits and graph convolutional networks can be adjusted to the new problem formulation. We then take the best of both approaches to develop a multi-GCN embedded contextual bandit. Our algorithms are evaluated on several real-world datasets.
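To make the problem setting concrete, the following is a minimal sketch of a standard disjoint LinUCB learner in which the reward for the chosen arm is sometimes unobserved. It is not the authors' method: the abstract does not specify how missing feedback is handled, so this baseline simply skips the update when the reward is missing, and it includes no GCN component. The class name `LinUCBMissingFeedback`, the helper `run_demo`, the synthetic data, and all parameter values are assumptions made for illustration.

```python
import numpy as np


class LinUCBMissingFeedback:
    """Disjoint LinUCB that skips the update when no reward is observed.

    Hypothetical baseline for illustration; not the method from the paper.
    """

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        # Per-arm ridge-regression statistics: A = I + sum(x x^T), b = sum(r x).
        self.A = np.stack([np.eye(dim) for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def select(self, x):
        """Return the arm with the highest upper confidence bound for context x."""
        scores = []
        for A_a, b_a in zip(self.A, self.b):
            A_inv = np.linalg.inv(A_a)
            theta = A_inv @ b_a  # ridge estimate of the arm's weights
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)  # exploration term
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Update the chosen arm; a reward of None means feedback is missing."""
        if reward is None:
            return  # assumption: unlabeled rounds are simply ignored
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


def run_demo(n_rounds=200, missing_prob=0.5, seed=0):
    """Run the learner on synthetic data where feedback is randomly withheld."""
    rng = np.random.default_rng(seed)
    n_arms, dim = 3, 3
    agent = LinUCBMissingFeedback(n_arms, dim, alpha=0.5)
    n_updates = 0
    for _ in range(n_rounds):
        true_arm = int(rng.integers(n_arms))
        # Context is a noisy one-hot encoding of the correct arm.
        x = np.eye(dim)[true_arm] + 0.1 * rng.normal(size=dim)
        arm = agent.select(x)
        reward = 1.0 if arm == true_arm else 0.0
        if rng.random() < missing_prob:
            reward = None  # feedback withheld for this round
        else:
            n_updates += 1
        agent.update(arm, x, reward)
    return agent, n_updates
```

Calling `run_demo()` returns the trained agent and the number of rounds in which feedback was actually observed. Discarding unlabeled rounds wastes the contexts seen in them, which is the gap a semi-supervised component such as a GCN would be expected to address.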
