Expectation Maximization for Weakly Labeled Data

2001 (modified: 16 Jul 2019), ICML 2001, Readers: Everyone
Abstract: We call data weakly labeled if it has no exact label but rather a numerical indication of the correctness of the label “guessed” by the learning algorithm, a situation commonly encountered in reinforcement learning problems. The term emphasizes the similarities of our approach to known techniques for solving unsupervised and transductive problems. In this paper we present an on-line algorithm that casts the problem as a multi-armed bandit with hidden state and solves it iteratively within the Expectation-Maximization framework. The hidden state is represented by a parameterized probability distribution over states tied to the reward. The parameterization is formally justified, allowing for smooth blending between likelihood- and reward-based costs.