Deep Expectation-Maximization in Hidden Markov Models via Simultaneous Perturbation Stochastic Approximation

Chong Li; Dan Shen; C.J. Richard Shi; Hongxia Yang

25 Sept 2019 (modified: 05 May 2023), ICLR 2020 Conference Blind Submission, Readers: Everyone

Keywords: recommender system, gradient approximation, Hidden Markov Model

TL;DR: We render the Expectation-Maximization iteration as a network layer by approximating its gradient.

Abstract: We propose a novel method to estimate the parameters of a collection of Hidden Markov Models (HMMs), each of which is associated with a set of known features. The observation sequence of an individual HMM is noisy and/or insufficient, which makes parameter estimation based solely on that sequence challenging. The key idea is to combine the classical Expectation-Maximization (EM) algorithm with a neural network and train the two jointly in an end-to-end fashion, mapping each HMM's features to its parameters and effectively fusing information across different HMMs. To address the numerical difficulty of computing the gradient of the EM iteration, simultaneous perturbation stochastic approximation (SPSA) is employed to approximate it. We also provide a rigorous proof that the gradient approximated by SPSA converges to the true gradient almost surely. The efficacy of the proposed method is demonstrated on synthetic data as well as a real-world e-commerce dataset.
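
To illustrate the SPSA idea mentioned in the abstract, here is a minimal sketch of the generic two-sided SPSA gradient estimator for a scalar objective. It is not the authors' implementation: the function name `spsa_gradient`, the parameters `c` and `num_samples`, and the use of Rademacher (±1) perturbations are assumptions chosen for illustration, and the objective `f` stands in for whatever loss depends on the EM iteration.

```python
import numpy as np

def spsa_gradient(f, theta, c=1e-3, num_samples=1, rng=None):
    """Estimate the gradient of a scalar function f at theta with SPSA.

    Hypothetical helper for illustration only. Each sample perturbs all
    coordinates at once with a random +/-1 vector and uses just two
    evaluations of f, regardless of the dimension of theta.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for _ in range(num_samples):
        # Rademacher perturbation: each entry is +1 or -1 with equal probability.
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        # Two-sided difference of f along the random direction.
        diff = f(theta + c * delta) - f(theta - c * delta)
        # Dividing elementwise by delta gives one noisy gradient estimate.
        grad += diff / (2.0 * c * delta)
    # Averaging independent samples reduces the variance of the estimate.
    return grad / num_samples
```

For example, calling `spsa_gradient(lambda t: float(np.sum(t ** 2)), np.array([1.0, -2.0, 0.5]), num_samples=20000)` yields an estimate close to the true gradient `2 * theta`. The appeal of the estimator is that its cost per sample does not grow with the number of parameters, which is useful when an exact gradient is numerically awkward to obtain.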
