Large Margin Neural Language Models
Jiaji Huang, Yi Li, Wei Ping, Sanjeev Satheesh, Gregory Diamos
Feb 15, 2018 (modified: Feb 15, 2018), ICLR 2018 Conference Blind Submission, readers: everyone
Abstract: Neural language models (NLMs) are generative; they model the distribution of grammatical sentences. Trained on huge corpora, NLMs are pushing the limits of modeling accuracy. They have also been applied to supervised learning tasks that decode text, e.g., automatic speech recognition (ASR). By re-scoring the n-best list, an NLM can select grammatically more correct candidates from the list and significantly reduce the word/character error rate. However, the generative nature of an NLM may not guarantee discrimination between "good" and "bad" (in a task-specific sense) sentences, resulting in suboptimal performance. This work proposes an approach to adapt a generative NLM into a discriminative one. Unlike the commonly used maximum likelihood objective, the proposed method aims at enlarging the margin between "good" and "bad" sentences. It is trained end-to-end and can be widely applied to tasks that involve re-scoring of decoded text. Significant gains are observed in both ASR and statistical machine translation (SMT) tasks.
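The abstract describes the objective only at a high level. The sketch below shows one plausible form such a large-margin re-scoring loss could take; it is an illustrative assumption, not the paper's exact formulation. The function name large_margin_loss, the fixed margin of 1.0, and the pairing of a single "good" (reference) hypothesis against K competing "bad" n-best hypotheses are all assumptions, and the sentence-level scores are assumed to come from some neural LM scorer.

```python
import torch

def large_margin_loss(good_scores, bad_scores, margin=1.0):
    """Hinge-style loss pushing score(good) above score(bad) by a margin.

    good_scores: (B,) sentence-level NLM scores of the "good" hypotheses
    bad_scores:  (B, K) scores of K competing "bad" hypotheses per example
    """
    # Broadcast each good score against its K competing hypotheses.
    gaps = margin - (good_scores.unsqueeze(1) - bad_scores)  # (B, K)
    # Penalize only pairs where the margin is violated.
    return torch.clamp(gaps, min=0.0).mean()

# Toy usage: in practice these scores would be produced by the NLM itself,
# so the loss gradient flows back into the language model's parameters.
good = torch.tensor([-12.3, -8.7], requires_grad=True)
bad = torch.tensor([[-11.9, -13.0], [-9.1, -8.2]])
loss = large_margin_loss(good, bad)
loss.backward()
print(loss.item())
```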
TL;DR: Enhance the language model for supervised learning tasks
Keywords: Language Model, discriminative model