Context-Aware Temperature for Language Modeling

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Withdrawn Submission · Readers: Everyone
Keywords: natural language processing, language modeling, sequence modeling, temperature scaling
Abstract: Current practices for applying temperature scaling assume either a fixed temperature or a manually crafted, dynamically changing schedule. However, our studies indicate that the individual optimal temperature trajectory for each class can change with the context. To this end, we propose context-aware temperature, a generalized approach that provides an individual optimal temperature trajectory over the context for each vocabulary token, while allowing the temperature to be learned along with the remaining model parameters during training. Experimental results confirm that the proposed method significantly improves state-of-the-art language models, achieving a perplexity of 19.90 on Penn Treebank, 33.88 on WikiText-2, and 4.7 on WikiText-103.
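To make the mechanism concrete, here is a minimal PyTorch sketch of the idea the abstract describes: a small head predicts a positive, per-token temperature from the context vector, and the logits are rescaled before the softmax. The module and parameter names (ContextAwareTemperature, temp_head) are hypothetical illustrations under our own assumptions, not the authors' released implementation.

```python
# Minimal sketch of context-aware temperature scaling (PyTorch).
# Names and architecture here are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextAwareTemperature(nn.Module):
    """Predicts one temperature per vocabulary token from the context
    vector and rescales the logits before the softmax."""
    def __init__(self, hidden_dim: int, vocab_size: int):
        super().__init__()
        # Linear head mapping the context to per-token temperatures;
        # trained jointly with the rest of the language model.
        self.temp_head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, logits: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # Softplus keeps every temperature strictly positive.
        temperature = F.softplus(self.temp_head(context)) + 1e-6
        # Elementwise rescaling: each vocabulary token gets its own
        # context-dependent temperature at this time step.
        return F.log_softmax(logits / temperature, dim=-1)

# Usage: given decoder hidden states `h` (batch, hidden_dim) and raw
# logits `z` (batch, vocab_size) from a language model:
#   log_probs = ContextAwareTemperature(hidden_dim, vocab_size)(z, h)
```

Note that setting the head's output to a constant recovers ordinary fixed-temperature scaling, so this sketch is a strict generalization of the usual softmax(z / T).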
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We propose context-aware temperature, a mechanism that enables temperature scaling for language models based on the context of each word.
Reviewed Version (pdf): https://openreview.net/references/pdf?id=dUfrxjHT7w