Historical Text Normalization with Delayed Rewards

ACL 2019 (modified: 04 Nov 2022)
Abstract: Training neural sequence-to-sequence models with simple token-level log-likelihood is now a standard approach to historical text normalization, although such models are often outperformed by phrase-based models. Policy gradient training enables direct optimization for exact matches, and while the small datasets typical of historical text normalization make from-scratch reinforcement learning prohibitive, we show that policy gradient fine-tuning leads to significant improvements across the board. In particular, policy gradient training yields more accurate normalizations for long or unseen words.
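The sketch below illustrates the general idea of policy gradient (REINFORCE-style) fine-tuning with an exact-match reward, as described in the abstract. It is not the authors' code: the architecture (a tiny character-level GRU encoder-decoder), vocabulary size, hyperparameters, and all function names are illustrative assumptions. In practice the model would first be pretrained with token-level log-likelihood and then fine-tuned with the sampled-sequence objective shown here.

```python
# Minimal sketch (not the authors' implementation) of policy gradient
# fine-tuning for character-level normalization: sample an output string,
# reward exact matches against the gold normalization, and scale the
# sampled sequence's log-probability by (reward - baseline).

import torch
import torch.nn as nn

PAD, BOS, EOS = 0, 1, 2
VOCAB = 40          # illustrative character-vocabulary size
HID = 64
MAX_LEN = 20

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HID, padding_idx=PAD)
        self.enc = nn.GRU(HID, HID, batch_first=True)
        self.dec = nn.GRU(HID, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def encode(self, src):
        _, h = self.enc(self.emb(src))
        return h                                   # (1, B, HID)

    def decode_step(self, prev_tok, h):
        o, h = self.dec(self.emb(prev_tok), h)
        return self.out(o.squeeze(1)), h           # logits: (B, VOCAB)

def sample_with_logprobs(model, src):
    """Sample one normalization per source word; return tokens and log-probs."""
    B = src.size(0)
    h = model.encode(src)
    tok = torch.full((B, 1), BOS, dtype=torch.long)
    toks, logps = [], []
    for _ in range(MAX_LEN):
        logits, h = model.decode_step(tok, h)
        dist = torch.distributions.Categorical(logits=logits)
        tok = dist.sample().unsqueeze(1)
        toks.append(tok)
        logps.append(dist.log_prob(tok.squeeze(1)))
    return torch.cat(toks, dim=1), torch.stack(logps, dim=1)

def exact_match_reward(sampled, gold):
    """Reward 1.0 if the sampled string (up to EOS) equals the gold string, else 0.0."""
    rewards = []
    for s, g in zip(sampled.tolist(), gold.tolist()):
        s = s[:s.index(EOS)] if EOS in s else s
        g = [t for t in g if t not in (PAD, EOS)]
        rewards.append(1.0 if s == g else 0.0)
    return torch.tensor(rewards)

model = Seq2Seq()                                  # pretrained weights would be loaded here
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative fine-tuning step on a dummy batch of (historical, modern) pairs.
src  = torch.randint(3, VOCAB, (8, 12))
gold = torch.randint(3, VOCAB, (8, MAX_LEN))
sampled, logps = sample_with_logprobs(model, src)
reward = exact_match_reward(sampled, gold)
baseline = reward.mean()                           # simple variance-reducing baseline
loss = -((reward - baseline).unsqueeze(1) * logps).sum(dim=1).mean()
opt.zero_grad()
loss.backward()
opt.step()
```

The key design point this sketch reflects is that the reward is sequence-level and delayed: the model only learns whether the whole sampled normalization matched the gold form, rather than receiving token-by-token supervision as in log-likelihood training.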