Structure-Aware Path Inference for Neural Finite State Transducers

NeurIPS 2023 Workshop ICBINB, Submission 12

Published: 27 Oct 2023, Last Modified: 21 Dec 2023
Keywords: latent variable model, amortized variational approximation, proposal distribution
Abstract: Finite-state transducers (FSTs) are a traditional approach to string-to-string mapping. Each FST path specifies a possible alignment of input and output strings. Compared to an unstructured seq2seq model, the FST includes an explicit latent alignment variable and equips it with domain-specific hard constraints and featurization, which can improve generalization from small training sets. Previous work has shown how to score the FST paths with a trainable neural architecture; this improves the model's expressive power by dropping the usual Markov assumption but makes inference more difficult for the same reason. In this paper, we focus on the resulting challenge of imputing the latent alignment path that explains a given pair of input and output strings (e.g., during training). We train three autoregressive approximate models for amortized inference of the path, which can then be used as proposal distributions for importance sampling. All three models perform lookahead. Our most sophisticated (and novel) model leverages the FST structure to consider the graph of future paths; unfortunately, we find that it loses out to the simpler approaches, except on an artificial task that we concocted to confuse the simpler approaches.
Submission Number: 12
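
Illustrative sketch (not from the paper): the abstract describes using an autoregressive proposal over alignment paths for importance sampling. The toy Python example below shows that general recipe on an edit-distance-style lattice. Everything in it is an assumption made for illustration: the operation weights in op_weight are hypothetical, the proposal is a uniform choice among valid operations rather than a trained model, and the path score is Markov (so an exact dynamic program exists to check the estimate), whereas the neural scorer in the abstract drops the Markov assumption and would not admit such a check.

```python
import random


def op_weight(op, a, b):
    """Hypothetical unnormalized weight of one alignment operation."""
    if op == "sub":
        return 0.6 if a == b else 0.1
    return 0.15  # "ins" or "del"


def valid_ops(i, j, n, m):
    """Operations that keep the path inside the lattice ending at (n, m)."""
    ops = []
    if i < n and j < m:
        ops.append("sub")
    if i < n:
        ops.append("del")
    if j < m:
        ops.append("ins")
    return ops


def sample_path(x, y, rng):
    """Sample one path autoregressively from a uniform proposal.

    Returns (unnormalized model weight of the path, proposal probability).
    """
    n, m = len(x), len(y)
    i = j = 0
    weight, q = 1.0, 1.0
    while i < n or j < m:
        ops = valid_ops(i, j, n, m)
        op = rng.choice(ops)
        q *= 1.0 / len(ops)
        a = x[i] if i < n else None
        b = y[j] if j < m else None
        weight *= op_weight(op, a, b)
        if op in ("sub", "del"):
            i += 1
        if op in ("sub", "ins"):
            j += 1
    return weight, q


def importance_estimate(x, y, num_samples, seed=0):
    """Estimate the total weight of all paths as the mean of weight / q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        weight, q = sample_path(x, y, rng)
        total += weight / q
    return total / num_samples


def exact_marginal(x, y):
    """Exact sum over all paths by dynamic programming (Markov case only)."""
    n, m = len(x), len(y)
    total = [[0.0] * (m + 1) for _ in range(n + 1)]
    total[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                total[i][j] += total[i - 1][j - 1] * op_weight("sub", x[i - 1], y[j - 1])
            if i > 0:
                total[i][j] += total[i - 1][j] * op_weight("del", x[i - 1], None)
            if j > 0:
                total[i][j] += total[i][j - 1] * op_weight("ins", None, y[j - 1])
    return total[n][m]


if __name__ == "__main__":
    print("exact:   ", exact_marginal("abc", "abd"))
    print("estimate:", importance_estimate("abc", "abd", num_samples=20000))
```

The estimate is unbiased because the uniform proposal assigns nonzero probability to every valid path. A proposal that better matches the model's posterior over paths would lower the variance of the weights, which is the usual motivation for training the proposal instead of fixing it.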