TL;DR: An empirical study that examines the effectiveness of different encoder-decoder combinations for the task of dependency parsing
Abstract: Graph-based dependency parsing consists of two steps: first, an encoder produces a feature representation for each parsing substructure of the input sentence, which is then used to compute a score for the substructure; and second, a decoder} finds the parse tree whose substructures have the largest total score. Over the past few years, powerful neural techniques have been introduced into the encoding step which substantially increases parsing accuracies. However, advanced decoding techniques, in particular high-order decoding, have seen a decline in usage. It is widely believed that contextualized features produced by neural encoders can help capture high-order decoding information and hence diminish the need for a high-order decoder. In this paper, we empirically evaluate the combinations of different neural and non-neural encoders with first- and second-order decoders and provide a comprehensive analysis about the effectiveness of these combinations with varied training data sizes. We find that: first, when there is large training data, a strong neural encoder with first-order decoding is sufficient to achieve high parsing accuracy and only slightly lags behind the combination of neural encoding and second-order decoding; second, with small training data, a non-neural encoder with a second-order decoder outperforms the other combinations in most cases.
Keywords: dependency parsing, high order decoding, empirical study
Original Pdf: pdf
3 Replies
Loading