Abstract: Linguistically informed features have proven useful in classifying implicit discourse relations between adjacent text spans. However, state-of-the-art methods in this area suffer from either a sparse corpus of natively implicit relations or a counter-intuitive corpus of artificially implicit ones, and consequently from either insufficient or distorted training when automatically learning discriminative features. To overcome this problem, this paper proposes a semantic-frame-based vector model for the unsupervised acquisition of semantically and relationally parallel data, aiming to enlarge the natively implicit relation corpus and thereby improve training. Experiments on PDTB 2.0 show that using the acquired parallel corpus yields statistically significant improvements over using the prototypical corpus.