Arabic Multiword Expressions

Published: 2014, Last Modified: 03 Oct 2023
Language, Culture, Computation (3) 2014
Readers: Everyone
Abstract: In this work we address the problem of automatic multiword expression identification and classification in Arabic running text. We propose a supervised machine learning approach that uses a relatively small manually annotated data set augmented with increasing amounts of automatically tagged data, labeled using a deterministic pattern-matching algorithm. In particular, in this chapter, we show the impact of explicitly modeling morpho-syntactic features on the detection task. Moreover, we present the first work to address the problem of handling gapped verb-noun constructions in running text. We show that using the syntactic construction classes as labels improves identification results for verb-noun and verb-particle constructions. Our best identification algorithm yields an F-measure of 61.4%, which is a significant improvement over our baseline of 48.8%.
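
The abstract does not spell out the deterministic pattern-matching step, so the following is only a minimal sketch of how such automatic tagging with gapped constructions might look, not the chapter's actual algorithm. The lexicon entries (English placeholder lemmas), the class names `VNC` and `VPC`, the `max_gap` parameter, and the B-/I-/O label scheme are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical lexicon: (component lemmas, construction class).
# English placeholders stand in for Arabic lemmas.
LEXICON: List[Tuple[Tuple[str, ...], str]] = [
    (("take", "decision"), "VNC"),  # verb-noun construction
    (("give", "up"), "VPC"),        # verb-particle construction
]


def match_from(lemmas: Sequence[str], parts: Sequence[str],
               start: int, max_gap: int) -> Optional[List[int]]:
    """Return token positions of `parts` beginning at `start`, allowing
    up to `max_gap` intervening tokens between consecutive components."""
    if lemmas[start] != parts[0]:
        return None
    positions = [start]
    for part in parts[1:]:
        prev = positions[-1]
        # Candidate positions: the next token plus up to max_gap further ones.
        window = range(prev + 1, min(len(lemmas), prev + 2 + max_gap))
        hit = next((j for j in window if lemmas[j] == part), None)
        if hit is None:
            return None
        positions.append(hit)
    return positions


def tag_mwes(lemmas: Sequence[str],
             lexicon: Sequence[Tuple[Tuple[str, ...], str]] = LEXICON,
             max_gap: int = 2) -> List[str]:
    """Label each token with B-<class>, I-<class>, or O.
    Tokens inside a gap keep the O label."""
    labels = ["O"] * len(lemmas)
    for parts, cls in lexicon:
        start = 0
        while start < len(lemmas):
            positions = match_from(lemmas, parts, start, max_gap)
            if positions is None:
                start += 1
                continue
            for k, pos in enumerate(positions):
                labels[pos] = ("B-" if k == 0 else "I-") + cls
            start = positions[-1] + 1
    return labels


sentence = ["he", "take", "a", "hard", "decision", "and", "give", "up"]
print(list(zip(sentence, tag_mwes(sentence))))
```

Using the construction class as part of the label (for example `B-VNC` rather than a generic `B-MWE`) mirrors the abstract's observation that class-specific labels help identification; setting `max_gap=0` reduces the sketch to contiguous matching only.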