Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions

Published: 13 Dec 2020, Last Modified: 24 Oct 2024, Venue: Joint Workshop on Multiword Expressions and Electronic Lexicons, Readers: Everyone, License: CC BY 4.0
Abstract:

We present edition 1.2 of the PARSEME shared task on identification of verbal multiword expressions (VMWEs). Lessons learned from previous editions indicate that VMWEs have low ambiguity, and that the major challenge lies in identifying test instances never seen in the training data. Therefore, this edition focuses on unseen VMWEs. We have split annotated corpora so that the test corpora contain around 300 unseen VMWEs, and we provide non-annotated raw corpora to be used by complementary discovery methods. We released annotated and raw corpora in 14 languages, and this semi-supervised challenge attracted 7 teams who submitted 9 system results. This paper describes the effort of corpus creation, the task design, and the results obtained by the participating systems, especially their performance on unseen expressions.
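
To make the notion of an unseen VMWE concrete, the following is a minimal sketch of how test instances might be partitioned into seen and unseen ones. It assumes that a VMWE counts as seen when the multiset of its component lemmas also occurs as an annotated VMWE in the training data; this criterion, the function names, and the toy data are illustrative assumptions, not the shared task's official evaluation code.

```python
def lemma_key(lemmas):
    """Order-insensitive key for a VMWE: its lemmas as a sorted tuple.

    Sorting keeps duplicates, so the key behaves like a multiset.
    """
    return tuple(sorted(lemmas))


def split_seen_unseen(train_vmwes, test_vmwes):
    """Partition test VMWEs by whether their lemma multiset occurs in training.

    Both arguments are lists of VMWEs, each VMWE being a list of lemmas
    (an assumed, simplified representation of the annotated corpora).
    """
    seen_keys = {lemma_key(v) for v in train_vmwes}
    seen, unseen = [], []
    for vmwe in test_vmwes:
        target = seen if lemma_key(vmwe) in seen_keys else unseen
        target.append(vmwe)
    return seen, unseen


# Toy data, invented for illustration only.
train = [["take", "decision"], ["give", "up"]]
test = [["decision", "take"], ["kick", "bucket"], ["give", "up"]]

seen, unseen = split_seen_unseen(train, test)
print(len(seen), len(unseen))
```

Under this assumed criterion, a different word order (as in passives or relative clauses) does not make an expression unseen, since only the lemma multiset is compared.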
