Text Complexity And Linguistic Features: Is The Relationship Multilingual?


16 Nov 2021 (modified: 05 May 2023), ACL ARR 2021 November Blind Submission, Readers: Everyone
Abstract: Text complexity assessment is a challenging task that requires taking various linguistic aspects into consideration. Many studies have addressed this problem. Nevertheless, because their methods and corpora are quite diverse, it is hard to draw general conclusions about the effectiveness of linguistic information for evaluating text complexity. Moreover, the cross-lingual impact of different features on various datasets has not been investigated. We experimentally assessed seven commonly used feature types on six text complexity corpora, employing four common machine learning models. We showed which feature types can significantly improve performance and analyzed their impact with respect to dataset characteristics, language, and origin.
