Abstract: Evaluating the quality of text simplification (TS) is challenging, especially for models trained with unsupervised or reinforcement learning techniques, where reference data is unavailable. We introduce CEScore, a novel statistical model that evaluates TS quality without relying on references. By mimicking the way humans evaluate TS, CEScore provides four metrics ($S_{score}$, $G_{score}$, $M_{score}$, and $CE_{score}$) to assess simplicity, grammaticality, meaning preservation, and overall quality, respectively. In an experiment with 26 TS models, CEScore correlates strongly with human evaluations, achieving a Spearman correlation of 0.98 at the model level. This underscores the potential of CEScore as a simple and efficient metric for assessing the quality of TS models.
Paper Type: long
Research Area: Resources and Evaluation
Contribution Types: Model analysis & interpretability, Publicly available software and/or pre-trained models, Data analysis
Languages Studied: English