Improving Unsupervised Sentence Simplification Using Fine-Tuned Masked Language Models

Anonymous

16 Jan 2022 (modified: 05 May 2023) · ACL ARR 2022 January Blind Submission · Readers: Everyone
Abstract: Word suggestion in unsupervised sentence simplification is mostly performed without considering the context of the input sentence. Masked language modeling, however, is a well-established task for predicting the most suitable candidate for a masked token from its surrounding context. In this paper, we propose a technique that combines pre-trained BERT models with a successful edit-based unsupervised sentence simplification model to make its simple-word suggestion step context-aware. We then show that merely by fine-tuning the BERT model on enough simple sentences, simplification results can be improved, even outperforming some competing supervised methods. Finally, we introduce a framework that filters an arbitrary amount of unlabeled in-domain text for fine-tuning the model. By removing useless training samples, this preprocessing step speeds up fine-tuning in settings where labeled simple-complex sentence pairs are scarce.
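To illustrate the core idea of context-aware word suggestion, the sketch below queries a masked language model for replacement candidates via the Hugging Face Transformers fill-mask pipeline. It is only a minimal illustration under assumed names, not the paper's implementation; the generic bert-base-uncased checkpoint and the example sentence stand in for the fine-tuned model and data described in the abstract.

from transformers import pipeline

# Minimal sketch of context-aware word suggestion with a masked language model.
# "bert-base-uncased" is a placeholder for a checkpoint fine-tuned on simple
# sentences, as proposed in the abstract.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Mask the word to be simplified and let the MLM rank candidates in context.
sentence = "The committee will [MASK] the proposal next week."
for candidate in fill_mask(sentence, top_k=5):
    print(candidate["token_str"], round(candidate["score"], 3))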
Paper Type: long