Abstract: We propose a multi-task learning approach to reducing text complexity that combines text summarization and simplification methods. Two datasets were used for this research: the Simple English Wikipedia dataset for simplification and the CNN/DailyMail dataset for summarization. We describe several experiments on reducing text complexity. One experiment consists of first training the model on summarization data and then fine-tuning it on simplification data. Another involves training the model on both datasets simultaneously while augmenting each source text with a task-specific tag that tells the model which task (summarization or simplification) to perform on that text. For comparison, models with the same architecture were also trained on each dataset separately. Our experiments show that the multi-task learning approach with task-specific tags is more effective than the fine-tuning approach, and that models trained on both tasks simultaneously perform as well on each task as models trained only for that specific task.
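The tag-augmentation idea described above can be sketched as follows. This is a minimal illustration, not the authors' code: the tag strings (`<summarize>`, `<simplify>`) and the helper `tag_examples` are assumptions chosen for clarity, and the toy pairs stand in for CNN/DailyMail and Simple English Wikipedia examples.

```python
import random

def tag_examples(pairs, tag):
    """Prepend a task-specific tag to every source text in (source, target) pairs."""
    return [(f"{tag} {src}", tgt) for src, tgt in pairs]

# Toy stand-ins for the two datasets (real training would load
# CNN/DailyMail and Simple English Wikipedia).
summarization_pairs = [("A long news article about an event ...", "Short summary.")]
simplification_pairs = [("A syntactically complex sentence ...", "A simple sentence.")]

# Mix both tasks into one training set; the tag tells the model
# which task to perform on a given text.
mixed = (tag_examples(summarization_pairs, "<summarize>")
         + tag_examples(simplification_pairs, "<simplify>"))
random.shuffle(mixed)  # interleave the two tasks within each epoch

for src, tgt in mixed:
    print(src, "=>", tgt)
```

A single sequence-to-sequence model trained on such a mixed, tagged dataset can then be prompted at inference time with the appropriate tag to select the desired task.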