Saucissonnage of Long Sequences into a Multi-encoder for Neural Text Summarization with Transformers
Abstract: Transformer-based deep models have gained considerable traction in neural text summarization. The problem with existing Transformer-based systems is that they truncate documents heavily before feeding them to the network. In this paper, we are particularly interested in summarizing long biomedical texts; however, the input sequences that current models accept are far shorter than the average length of biomedical articles. To address this problem, we propose two improvements to the original Transformer model that allow faster training on long sequences without penalizing summary quality. First, we split the input across four encoders to focus attention on smaller segments of the input. Second, we use end-chunk task training at the decoder level for progressive, fast decoding. We evaluate the proposed architecture on PubMed, a well-known biomedical dataset. A comparison with competitive baselines shows that our approach (1) can read long input sequences, (2) reduces training time considerably, and (3) slightly improves the quality of the generated summaries.
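The split-input design lends itself to a compact illustration. Below is a minimal, hypothetical PyTorch sketch of the multi-encoder idea as described in the abstract: the source is chunked into four segments, each processed by its own encoder stack, and the concatenated encoder states are fed to a single decoder. All names and hyperparameters (MultiEncoderSummarizer, d_model, etc.) are illustrative assumptions, not the authors' actual configuration, and details such as positional encodings, target masking, and the end-chunk training scheme are omitted.

```python
# Sketch only: splits a long input across several independent encoders,
# then decodes against the concatenated encoder memories.
import torch
import torch.nn as nn


class MultiEncoderSummarizer(nn.Module):  # hypothetical name
    def __init__(self, vocab_size=32000, d_model=512, nhead=8,
                 num_layers=6, num_encoders=4):
        super().__init__()
        self.num_encoders = num_encoders
        self.embed = nn.Embedding(vocab_size, d_model)
        # One independent encoder stack per input chunk.
        self.encoders = nn.ModuleList([
            nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
                num_layers)
            for _ in range(num_encoders)
        ])
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True),
            num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Split the long source into equal chunks so that each encoder's
        # self-attention operates over a shorter segment.
        chunks = torch.chunk(src_ids, self.num_encoders, dim=1)
        memories = [enc(self.embed(c)) for enc, c in zip(self.encoders, chunks)]
        memory = torch.cat(memories, dim=1)  # concatenated encoder states
        dec = self.decoder(self.embed(tgt_ids), memory)
        return self.out(dec)


model = MultiEncoderSummarizer()
src = torch.randint(0, 32000, (2, 2048))   # long source document
tgt = torch.randint(0, 32000, (2, 128))    # summary tokens so far
logits = model(src, tgt)                   # shape: (2, 128, 32000)
```

Because self-attention cost grows quadratically with sequence length, attending within four chunks of length n/4 rather than one sequence of length n is the intuition behind the claimed training speedup.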