Saucissonnage of Long Sequences into a Multi-encoder for Neural Text Summarization with Transformers
Abstract: Transformer-based deep models have gained considerable traction in neural text summarization. The problem with existing Transformer-based systems is that they truncate documents heavily before feeding them to the network. In this paper, we are particularly interested in summarizing long biomedical texts; however, the input sequences that current models accept are far shorter than the average length of biomedical articles. To address this problem, we propose two improvements to the original Transformer model that allow faster training on long sequences without penalizing summary quality. First, we split the input across four encoders to focus attention on smaller segments of the input. Second, we use end-chunk task training at the decoder level for progressive, fast decoding. We evaluate the proposed architecture on PubMed, a well-known biomedical dataset. A comparison with competitive baselines shows that our approach (1) can read long input sequences, (2) reduces training time considerably, and (3) slightly improves the quality of the generated summaries.
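The split-input design lends itself to a compact illustration. Below is a minimal, hypothetical PyTorch sketch of the multi-encoder idea as described in the abstract: the source is chunked into four segments, each processed by its own encoder stack, and the concatenated encoder states are fed to a single decoder. All names and hyperparameters (MultiEncoderSummarizer, d_model, etc.) are illustrative assumptions, not the authors' actual configuration, and details such as positional encodings, target masking, and the end-chunk training scheme are omitted.

```python
# Sketch only: splits a long input across several independent encoders,
# then decodes against the concatenated encoder memories.
import torch
import torch.nn as nn


class MultiEncoderSummarizer(nn.Module):  # hypothetical name
    def __init__(self, vocab_size=32000, d_model=512, nhead=8,
                 num_layers=6, num_encoders=4):
        super().__init__()
        self.num_encoders = num_encoders
        self.embed = nn.Embedding(vocab_size, d_model)
        # One independent encoder stack per input chunk.
        self.encoders = nn.ModuleList([
            nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
                num_layers)
            for _ in range(num_encoders)
        ])
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True),
            num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Split the long source into equal chunks so that each encoder's
        # self-attention operates over a shorter segment.
        chunks = torch.chunk(src_ids, self.num_encoders, dim=1)
        memories = [enc(self.embed(c)) for enc, c in zip(self.encoders, chunks)]
        memory = torch.cat(memories, dim=1)  # concatenated encoder states
        dec = self.decoder(self.embed(tgt_ids), memory)
        return self.out(dec)


model = MultiEncoderSummarizer()
src = torch.randint(0, 32000, (2, 2048))   # long source document
tgt = torch.randint(0, 32000, (2, 128))    # summary tokens so far
logits = model(src, tgt)                   # shape: (2, 128, 32000)
```

Because self-attention cost grows quadratically with sequence length, attending within four chunks of length n/4 rather than one sequence of length n is the intuition behind the claimed training speedup.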