- Abstract: Recent years have seen remarkable progress in text generation across different settings, including the most common one of generating text from scratch and the increasingly popular paradigm of retrieval and editing, among others. Text infilling, which fills in missing portions of a sentence or paragraph, also has numerous real-world uses. Previous work has focused on restricted settings, either assuming a single word per missing portion or allowing only a single missing portion at the end of the text. This paper studies the general task of text infilling, where the input text can have an arbitrary number of portions to be filled, each of which may require an arbitrary, unknown number of tokens. We develop a self-attention model with segment-aware position encoding for precise global context modeling. We further create a variety of supervised data by masking out text in different domains with varying missing ratios and mask strategies. Extensive experiments show that the proposed model performs significantly better than other methods and generates meaningful text patches.
- Keywords: text generation, text infilling, self-attention, sequence-to-sequence
- TL;DR: We study the general task of text infilling, which fills in missing portions of a given text, and develop a self-attention model for it.
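
To make the task format concrete, the following is a minimal sketch in Python of how supervised infilling examples could be built by masking out spans. It assumes whitespace-tokenized input and a hypothetical `__m__` placeholder token; the helper names `mask_spans` and `fill` and the placeholder are illustrative assumptions, not taken from the paper, and the sketch covers only data construction, not the self-attention model.

```python
from typing import List, Sequence, Tuple

# Hypothetical blank marker; the paper's actual placeholder token is not given here.
PLACEHOLDER = "__m__"


def mask_spans(
    tokens: Sequence[str], spans: Sequence[Tuple[int, int]]
) -> Tuple[List[str], List[List[str]]]:
    """Replace each [start, end) span with one placeholder.

    Returns the template (text with blanks) and the list of answers,
    one variable-length token list per blank, in left-to-right order.
    """
    template: List[str] = []
    answers: List[List[str]] = []
    cursor = 0
    for start, end in sorted(spans):
        if start < cursor or end <= start or end > len(tokens):
            raise ValueError("spans must be non-empty, in range, and non-overlapping")
        template.extend(tokens[cursor:start])  # keep the visible context
        template.append(PLACEHOLDER)           # one blank per missing portion
        answers.append(list(tokens[start:end]))
        cursor = end
    template.extend(tokens[cursor:])
    return template, answers


def fill(template: Sequence[str], answers: Sequence[Sequence[str]]) -> List[str]:
    """Insert each answer into its blank, in order, to rebuild the full text."""
    remaining = iter(answers)
    out: List[str] = []
    for tok in template:
        if tok == PLACEHOLDER:
            out.extend(next(remaining))
        else:
            out.append(tok)
    return out


tokens = "the quick brown fox jumps over the lazy dog".split()
template, answers = mask_spans(tokens, [(1, 3), (5, 8)])
# template: ['the', '__m__', 'fox', 'jumps', '__m__', 'dog']
# answers:  [['quick', 'brown'], ['over', 'the', 'lazy']]
```

Each blank stands for an unknown number of tokens, so the template alone does not reveal how long a fill should be. Varying the number and length of the spans passed to `mask_spans` is one simple way to obtain different missing ratios.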