Keywords: Neural Language Models, Unconditional Text Generation, Transformer
Abstract: Our goal is to adapt pre-trained neural language models (NLMs) to the unconditional text generation task within the target domain.
Because many Transformer-based NLMs are pre-trained on corpora far larger and more heterogeneous than the target domain,
this mismatch raises the question of whether such NLMs can retain their benefits for this task even after fine-tuning.
To address this problem, our approach uses topics to bridge the semantic gap between the pre-training corpora and the target-domain corpus,
relating them at the topic level.
Specifically, the approach injects topics into these NLMs and trains them on the topics underlying dependencies across segments,
introducing topic alignment (TA) and two training tasks (TDM and TEM),
whereas previous Transformer-based NLMs learn mainly from segments of a predefined length, such as the context window.
Experiments show that this approach helps resolve the imbalance between the pre-training corpora and the target domain,
and can tailor pre-trained NLMs to generate coherent and semantically valid text that reflects a given small fine-tuning corpus.
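To make the idea of topic injection concrete, here is a minimal sketch of one plausible realization: a learned topic embedding is added to the token embeddings of a pre-trained Transformer LM before fine-tuning on target-domain segments. The class name TopicInjectedLM, the choice of GPT-2 as backbone, the per-segment topic ids, and the suggestion of obtaining topics from an LDA model are all illustrative assumptions; the abstract does not specify the paper's exact TA, TDM, or TEM formulations.

```python
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer  # backbone choice is an assumption

class TopicInjectedLM(nn.Module):
    """Wraps a pre-trained Transformer LM and adds a learned topic embedding
    to each token embedding before the Transformer blocks.
    This is a sketch of topic injection, not the paper's exact method."""

    def __init__(self, num_topics: int, model_name: str = "gpt2"):
        super().__init__()
        self.lm = GPT2LMHeadModel.from_pretrained(model_name)
        hidden = self.lm.config.n_embd
        # One embedding vector per topic assigned to a segment (assumption).
        self.topic_emb = nn.Embedding(num_topics, hidden)

    def forward(self, input_ids, topic_ids, labels=None):
        # Token embeddings from the pre-trained model.
        tok = self.lm.transformer.wte(input_ids)          # (batch, seq, hidden)
        # Broadcast one topic vector per sequence over all positions.
        top = self.topic_emb(topic_ids).unsqueeze(1)      # (batch, 1, hidden)
        return self.lm(inputs_embeds=tok + top, labels=labels)

# Usage: fine-tune on target-domain segments paired with topic ids,
# e.g. produced by a topic model over the fine-tuning corpus (assumption).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TopicInjectedLM(num_topics=50)
batch = tokenizer(["an example target-domain segment"], return_tensors="pt")
topic_ids = torch.tensor([3])  # topic index assigned to this segment
out = model(batch["input_ids"], topic_ids, labels=batch["input_ids"])
out.loss.backward()
```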
One-sentence Summary: Our goal is to adapt pre-trained neural language models (NLMs) to the unconditional text generation task within a target domain.