The paper 'Pretraining Methods for Dialog Context Representation Learning' states: "another related paper found that a model can capture far longer dependencies when pretrained
with a suitable auxiliary task. This paper falls in line with the second goal by creating learning objectives that improve a representation to capture general-purpose information." You have also read the related paper mentioned there. Provide the full name of that work.