\begin{abstract}
Deep latent variable models, such as variational autoencoders and energy-based models, are widely used for neural text generation. Most of these models focus on matching the prior distribution with the posterior distribution of the latent variable for text reconstruction. Beyond instance-level reconstruction, this paper integrates contrastive learning into the latent space, forcing the latent variables to learn high-level semantics by exploiting inter-instance relationships. Experiments on various text generation benchmarks demonstrate the effectiveness of the proposed method. We also show empirically that our method can mitigate the posterior collapse issue in latent-variable-based text generation models.
\end{abstract}