Abstract: Conditional variational auto-encoders (CVAEs) are a powerful deep generative framework that uses latent variables (explicitly modeled hidden states) to capture underlying factors and steer the generation process accordingly. However, this idea remains underexplored in the era of large language models (LLMs), owing to the structural mismatch between decoder-only LLMs and traditional encoder–decoder CVAEs, as well as to posterior collapse (latent variables degenerating into homogeneous, uninformative codes). In this work, we present the first attempt to extend decoder-only LLMs into encoder–decoder CVAEs, aiming to equip existing LLMs with flexible control via low-dimensional latent vectors. To this end, we introduce a novel optimization objective for effective latent variable modeling and propose a gradient-only skip (G-Skip) connection, which enhances generation controllability while preserving generation quality. Through experiments on AGNews, Yelp, and DailyDialog, we validate the effectiveness of our method for latent modeling and latent-guided language generation, built on Llama3-8B. Specifically, we establish new state-of-the-art performance in dialogue generation on the DailyDialog dataset, achieving a BERTScore of 88.30 and a FED score of 5.49.
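The abstract does not spell out how the gradient-only skip (G-Skip) connection is realized. A common way to build a connection that is invisible in the forward pass yet carries gradients in the backward pass is the stop-gradient identity; the minimal PyTorch sketch below illustrates that pattern under this assumption. The function name `gradient_only_skip` and the example tensors are hypothetical, not taken from the paper.

```python
import torch

def gradient_only_skip(main: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
    # Forward value equals `main` exactly: (skip - skip.detach()) is the zero tensor.
    # Backward pass still sends gradients through `skip`, because detach() cuts the
    # autograd graph only on the subtracted copy.
    return main + skip - skip.detach()

# Tiny check: the forward output ignores `skip`, yet `skip` receives gradients.
main = torch.randn(4, 8)
skip = torch.randn(4, 8, requires_grad=True)
out = gradient_only_skip(main, skip)
assert torch.equal(out, main)   # forward: identical to the main path
out.sum().backward()
assert skip.grad is not None    # backward: gradient reached the skip branch
```

Read this way, such a connection would leave the decoder's forward computation (and hence generation quality) untouched while still routing a training signal to the latent branch, which is consistent with the abstract's stated goal of improving controllability without degrading quality.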