CoT: Cooperative Training for Generative Modeling of Discrete Data

Sidi Lu; Lantao Yu; Siyuan Feng; Yaoming Zhu; Weinan Zhang; Yong Yu

27 Sept 2018 (modified: 22 Jun 2025); ICLR 2019 Conference Blind Submission; Readers: Everyone

Abstract: We propose Cooperative Training (CoT) for training generative models that measure a tractable density for discrete data. CoT coordinately trains a generator G and an auxiliary predictive mediator M. The training target of M is to estimate a mixture density of the learned distribution G and the target distribution P, and that of G is to minimize the Jensen-Shannon divergence estimated through M. CoT achieves independent success without the necessity of pre-training via Maximum Likelihood Estimation or involving high-variance algorithms like REINFORCE. This low-variance algorithm is theoretically proved to be superior for both sample generation and likelihood prediction. We also theoretically and empirically show the superiority of CoT over most previous algorithms in terms of generative quality and diversity, predictive generalization ability and computational cost.
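As a toy illustration of the quantity named in the abstract, the sketch below computes the Jensen-Shannon divergence between two small categorical distributions by way of their balanced mixture, which is the density an ideal mediator M would estimate. It is not the authors' training procedure: the function names `kl` and `jsd_via_mediator` and the example distributions are hypothetical, and the exact mixture stands in for a learned mediator.

```python
import math

def kl(p, q):
    """KL(p || q) in nats for two categorical distributions given as lists."""
    # Terms with p_i == 0 contribute nothing and are skipped.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd_via_mediator(p, g):
    """Jensen-Shannon divergence computed through the mixture M = (P + G) / 2."""
    # Assumption: the exact mixture replaces a learned mediator here.
    m = [0.5 * (pi + gi) for pi, gi in zip(p, g)]
    return 0.5 * kl(p, m) + 0.5 * kl(g, m)

p = [0.5, 0.3, 0.2]   # hypothetical target distribution P
g = [0.2, 0.3, 0.5]   # hypothetical generator distribution G
print(jsd_via_mediator(p, g))
```

The divergence is symmetric, is zero when G matches P, and is bounded above by log 2 (in nats), which it reaches when P and G have disjoint support.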

Keywords: Generative Models, Sequence Modeling, Text Generation

TL;DR: We propose Cooperative Training (CoT), a novel training algorithm for generative modeling of discrete data.

Code: [desire2020/Cooperative-Training](https://github.com/desire2020/Cooperative-Training) + [Papers with Code: 1 community implementation](https://paperswithcode.com/paper/?openreview=SkxxIs0qY7)

Community Implementations: [CatalyzeX: 2 code implementations](https://www.catalyzex.com/paper/cot-cooperative-training-for-generative/code)
