ContraGen: Effective Contrastive Learning For Causal Language Model

Nihal Jain; Dejiao Zhang; Wasi Uddin Ahmad; Zijian Wang; Feng Nan; Xiaopeng Li; Ming Tan; Ramesh Nallapati; Baishakhi Ray; Parminder Bhatia; Xiaofei Ma; Bing Xiang

ContraGen: Effective Contrastive Learning For Causal Language Model

Nihal Jain, Dejiao Zhang, Wasi Uddin Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Bing Xiang

Published: 01 Feb 2023, Last Modified: 22 Jun 2025Submitted to ICLR 2023Readers: Everyone

Keywords: Natural Language Processing, Contrastive Learning, Causal Language Model, Natural Language Generation and Discrimination, Code Generation and Discrimination

TL;DR: An effective contrastive learning approach that enhances both the generation and discrimination capability of causal language models

Abstract: Despite exciting progress in large-scale language generation, the expressiveness of its representations is severely limited by the \textit{anisotropy} issue where the hidden representations are distributed into a narrow cone in the vector space. To address this issue, we present ContraGen, a novel contrastive learning framework to improve the representation with better uniformity and discrimination at both sequence-level and token-level. We assess ContraGen on a wide range of downstream tasks in natural and programming languages. We show that ContraGen can effectively enhance both uniformity and discrimination of the representations and lead to the desired improvement on various language understanding tasks where discriminative representations are crucial for attaining good performance. Specifically, we attain $45.9\%$ relative improvement on the Semantic Textual Similarity tasks and $33.5\%$ on Code-to-Code Search tasks. Furthermore, by improving the expressiveness of the representations, ContraGen also boosts the source code generation capability with $9\%$ relative improvement on execution accuracy on HumanEval benchmark.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/contragen-effective-contrastive-learning-for/code)

14 Replies

Loading