Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation

Anonymous

16 Feb 2024
ACL ARR 2024 February Blind Submission
Readers: Everyone
Abstract: In this work, we propose a novel approach called Distillation Contrastive Decoding (DCD) to enhance the reasoning capabilities of Large Language Models (LLMs) during inference. Unlike previous approaches that rely on smaller amateur models or analyze differences in hidden states, DCD leverages contrastive chain-of-thought prompting together with lightweight distillation techniques, such as dropout and quantization, to address a key limitation of Contrastive Decoding, which typically requires both an expert and an amateur model and thereby increases computational demands. By integrating contrastive prompts with distillation, DCD obviates the need for a separate amateur model and reduces memory usage. Our evaluations show that DCD significantly improves LLM performance across various reasoning benchmarks, outperforming existing methods and achieving state-of-the-art results on both GSM8K and StrategyQA.
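To make the idea concrete, below is a minimal sketch of how the amateur distribution could be simulated from the expert model itself rather than from a second network: the same model is run once with a standard chain-of-thought prompt and once with dropout enabled on a contrastive (deliberately flawed) prompt, and the two next-token distributions are contrasted. The model name, the (1 + beta) * expert - beta * amateur scoring rule, the plausibility threshold alpha, and the dropout trick via model.train() are illustrative assumptions based on standard contrastive decoding, not the paper's exact recipe.

    # Sketch only: simulate the "amateur" with dropout on the same expert model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_NAME = "gpt2"  # placeholder; any causal LM with dropout works for this sketch
    tok = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    model.eval()

    def next_token_logits(prompt: str, use_dropout: bool) -> torch.Tensor:
        """Return next-token logits; enabling dropout weakens the same model,
        playing the role of the amateur without loading a second network."""
        model.train(use_dropout)  # train(True) activates dropout layers
        ids = tok(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits[0, -1]
        model.eval()
        return logits

    def dcd_step(expert_prompt: str, amateur_prompt: str,
                 beta: float = 0.5, alpha: float = 0.1) -> int:
        """One decoding step: contrast a CoT-prompted expert pass against a
        dropout-perturbed pass on a contrastive (flawed-reasoning) prompt."""
        exp_logp = torch.log_softmax(next_token_logits(expert_prompt, False), dim=-1)
        ama_logp = torch.log_softmax(next_token_logits(amateur_prompt, True), dim=-1)

        # Adaptive plausibility: only keep tokens the expert itself finds likely.
        mask = exp_logp >= exp_logp.max() + torch.log(torch.tensor(alpha))
        scores = (1 + beta) * exp_logp - beta * ama_logp
        scores[~mask] = float("-inf")
        return int(scores.argmax())

In this sketch the expert prompt would contain a correct chain-of-thought demonstration and the amateur prompt a contrastive one with flawed reasoning; looping dcd_step and appending the chosen token to both prompts yields a full generation.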
Paper Type: long
Research Area: Generation
Contribution Types: NLP engineering experiment
Languages Studied: English