Abstract: Large language models (LLMs) have become foundational to numerous natural language processing tasks; however, decoding coherent and contextually relevant text remains a complex challenge. In open-ended generation, maximizing probability is often not the appropriate objective: as with sampling methods, the resulting continuations tend to be incoherent or repetitive to varying degrees. We propose Merge Decoding (MD), which merges information from shallow layers, such as sequential information, with the final task-specific layer, thereby generating coherent and rich text. MD works across three scales of the LLaMA family (7B, 13B, 30B), achieving higher-quality text in open-ended text generation (WikiText, WikiNews, BookCorpus) and enhancing reasoning capabilities in downstream tasks (GSM8K, StrategyQA). Our code is available at https://github.com/YcChou/MergeDecoding.
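The abstract describes combining shallow-layer information with the final task-specific layer at decoding time, but does not spell out the merging rule. The snippet below is a minimal illustrative sketch only, not the paper's implementation: it assumes a LLaMA-family checkpoint, reads a shallow hidden state out through the model's own unembedding head, and linearly interpolates its logits with the final-layer logits; the layer index `shallow_layer`, the weight `alpha`, and the helper names are hypothetical choices for illustration.

```python
# Illustrative sketch of a shallow/final logit merge at decoding time.
# The merge rule, layer index, and alpha are assumptions, not the paper's spec.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def merge_decode_step(model, input_ids, shallow_layer=4, alpha=0.3):
    """Next-token logits from a convex combination of shallow- and final-layer readouts."""
    with torch.no_grad():
        out = model(input_ids, output_hidden_states=True)
    # hidden_states[0] is the embedding output; later entries are transformer layers.
    shallow_hidden = out.hidden_states[shallow_layer][:, -1, :]
    final_logits = out.logits[:, -1, :]
    # Assumption: the shallow state is normalized and projected through the same LM head.
    shallow_logits = model.lm_head(model.model.norm(shallow_hidden))
    # Assumed merge rule: simple interpolation of the two logit vectors.
    return (1 - alpha) * final_logits + alpha * shallow_logits

if __name__ == "__main__":
    name = "huggyllama/llama-7b"  # any LLaMA-family causal LM
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16, device_map="auto"
    )
    ids = tok("The history of decoding methods", return_tensors="pt").input_ids.to(model.device)
    for _ in range(30):  # greedy decoding over the merged logits
        logits = merge_decode_step(model, ids)
        next_id = logits.argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
    print(tok.decode(ids[0], skip_special_tokens=True))
```

Under these assumptions, larger `alpha` shifts the next-token distribution toward the shallow layer's (more sequence-level) signal, while `alpha = 0` recovers standard decoding from the final layer.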