Abstract: In the realm of Large Language Models (LLMs), the ability to process long contexts is increasingly crucial for tasks such as multi-round dialogues, code generation, and document summarization. This paper addresses the challenge of simultaneously achieving strong long-context performance, low computational complexity, and compatibility with pretrained models, a combination collectively termed the "impossible triangle". We introduce E2LLM (Encoder Elongated Large Language Models), a novel approach that effectively navigates this trilemma. The method splits long contexts into chunks, compresses each chunk into soft prompts via a pretrained text encoder, and uses an adapter to align these representations with a decoder-only LLM. To further enhance the LLM's understanding of and reasoning over the soft prompts, we employ two training objectives: one focused on reconstructing the encoder output and the other on long-context instruction fine-tuning. Extensive experiments, including Needle in a Haystack and LongBench, show that E2LLM not only outperforms seven existing state-of-the-art (SOTA) methods across a range of long-context tasks but also achieves the lowest inference time and memory usage. Code will be available upon publication.
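The abstract describes the E2LLM pipeline only at a high level; the snippet below is a minimal, self-contained sketch of that flow (chunking, encoder-side compression into one soft prompt per chunk, and an adapter projecting the soft prompts into the decoder's embedding space). It is not the authors' implementation: the modules ToyEncoder and Adapter, the pooled-mean compression, and all dimensions are illustrative assumptions standing in for the pretrained encoder, adapter, and decoder-only LLM used in the paper.

```python
# Hypothetical sketch of the E2LLM forward flow described in the abstract.
# All module definitions and sizes here are toy assumptions, not the paper's code.
import torch
import torch.nn as nn


def chunk_tokens(token_ids: torch.Tensor, chunk_size: int) -> list[torch.Tensor]:
    """Split a 1-D sequence of token ids into fixed-size chunks."""
    return [token_ids[i:i + chunk_size] for i in range(0, token_ids.numel(), chunk_size)]


class ToyEncoder(nn.Module):
    """Stand-in for a pretrained text encoder: maps a chunk of token ids
    to a single pooled embedding that serves as the chunk's compressed
    representation (the source of its soft prompt)."""
    def __init__(self, vocab_size: int, hidden: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, chunk_ids: torch.Tensor) -> torch.Tensor:
        h = self.encoder(self.embed(chunk_ids).unsqueeze(0))  # (1, chunk_len, hidden)
        return h.mean(dim=1)                                  # (1, hidden) pooled chunk vector


class Adapter(nn.Module):
    """Aligns encoder-side chunk embeddings with the decoder's embedding space."""
    def __init__(self, enc_dim: int, dec_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(enc_dim, dec_dim), nn.GELU(), nn.Linear(dec_dim, dec_dim)
        )

    def forward(self, chunk_embs: torch.Tensor) -> torch.Tensor:
        return self.proj(chunk_embs)


# Assumed toy dimensions; a real setup would reuse pretrained encoder/decoder weights.
vocab, enc_dim, dec_dim, chunk_size = 1000, 64, 128, 16
encoder, adapter = ToyEncoder(vocab, enc_dim), Adapter(enc_dim, dec_dim)
decoder_embed = nn.Embedding(vocab, dec_dim)  # placeholder for the decoder LLM's input embeddings

long_context = torch.randint(0, vocab, (100,))  # pretend long document
prompt = torch.randint(0, vocab, (8,))          # user instruction tokens

# One soft prompt per chunk, projected into the decoder space by the adapter.
soft_prompts = adapter(
    torch.cat([encoder(c) for c in chunk_tokens(long_context, chunk_size)], dim=0)
)
# A decoder-only LLM would then consume [soft prompts ; embedded instruction tokens].
decoder_inputs = torch.cat([soft_prompts, decoder_embed(prompt)], dim=0)
print(decoder_inputs.shape)  # (num_chunks + prompt_len, dec_dim)
```

The sketch only illustrates how the sequence seen by the decoder shrinks from the full context length to one vector per chunk plus the instruction tokens; the paper's two training objectives (encoder-output reconstruction and long-context instruction fine-tuning) are not shown here.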
Keywords: Long-Context Modeling, Large Language Models, Encoder, Decoder
TL;DR: E2LLM enhances long-context processing by splitting long contexts into chunks, compressing them into embedding vectors, and utilizing an adapter to align these representations with a decoder-only LLM.
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3872