In the realm of Large Language Models (LLMs), the ability to process long contexts is increasingly crucial for tasks such as multi-round dialogues, code generation, and document summarization. This paper addresses the challenge of simultaneously achieving high long-context performance, low computational complexity, and compatibility with pretrained models, collectively termed the "impossible triangle". We introduce E2LLM (Encoder Elongated Large Language Models), a novel approach that effectively navigates this impossible triangle. The method splits long contexts into chunks, compresses each chunk into soft prompts via a pretrained text encoder, and uses an adapter to align these representations with a decoder-only LLM. To further enhance the LLM's understanding of and reasoning over the soft prompts, we employ two training objectives: one focused on reconstructing the encoder output and the other on long-context instruction fine-tuning. Extensive experiments, including Needle in a Haystack and LongBench, show that E2LLM not only outperforms seven existing state-of-the-art (SOTA) methods across various long-context tasks, but also achieves the lowest inference time and memory usage. Code will be available upon publication.
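To make the described pipeline concrete, below is a minimal sketch of a chunk-then-compress architecture of this kind, assuming a HuggingFace-style pretrained text encoder and a decoder-only LLM that accepts `inputs_embeds`. The module names (`ChunkCompressor`, `build_decoder_inputs`), the CLS-style pooling, and the two-layer adapter are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical sketch of an E2LLM-style pipeline: chunk -> encode -> adapt -> soft prompts.
# Assumes a HuggingFace-style encoder (returns .last_hidden_state) and a decoder-only LLM
# that can consume `inputs_embeds`; details differ from the paper's actual implementation.
import torch
import torch.nn as nn


class ChunkCompressor(nn.Module):
    """Compress each context chunk into a single soft-prompt vector in the LLM's embedding space."""

    def __init__(self, encoder: nn.Module, enc_dim: int, llm_dim: int):
        super().__init__()
        self.encoder = encoder  # pretrained text encoder (e.g., BERT-like), typically frozen or lightly tuned
        # Adapter aligning the encoder's representation space with the decoder LLM's embedding space.
        self.adapter = nn.Sequential(
            nn.Linear(enc_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, chunk_input_ids: torch.Tensor, chunk_attention_mask: torch.Tensor) -> torch.Tensor:
        # chunk_input_ids: (num_chunks, chunk_len); one row per context chunk.
        out = self.encoder(input_ids=chunk_input_ids, attention_mask=chunk_attention_mask)
        pooled = out.last_hidden_state[:, 0]  # CLS-style pooling: one vector per chunk
        return self.adapter(pooled)           # (num_chunks, llm_dim) soft prompts


def build_decoder_inputs(soft_prompts: torch.Tensor, prompt_token_embeds: torch.Tensor) -> torch.Tensor:
    """Prepend the per-chunk soft prompts to the instruction's token embeddings.

    soft_prompts: (num_chunks, llm_dim); prompt_token_embeds: (1, prompt_len, llm_dim).
    The result is fed to the decoder-only LLM via its `inputs_embeds` argument.
    """
    return torch.cat([soft_prompts.unsqueeze(0), prompt_token_embeds], dim=1)
```

In this sketch, long-context cost scales with the number of chunks rather than the raw token count, since the decoder only attends to one soft-prompt vector per chunk plus the instruction tokens; the two training objectives mentioned above (encoder-output reconstruction and long-context instruction fine-tuning) would supervise the adapter and decoder on top of these inputs.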