UniMem: Towards a Unified View of Long-Context Large Language Models

Junjie Fang; Likai Tang; Hongzhe Bi; Yujia Qin; Si Sun; Zhenyu Li; Haolun Li; Yongjian Li; Xin Cong; Yankai Lin; Yukun Yan; Xiaodong Shi; Sen Song; Zhiyuan Liu; Maosong Sun

UniMem: Towards a Unified View of Long-Context Large Language Models

Junjie Fang, Likai Tang, Hongzhe Bi, Yujia Qin, Si Sun, Zhenyu Li, Haolun Li, Yongjian Li, Xin Cong, Yankai Lin, Yukun Yan, Xiaodong Shi, Sen Song, Zhiyuan Liu, Maosong Sun

Published: 10 Jul 2024, Last Modified: 26 Aug 2024COLMEveryoneRevisionsBibTeXCC BY 4.0

Research Area: Compute efficient LMs, Engineering for large LMs, Learning algorithms for LMs, Inference algorithms for LMs

Keywords: long-context, large language models

TL;DR: The article introduces UniMem, a unified framework that reformulates existing long-context methods from the view of memory augmentation of large language models (LLMs).

Abstract: Long-context processing is a critical ability that constrains the applicability of large language models (LLMs). Although there exist various methods devoted to enhancing the long-context processing ability of LLMs, they are developed in an isolated manner and lack systematic analysis and integration of their strengths, hindering further developments. In this paper, we introduce UniMem, a Unified framework that reformulates existing long-context methods from the view of Memory augmentation of LLMs. Distinguished by its four core dimensions—Memory Management, Memory Writing, Memory Reading, and Memory Injection, UniMem empowers researchers to conduct systematic exploration of long-context methods. We re-formulate 16 existing methods based on UniMem and analyze four representative methods: Transformer-XL, Memorizing Transformer, RMT, and Longformer into equivalent UniMem forms to reveal their design principles and strengths. Based on these analyses, we propose UniMix, an innovative approach that integrates the strengths of these algorithms. Experimental results show that UniMix achieves superior performance in handling long contexts with significantly lower perplexity than baselines. The code is publicly available at https://github.com/thunlp/UniMem.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html

Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html

Submission Number: 1352

Loading