CD-Pos: Long Context Generalization in LLMs Through Continuous and Discrete Position Synthesis

Published: 18 Jun 2024, Last Modified: 16 Jul 2024, Venue: LCFM 2024, License: CC BY 4.0
Keywords: Large Language Models, Long Context Generalization, Position Synthesis
Abstract: Large language models (LLMs) are central to natural language processing and multi-modal tasks, but they struggle with tasks that require long context windows because of computational and memory limitations. Existing methods for extending the context window are resource-intensive. The proposed Continuous and Discrete Position Synthesis (CD-Pos) addresses this by using synthesized position indices to extend the context window efficiently. CD-Pos divides each sequence into segments whose position indices are continuous, which increases the distances between tokens while preserving local information. Empirical evaluations show that CD-Pos extends context windows up to 128k while maintaining LLMs' performance on general tasks.
Submission Number: 1
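
The abstract describes position synthesis only at a high level, so the following is a rough sketch of one way such indices might be generated, not the authors' implementation. It assumes that a short training sequence is split into roughly equal segments, that indices are consecutive inside each segment, and that random gaps between segments spread the indices over a longer target window. The function name `synthesize_positions`, its parameters, and the random-gap scheme are all hypothetical.

```python
import random


def synthesize_positions(seq_len, num_segments, target_len, seed=None):
    """Assign each of seq_len tokens a position index in [0, target_len).

    Hypothetical scheme: indices are consecutive within a segment and
    jump by a random amount between segments.
    """
    if not (1 <= num_segments <= seq_len <= target_len):
        raise ValueError("need 1 <= num_segments <= seq_len <= target_len")

    rng = random.Random(seed)

    # Split the tokens into segments of near-equal length.
    base, extra = divmod(seq_len, num_segments)
    lengths = [base + (1 if i < extra else 0) for i in range(num_segments)]

    # Unused positions in the target window, to be spent as gaps.
    slack = target_len - seq_len

    # Sorted random values give the cumulative gap placed before each segment.
    cumulative_gaps = sorted(rng.randint(0, slack) for _ in range(num_segments))

    positions = []
    tokens_placed = 0
    for length, gap in zip(lengths, cumulative_gaps):
        start = tokens_placed + gap
        positions.extend(range(start, start + length))  # continuous inside a segment
        tokens_placed += length
    return positions


if __name__ == "__main__":
    # 16 tokens, 4 segments, indices spread over a 128-position window.
    print(synthesize_positions(16, 4, 128, seed=0))
```

In a sketch like this, the resulting list would be passed to the model in place of the default `0..seq_len-1` position indices during fine-tuning, so the model sees large relative distances without processing a long sequence.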