Abstract: Large language models degrade when operating on contexts longer than their training context length, due to the standard positional encoding of tokens in the attention layers. Tokens that are far apart rarely have an effect on each other, and long prompts yield unexpected results. To address this problem, we propose SELF (Self-Extend the Context Length With Logistic Growth Function): a method that groups consecutive tokens into groups of varying size determined by a logistic capacity equation, combined with a constant group size at smaller relative distances. Our model improved performance over the base models by an average of 3.2\% on LEval and 9.1\% on the LongBench benchmark. On summarization-related tasks in LongBench, our model performed 3.46\% better than the base model; on reading comprehension tasks from LEval, it performed 11.76\% better. Our code is available at anonymous.4open.science/r/SELF-LLM-7705.
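For illustration only, the sketch below shows one way a logistic group-size schedule of the kind described in the abstract could map relative token distances to compressed position indices: exact positions within a local neighborhood, and logistically growing group sizes beyond it. This is not the authors' implementation; all names and parameter values (`neighbor_window`, `g_min`, `g_max`, `k`, `midpoint`) are assumptions.

```python
import numpy as np

def self_position(rel_dist, neighbor_window=512, g_min=2, g_max=64, k=0.01, midpoint=2048):
    """Map relative token distances to compressed position indices (illustrative sketch).

    Hypothetical parameters: within `neighbor_window`, positions are kept exact
    (group size 1); beyond it, consecutive tokens are grouped with a group size
    that grows from `g_min` toward `g_max` following a logistic curve, so distant
    tokens share coarser position ids.
    """
    rel_dist = np.asarray(rel_dist, dtype=np.float64)
    # Logistic capacity equation for the group size at each relative distance.
    group_size = g_min + (g_max - g_min) / (1.0 + np.exp(-k * (rel_dist - midpoint)))
    # Compress distances beyond the neighbor window by their group size.
    grouped = neighbor_window + (rel_dist - neighbor_window) / group_size
    return np.where(rel_dist < neighbor_window, rel_dist, np.floor(grouped)).astype(int)

# Example: nearby tokens keep exact offsets; distant tokens are compressed.
print(self_position([10, 600, 5000, 20000]))
```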
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: Context window length extension, Positional embedding, Long context
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 3999