Abstract: This paper addresses the context-length limitation of large language models (LLMs) on code-related tasks. Existing LLMs are constrained by their pre-trained context lengths, which degrades performance on long, complex code sequences. Inspired by how human programmers navigate code, we introduce Hierarchical Rotary Position Embedding (HiRoPE), a novel approach that extends the traditional rotary position embedding into a hierarchical format based on the hierarchical structure of source code. HiRoPE can be integrated into existing LLMs without extra training costs. We evaluate our method extensively with various LLMs and observe stable performance on tasks such as language modeling and long code completion. We also introduce a new long code understanding task built from real-world code projects, in the hope of promoting further development in this code-related field. Theoretically and experimentally, we find that HiRoPE also addresses the out-of-distribution issue in position encoding. HiRoPE significantly expands the context length capabilities of LLMs, enabling inference at lengths exponentially greater than the training length.
Paper Type: long
Research Area: NLP Applications
Contribution Types: NLP engineering experiment
Languages Studied: English
Preprint Status: We plan to release a non-anonymous preprint in the next two months (i.e., during the reviewing process).
A1: yes
A1 Elaboration For Yes Or No: Limitation Section
A2: yes
A2 Elaboration For Yes Or No: Limitation Section
A3: yes
A3 Elaboration For Yes Or No: Section 1
B: yes
B1: yes
B1 Elaboration For Yes Or No: Section 4
B2: n/a
B3: n/a
B4: n/a
B5: yes
B5 Elaboration For Yes Or No: Section 4
B6: yes
B6 Elaboration For Yes Or No: Section 4
C: yes
C1: yes
C1 Elaboration For Yes Or No: Section 4
C2: yes
C2 Elaboration For Yes Or No: Section 4
C3: no
C3 Elaboration For Yes Or No: Section 4: We use greedy search to avoid randomness.
C4: yes
C4 Elaboration For Yes Or No: Section 4: tree-sitter
D: no
D1: n/a
D2: n/a
D3: n/a
D4: n/a
D5: n/a
E: no
E1: n/a