Hyper-multi-step: The Truth Behind Difficult Long-context Tasks

Yijiong Yu; Ma Xiufa; Jianwei Fang; Zhi Xu; Su Guangyao; Jiancheng Wang; Yongfeng Huang; Zhixiao Qi; Wei Wang; weifeng.liu; Ran Chen; Ji Pei

27 Sept 2024 (modified: 12 Mar 2026), ICLR 2025 Conference Withdrawn Submission, CC BY 4.0

Keywords: LLM, long context, reasoning, multi-step, benchmark, CoT, attention mechanism, linear probing

Abstract: Long-context language models (LCLMs), characterized by their extensive context windows, are becoming increasingly popular. Meanwhile, many long-context benchmarks present challenging tasks that even the most advanced LCLMs struggle to complete. However, the underlying sources of difficulty in these tasks have seldom been studied. To bridge this gap, we conduct experiments indicating that their difficulty stems primarily from two basic issues: "multi-matching retrieval," which requires the simultaneous retrieval of multiple items, and "logic-based retrieval," which necessitates logical judgment within the retrieval criteria. These two problems, while seemingly straightforward, actually exceed the capabilities of LCLMs because they are proven to be hyper-multi-step (demanding numerous steps to solve) in nature. This finding could explain why LLMs struggle with more advanced long-context tasks and provides a more accurate perspective for rethinking solutions to them.
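
To make the two task types named in the abstract concrete, here is a minimal Python sketch on a toy key-value context. It is an illustration only, not the authors' benchmark or code: the function names, the dictionary layout, and the example values are all assumptions introduced here. It shows one way to read "hyper-multi-step": answering either query means checking every item in the context, so the number of elementary steps grows with the context size.

```python
# Toy illustration (hypothetical names and data, not the paper's benchmark).

def multi_matching_retrieval(context, target_value):
    """Return every key whose value equals target_value.

    Several items may match, so all of them must be retrieved together.
    """
    return [key for key, value in context.items() if value == target_value]


def logic_based_retrieval(context, threshold):
    """Return every key whose value satisfies a logical condition.

    The criterion here is a comparison (value > threshold) rather than an
    exact match, so each item needs its own logical judgment.
    """
    return [key for key, value in context.items() if value > threshold]


if __name__ == "__main__":
    # Assumed toy context: names mapped to small integers.
    context = {"alice": 3, "bob": 7, "carol": 3, "dave": 9}

    print(multi_matching_retrieval(context, 3))  # ['alice', 'carol']
    print(logic_based_retrieval(context, 5))     # ['bob', 'dave']
```

Both functions visit all entries of `context` once, which is trivial for a program but corresponds to many separate retrieval or comparison steps for a model that must answer in a single pass.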

Primary Area: foundation or frontier models, including LLMs

Submission Number: 9529
