Keywords: Agent Memory Compression, Streaming Video Understanding, Efficient MLLM
Abstract: Vision agent memory has shown remarkable effectiveness in long-video understanding; however, storing such memory for videos incurs substantial overhead, leading to high costs in both storage and computation. To address this issue, we propose StreamMeCo, an efficient Stream Agent Memory Compression framework. Specifically, based on the connectivity of the memory graph, StreamMeCo introduces edge-free min-max sampling for isolated nodes and edge-aware weight pruning for connected nodes, evicting redundant memory nodes while maintaining accuracy. In addition, we introduce a time-decay memory retrieval mechanism to mitigate the performance degradation caused by memory compression. Extensive experiments on three challenging benchmark datasets (M3-Bench-robot, M3-Bench-web, and Video-MME-Long) demonstrate that, at a 70% memory-graph compression ratio, StreamMeCo achieves a 1.87× speedup in memory retrieval while delivering an average accuracy improvement of 1.0%. Our code is available in the supplementary materials and will be released on GitHub.
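The abstract mentions a time-decay memory retrieval mechanism without specifying its form. A minimal sketch of one plausible realization, assuming an exponential decay on node age multiplied into the query-similarity score (the decay form, the `lam` rate, and all function names here are illustrative assumptions, not details taken from the paper):

```python
import math

def time_decay_score(similarity: float, age_seconds: float, lam: float = 0.01) -> float:
    """Hypothetical decayed relevance: recent memory nodes keep most of
    their similarity score, while older nodes decay toward zero."""
    return similarity * math.exp(-lam * age_seconds)

def retrieve(similarities, ages, k=2, lam=0.01):
    """Rank memory nodes by time-decayed similarity; return top-k indices."""
    decayed = [time_decay_score(s, a, lam) for s, a in zip(similarities, ages)]
    return sorted(range(len(decayed)), key=lambda i: -decayed[i])[:k]

# Example: node 0 is the most similar but old (300 s); nodes 1 and 2 are
# recent, so they outrank it after decay.
print(retrieve([0.9, 0.5, 0.8], [300.0, 10.0, 5.0], k=2))  # → [2, 1]
```

Under this assumed scoring, compression-induced gaps in the memory graph are compensated by biasing retrieval toward fresher nodes; other decay schedules (linear, step) would slot into `time_decay_score` the same way.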
Paper Type: Long
Research Area: LLM Efficiency
Research Area Keywords: Efficient/Low-Resource Methods for NLP
Contribution Types: Approaches for low compute settings-efficiency
Languages Studied: English
Submission Number: 2483