Think Once, Reuse Smartly: Bio-Inspired Memory for Efficient Vision-Language Reasoning in Autonomous Driving
Keywords: Vision-Language Models, Autonomous Driving, Memory-Driven Inference
Abstract: Vision-Language Models (VLMs) are increasingly vital for robust decision-making in autonomous driving, yet their deep reasoning creates a critical latency bottleneck, making them impractical for real-world deployment. Current approaches accelerate inference by pruning input data, but they overlook the primary source of inefficiency: the constant, wasteful re-computation of reasoning that remains valid across consecutive frames.
We introduce MEMO-VLM, a memory-driven framework inspired by human cognition that eliminates this redundant reasoning.
Instead of regenerating its entire chain of reasoning, MEMO-VLM treats previous conclusions as a hypothesis to be validated against new visual evidence, intelligently reusing what remains true and surgically updating only what has changed. This is achieved with a plug-and-play, two-stage approach that requires no VLM retraining, making it a broadly applicable solution. Experiments demonstrate that MEMO-VLM accelerates inference by up to 4.3$\times$.
By bridging bio-inspired memory with computational efficiency, our work offers a practical path to deploying the advanced reasoning of VLMs in safety-critical autonomous systems.
Paper Type: Long
Research Area: LLM Efficiency
Research Area Keywords: LLM Efficiency, NLP in resource-constrained settings
Contribution Types: Approaches for low-compute settings / efficiency
Languages Studied: English
Submission Number: 959