Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning

Published: 28 Feb 2025, Last Modified: 02 Mar 2025
Venue: WRL@ICLR 2025 Poster
License: CC BY 4.0
Track: full paper
Keywords: Memory-based RL, Robotics, Memory, POMDP, Benchmark, Tabletop Manipulation
TL;DR: A classification of memory tasks in RL by type of memory usage, a benchmark for testing an RL agent's memory, and a benchmark of 32 memory tasks for tabletop robotic manipulation.
Abstract: Memory is crucial for enabling agents to tackle complex tasks with temporal and spatial dependencies. While many reinforcement learning (RL) algorithms incorporate memory, the field lacks a universal benchmark to assess an agent's memory capabilities across diverse scenarios. This gap is particularly evident in tabletop robotic manipulation, where memory is essential for solving tasks with partial observability and ensuring robust performance, yet no standardized benchmarks exist. In this work, we address these challenges through three key contributions: (1) we propose a comprehensive classification framework for memory-intensive RL tasks, (2) we collect MIKASA -- a unified benchmark that enables systematic evaluation of memory-enhanced agents across diverse scenarios, and (3) we develop ManiSkill-Memory -- a novel benchmark of 32 carefully designed memory-intensive tasks that assess memory capabilities in tabletop robotic manipulation. Our contributions establish a unified framework for advancing memory RL research, driving the development of more reliable systems for real-world applications.
Supplementary Material: zip
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Egor_Cherepanov1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 67