RLP: A reinforcement learning benchmark for neural algorithmic reasoning

24 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: reinforcement learning, benchmark, algorithmic reasoning, logic puzzles
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Algorithmic reasoning is vital for problem-solving. While RL excels in certain tasks, its ability to handle complex algorithms is largely unexplored; we introduce an RL benchmark to assess this and find that current RL approaches struggle with algorithmic reasoning.
Abstract: Algorithmic reasoning is a fundamental cognitive ability that plays a pivotal role in problem-solving and decision-making processes. Although Reinforcement Learning (RL) has demonstrated remarkable proficiency in tasks such as motor control, handling perceptual input, and managing stochastic environments, its potential in learning generalizable and complex algorithms remains largely unexplored. To evaluate the current state of algorithmic reasoning in RL, we introduce an RL benchmark based on Simon Tatham's Portable Puzzle Collection. This benchmark contains 40 diverse logic puzzles of varying complexity levels, which serve as captivating challenges that test cognitive abilities, particularly in neural algorithmic reasoning. Our findings demonstrate that current RL approaches struggle with neural algorithmic reasoning, emphasizing the need for further research in this area. All of the software, including the environment, is available at https://github.com/rlppaper/rlp.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9365