Keywords: spatial reasoning, reinforcement learning, visual reasoning
TL;DR: We propose a new interactive spatial reasoning environment based on manipulating knots.
Abstract: We propose KnotGym, an interactive environment for complex spatial reasoning and manipulation.
KnotGym includes goal-oriented rope manipulation tasks with varying levels of complexity, all requiring acting from pure image observations.
Tasks are defined along a clear and quantifiable axis of complexity based on the number of knot crossings, creating a natural generalization test.
KnotGym has a simple observation space, allowing for scalable development, yet it highlights core challenges in integrating acute perception, spatial reasoning, and grounded manipulation.
We evaluate methods of different classes, including model-based RL, model-predictive control, and chain-of-thought reasoning, and illustrate the challenges KnotGym presents.
Code URL: https://github.com/lil-lab/knotgym
Supplementary Material: zip
Primary Area: Data for Reinforcement learning (e.g., decision and control, planning, hierarchical RL, robotics)
Submission Number: 706