Detecting danger in gridworlds using Gromov’s Link Condition

Published: 06 Dec 2023, Last Modified: 06 Dec 2023. Accepted by TMLR.
Abstract: Gridworlds have long been used in AI research, particularly in reinforcement learning, as they provide simple yet scalable models for many real-world applications such as robot navigation, emergent behaviour, and operations research. We initiate a study of gridworlds using the mathematical framework of reconfigurable systems and state complexes due to Abrams, Ghrist & Peterson. State complexes, a higher-dimensional analogue of state graphs, represent all possible configurations of a system as a single geometric space, thus making them conducive to study using geometric, topological, or combinatorial methods. The main contribution of this work is a modification to the original Abrams, Ghrist & Peterson setup which we introduce to capture agent braiding and thereby more naturally represent the topology of gridworlds. With this modification, the state complexes may exhibit geometric defects (failure of Gromov’s Link Condition). Serendipitously, we discover that, in agent-only cases, these failures occur exactly where undesirable or dangerous states appear in the gridworld. Our results therefore provide a novel method for seeking guaranteed safety limitations in discrete task environments with single or multiple agents, and offer useful safety information (in geometric and topological forms) for incorporation in or analysis of machine learning systems. More broadly, our work introduces tools from geometric group theory and combinatorics to the AI community and demonstrates a proof-of-concept for this geometric viewpoint of the task domain through the example of simple environments.
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Following up on our planned changes, we have:
1. Added discussion of prior work on state graphs (added paragraph 2 and modified the final paragraph of the introduction) and emphasised the distinction between graphs and complexes in the abstract and introduction.
2. Modified sentences in the introduction and abstract to make it clearer to readers that our proof is for the agent-only case.
3. Added footnote 1 (in Definition 2.2) to explain the notation. Added footnote 3 at the beginning of Section 4 to justify and explain why we limited the study to connected state complexes.
4. Added paragraph 5 in Section 4 to describe how the dances capture the notion of ‘spatially commuting’ moves. (We chose not to add a discussion of ‘temporally commuting’ to Section 2 as it made more sense to mention it here.) Slightly reworded the first sentence of the next paragraph.
5. Added Remark 4.2 to explain why we chose to fill in squares for dances instead of simply adding edges for diagonal moves.
6. Added Remark 5.1 to explain that the terms ‘NPC’ and ‘locally CAT(0)’ are synonymous. Also explained that the stronger CAT(0) property does not hold in general for state complexes (as it further requires simple connectedness).
7. Added a paragraph in Section 5 (two paragraphs above Theorem 5.2) to further explicate the connection between links and collision avoidance. Slightly reworded the first sentence of the following paragraph.
8. Added a paragraph in Section 6 (two paragraphs below Remark 6.1) to quantify the reduction in computational workload in computing links due to the main theorem.
9. Cited and discussed all papers suggested by Reviewer 2GJX in the introduction and/or discussion sections.
Supplementary Material: zip
Assigned Action Editor: ~Jeff_Phillips1
Submission Number: 1416