Learning to Reason and Act in Cascading ProcessesDownload PDF

Published: 01 Feb 2023, Last Modified: 13 Feb 2023Submitted to ICLR 2023Readers: Everyone
Keywords: cascading processes, intervention, reasoning, tree search
Abstract: Training agents to control a dynamic environment is a fundamental task in AI. In many environments the dynamics can be summarized by a small set of events that capture the semantic behavior of the system. Typically, these events form chains or cascades. We often wish to change the system behavior using a single intervention that propagates through the cascade. For instance, one may trigger a biochemical cascade to switch the state of a cell, or reroute a truck in logistic chains to meet an unexpected, urgent delivery. We introduce a new supervised learning setup called "Cascade". An agent observes a system with a known dynamics evolving from some initial state. It is given a structured semantic instruction and needs to make an intervention that triggers a cascade of events, such that the system reaches an alternative (counterfactual) behavior. We provide a test-bed for this problem, consisting of physical objects. We combine semantic tree search with an event-driven forward model and devise an algorithm that learns to efficiently search in exponentially large semantic trees of continuous spaces. We demonstrate that our approach learns to effectively follow instructions to intervene in previously unseen complex scenes. When provided an observed cascade of events, it can also reason about alternative outcomes.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: General Machine Learning (ie none of the above)
TL;DR: We consider the task of controlling the behavior of a cascading process with an intervention at a single point in time. We propose to learn a principled probabilistic scoring function that allows searching efficiently over the space of interventions.
11 Replies

Loading