Keywords: Causal Discovery, POMDP, Multi-agent systems
TL;DR: We demonstrate that agents robust to domain shifts can infer the causal model of their environment in mediated decision tasks, multi-agent settings, and sequential decision tasks.
Abstract: The connection between robustness to distribution shifts and learning the causal model of an environment is an important area of study in AI. While previous work has established this link for single agents in unmediated decision tasks, many real-world scenarios involve mediated settings in which agents influence their environment. We demonstrate that agents capable of adapting to distribution shifts can recover the underlying causal structure even in these more dynamic settings. Our contributions include an algorithm for learning Causal Influence Diagrams (CIDs) using optimal policy oracles, with the flexibility to incorporate prior causal knowledge. We illustrate the algorithm’s application in a mediated single-agent decision task and in multi-agent settings. We show that the presence of a single robust agent is sufficient to recover the complete causal model and to derive optimal policies for all other agents operating in the same environment. We also demonstrate how to apply these results to sequential decision-making tasks modeled as Partially Observable Markov Decision Processes (POMDPs).
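To make the oracle-based discovery idea concrete, here is a minimal, self-contained Python sketch; it is not the paper's algorithm, and every name and the toy task below (utility, optimal_policy, the binary chance variables) are illustrative assumptions. A chance variable is marked as upstream of the utility in the recovered CID exactly when shifting its mechanism changes the policy returned by the optimal-policy oracle.

import itertools

CHANCE = ["X1", "X2"]   # binary chance variables; X2 is a distractor
DECISIONS = [0, 1]      # binary decision D

def utility(x, d):
    # Ground truth the loop must rediscover: only X1 feeds into U.
    return 1.0 if d == x["X1"] else 0.0

def optimal_policy(marginals):
    # Stand-in for an optimal-policy oracle: given P(Xi = 1) for each
    # chance variable, return the decision maximising expected utility.
    best_d, best_eu = None, float("-inf")
    for d in DECISIONS:
        eu = 0.0
        for vals in itertools.product([0, 1], repeat=len(CHANCE)):
            x = dict(zip(CHANCE, vals))
            p = 1.0
            for v in CHANCE:
                p *= marginals[v] if x[v] == 1 else 1.0 - marginals[v]
            eu += p * utility(x, d)
        if eu > best_eu:
            best_d, best_eu = d, eu
    return best_d

baseline = {"X1": 0.9, "X2": 0.9}
edges = set()
for v in CHANCE:
    shifted = dict(baseline)
    shifted[v] = 1.0 - shifted[v]   # local shift on v's mechanism
    if optimal_policy(shifted) != optimal_policy(baseline):
        edges.add((v, "U"))         # policy adapted, so v is upstream of U

print(edges)   # {('X1', 'U')}

Running this prints {('X1', 'U')}: the shift on the distractor X2 leaves the oracle's policy unchanged, so no edge is drawn, mirroring the claim that adaptation to distribution shifts exposes the causal structure.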
Supplementary Material: zip
Type Of Paper: Full paper (max 8 pages)
Anonymous Submission: Anonymized submission.
Submission Number: 5