Abstract: We address the problem of learning from demonstrations when the learner must satisfy safety and/or performance requirements expressed as Stochastic Temporal Logic (StTL) specifications. We extend the maximum causal entropy inverse reinforcement learning framework to account for StTL constraints and show how to encode them via a minimal set of mixed-integer linear constraints. Our method is based on a cut-and-generate algorithm that iterates between two phases. In the cut phase, we use cutting hyperplanes to approximate the feasible region of the non-linear constraint that encodes atomic predicates. In the generate phase, we propagate these hyperplanes through the schematics to generate constraints for arbitrary formulas. We validate our algorithmic contributions across different environments and specifications.