A Kernel Perspective on Behavioural Metrics for Markov Decision Processes

Published: 19 Jun 2023, Last Modified: 30 Jun 2023
Accepted by TMLR
Authors that are also TMLR Expert Reviewers: ~Pablo_Samuel_Castro1
Abstract: We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels. We define a new metric under this lens that is provably equivalent to the recently introduced MICo distance (Castro et al., 2021). The kernel perspective enables us to provide new theoretical results, including value-function bounds and low-distortion finite-dimensional Euclidean embeddings, which are crucial when using behavioural metrics for reinforcement learning representations. We complement our theory with strong empirical results that demonstrate the effectiveness of these methods in practice.
Certifications: Expert Certification
Submission Length: Long submission (more than 12 pages of main content)
Video: https://youtu.be/or5W73cn-WQ
Code: https://github.com/google-research/google-research/tree/master/ksme
Supplementary Material: zip
Assigned Action Editor: ~Gergely_Neu1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 696