Generalized Policy Iteration using Tensor Approximation for Hybrid Control

Published: 16 Jan 2024, Last Modified: 14 Mar 2024ICLR 2024 spotlightEveryoneRevisionsBibTeX
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Optimal Control, Hybrid Actions, Robotics, Approximate Dynamic Programming, Tensor Approximation
Submission Guidelines: I certify that this submission complies with the submission instructions as described on
TL;DR: The paper proposes a novel approximate dynamic programming algorithm that can handle hybrid action space
Abstract: Control of dynamic systems involving hybrid actions is a challenging task in robotics. To address this, we present a novel algorithm called Generalized Policy Iteration using Tensor Train (TTPI) that belongs to the class of Approximate Dynamic Programming (ADP). We use a low-rank tensor approximation technique called Tensor Train (TT) to approximate the state-value and advantage function which enables us to efficiently handle hybrid systems. We demonstrate the superiority of our approach over previous baselines for some benchmark problems with hybrid action spaces. Additionally, the robustness and generalization of the policy for hybrid systems are showcased through a real-world robotics experiment involving a non-prehensile manipulation task which is considered to be a highly challenging control problem.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Primary Area: applications to robotics, autonomy, planning
Submission Number: 9245