Addressing Real-Time Fragmentary Interaction Control Problems via Multi-step Representation Reinforcement Learning

21 Sept 2023 (modified: 11 Feb 2024) Submitted to ICLR 2024
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Deep Reinforcement Learning; Representation Learning; Real-time control
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We present Multi-step Action RepreSentation (MARS) to address the fragmentary interaction control problem.
Abstract: The fragmentary interaction control problem is common in real-time control scenarios. For example, delayed or lost network packets (caused by network obstacles, inadequate bandwidth, or switch faults) lead to dynamic intervals between interactions or to fragmentary interaction. Fragmentary interaction hinders the application of reinforcement learning (RL) algorithms to real-time control tasks: when states are not received, an RL algorithm built on the traditional MDP cannot select an action for the agent, so the agent stands still, which leads to low efficiency or even failure to complete the task. However, such problems are not well studied in the RL community. In this paper, we propose to generate actions for multiple future steps simultaneously, in case some future states cannot be perceived. We present Multi-step Action RepreSentation (MARS) to learn a compact and decodable latent space for the original multi-step action space. In addition, our method enhances the environmental dynamic semantics of the action representation through unsupervised environmental dynamics prediction and action transition scale. Based on MARS, RL algorithms optimize policies in the learned representation space and interact with the environment by decoding the latent actions into the original ones. MARS outperforms the existing state-of-the-art baselines in a variety of fragmentary interaction real-time control tasks. Further, MARS significantly improves performance on high-frequency real-world robot control tasks with fragmentary interaction.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3319
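
The abstract's core idea, producing actions for several future steps so the agent can keep acting when a state is not received, can be illustrated with a minimal sketch. This is not the MARS implementation: `run_fragmentary_episode`, `toy_plan`, and `fallback_action` are hypothetical names introduced here, and `toy_plan` merely stands in for a learned policy followed by a decoder from a latent action to a multi-step action sequence.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Sequence

State = Sequence[float]
Action = float


def run_fragmentary_episode(
    observations: Sequence[Optional[State]],
    plan_actions: Callable[[State], List[Action]],
    fallback_action: Action = 0.0,
) -> List[Action]:
    """Execute one action per control step, even when a state is missing.

    `observations[t]` is None when the state at step t was lost or delayed.
    `plan_actions` is an assumed stand-in for "policy + decoder": it maps a
    received state to a list of actions for the next few steps.
    """
    buffer: Deque[Action] = deque()
    executed: List[Action] = []
    for obs in observations:
        if obs is not None:
            # A fresh state arrived: replace any stale plan with a new one.
            buffer = deque(plan_actions(obs))
        if buffer:
            # Use the next pre-computed action from the current plan.
            executed.append(buffer.popleft())
        else:
            # The plan ran out before a new state arrived (assumed behaviour).
            executed.append(fallback_action)
    return executed


def toy_plan(state: State, horizon: int = 3) -> List[Action]:
    """Hypothetical planner: not a learned model, only for illustration."""
    return [state[0] + step for step in range(horizon)]


if __name__ == "__main__":
    obs_stream = [[1.0], None, None, None, [10.0], None]
    print(run_fragmentary_episode(obs_stream, toy_plan))
```

In this sketch the agent stands still (executes `fallback_action`) only once the buffered plan is exhausted, rather than at every step with a missing state.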