Joint Representation Training in Sequential Tasks with Shared Structure

24 Aug 2022 (modified: 17 Sept 2024) | Rejected by TMLR | CC BY 4.0

Abstract: Classical theory in reinforcement learning (RL) predominantly focuses on the single-task setting, where an agent learns to solve a task through trial-and-error experience, given access to data only from that task. However, many recent empirical works have demonstrated the significant practical benefits of leveraging a joint representation trained across multiple, related tasks. In this work we theoretically analyze such a setting, formalizing the concept of \emph{task relatedness} as a shared state-action representation that admits linear dynamics in all the tasks. We introduce the \textsf{Shared-MatrixRL} algorithm for the setting of Multitask \textsf{MatrixRL}~\cite{yang2020reinforcement}. In the presence of $P$ episodic tasks of dimension $d$ sharing a joint $r \ll d$ low-dimensional representation, we show the regret on the $P$ tasks can be improved from $O(PHd\sqrt{NH})$ to $O((Hd\sqrt{rP} + HP\sqrt{rd})\sqrt{NH})$ over $N$ episodes of horizon $H$. These gains coincide with those observed in other linear models in contextual bandits and RL~\cite{yang2020impact,hu2021near}. In contrast with previous works that have studied multitask RL in other function approximation models, we show that in the presence of a bilinear optimization oracle and finite state-action spaces there exists a computationally efficient algorithm for multitask \textsf{MatrixRL} via a reduction to quadratic programming. We also develop a simple technique to shave off a $\sqrt{H}$ factor from the regret upper bounds of some episodic linear problems.
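To get a feel for the size of the improvement stated in the abstract, the two regret rates can be compared numerically. The sketch below is only an illustration: it drops all constants and logarithmic factors hidden by the $O(\cdot)$ notation, and the function names and the sample values of $P$, $d$, $r$, $H$, $N$ are hypothetical, not taken from the paper. Dividing the shared-representation rate by the independent-task rate gives $\sqrt{r/P} + \sqrt{r/d}$, so the gain is large when $r$ is small relative to both $P$ and $d$.

```python
import math


def independent_bound(P, d, H, N):
    """Rate P*H*d*sqrt(N*H) from the abstract (constants and logs dropped)."""
    return P * H * d * math.sqrt(N * H)


def shared_bound(P, d, r, H, N):
    """Rate (H*d*sqrt(r*P) + H*P*sqrt(r*d)) * sqrt(N*H) from the abstract
    (constants and logs dropped)."""
    return (H * d * math.sqrt(r * P) + H * P * math.sqrt(r * d)) * math.sqrt(N * H)


def rate_ratio(P, d, r):
    """Closed form of shared_bound / independent_bound: sqrt(r/P) + sqrt(r/d)."""
    return math.sqrt(r / P) + math.sqrt(r / d)


if __name__ == "__main__":
    # Hypothetical problem sizes chosen only for illustration.
    P, d, r, H, N = 100, 100, 4, 10, 1000
    direct = shared_bound(P, d, r, H, N) / independent_bound(P, d, H, N)
    print(f"ratio computed from the two rates: {direct:.4f}")
    print(f"ratio from the closed form:        {rate_ratio(P, d, r):.4f}")
```

With these illustrative values the ratio is roughly 0.4; it shrinks further as $P$ and $d$ grow with $r$ held fixed, and it does not depend on $H$ or $N$.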

Submission Type: Regular submission (no more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=z6okCeq3TQ&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DTMLR%2FAuthors%23your-submissions)

Changes Since Last Submission: We made a mistake in the previous submission and only uploaded the appendix. It was therefore desk rejected.

Assigned Action Editor: ~Nishant_A_Mehta1

Submission Number: 387
