\section{Conclusion and Future Work}
In this paper, we presented the first differentially private algorithms for adversarial MDPs with unknown transitions in both the full-information and bandit settings.
Our designs rely on tighter confidence bounds on the components of the transition function and on novel central and local Privatizers that handle the transition functions and the adversarial losses separately.
By instantiating the proposed Privatizers, we prove that both algorithms achieve near-optimal problem-dependent regret bounds while satisfying JDP or LDP guarantees.
Furthermore, these bounds match the non-private state-of-the-art bounds in the worst case.

A natural direction for future work is to close the gap between the upper and lower bounds on regret.
A similar gap remains open even \emph{without} privacy considerations, so closing it would require new progress in the non-private setting.
We believe that designing refined privacy mechanisms for adversarial MDPs, or establishing a lower bound in the private setting, would also raise interesting technical questions.

Because occupancy-measure-based algorithms are computationally intensive, privatizing policy-optimization-based algorithms, e.g., \cite{luo2021policy,dann23b}, is a promising direction.
In addition, studying MDPs with function approximation \citep{jin2020provably,Sherman2023Improved} has considerable potential for real-world applications.
% , e.g., linear MDPs \citep{jin2020provably}, and kernelized MDPs \citep{chowdhury2019online}.