\section{Related Work}




% First, they consider sequential patrol planning while we address patrol planning for a single timestep, leading us to use different criteria for robustness.


\textbf{Diffusion Models}
Diffusion models have achieved remarkable success across various generative modeling tasks, including image generation~\citep{song2021scorebased, ho2020denoising}, decision-making~\citep{kong2024diffusion, kong2025composite}, and scientific discovery~\citep{gruver2024protein, watson2023novo, kong2023autoregressive}. These models are particularly adept at capturing complex, high-dimensional distributions, making them a powerful tool for diverse applications. Conditional diffusion models extend this capability by integrating contextual information to guide the generative process. By conditioning on textual descriptions, semantic masks, or other relevant features, these models enable tasks such as text-to-image generation~\citep{saharia2022photorealistic}, image-to-image translation~\citep{saharia2021palette}, and time series forecasting~\citep{shen2023non}.

% Recent research has explored the use of diffusion models to model environment transitions in sequential decision-making \cite{ding2024diffusion, jackson2024policy, rigter2024world}—a direction often described as learning a “world model.” However, our work takes a different angle by focusing on the limitations of learned diffusion models to enhance the robustness in the non-seuqntial decision-making setting.

\textbf{Double Oracle for Robust Optimization}
Prior work has framed robust optimization as a two-player zero-sum game~\citep{mastin2015randomized, gilbert2017double}, where the optimizing player selects a potentially randomized feasible strategy, while an adversary chooses problem parameters to maximize regret. The double oracle (DO) algorithm is a standard method for computing equilibria in such games~\citep{mcmahan2003planning,adam2021double} and has been applied to robust influence maximization in social networks~\citep{wilder2017uncharted}, robust patrol planning~\citep{pmlr-v161-xu21a}, robust submodular optimization~\citep{wilder2018equilibrium}, and robust policy design for restless bandits~\citep{killian2022restless}. However, these applications restrict the uncertainty set to a compact interval. In contrast, our problem involves a diffusion model that provides full distribution-level predictions, making the uncertainty set a space of distributions, which introduces new theoretical challenges in applying the double oracle algorithm.
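
For concreteness, the minimax-regret game described above can be sketched in generic notation; the symbols here are illustrative assumptions and not necessarily the notation used elsewhere in this paper. Let $\mathcal{X}$ be the set of feasible decisions, $\Theta$ the uncertainty set of problem parameters, and $f(x, \theta)$ the objective to be maximized. The optimizing player then solves
\[
\min_{p \in \Delta(\mathcal{X})} \; \max_{\theta \in \Theta} \; \mathbb{E}_{x \sim p}\left[ \max_{x' \in \mathcal{X}} f(x', \theta) - f(x, \theta) \right],
\]
where $\Delta(\mathcal{X})$ denotes the set of probability distributions over $\mathcal{X}$. Double oracle solves this game over small subsets of each player's strategies, repeatedly adding each player's best response to the other's current equilibrium strategy until neither best response improves on the current equilibrium value.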


\textbf{Distributionally Robust Optimization}
Our work is also closely related to Distributionally Robust Optimization (DRO)~\citep{rahimian2019distributionally}, which seeks robust solutions by optimizing for the worst case over a set of plausible distributions, known as the ambiguity set. This framework is particularly effective for handling uncertainty and distributional shifts in optimization objectives or constraints. DRO has seen widespread application in areas such as supply chain management~\citep{ash2022distributionally}, finance~\citep{kobayashi2023cardinality}, and machine learning~\citep{madry2018towards, Sagawa*2020Distributionally}, where resilience to data perturbations is critical. However, most existing DRO methods identify a single pure strategy, which is risky in the green security setting because adversaries can learn to anticipate and exploit a deterministic strategy. To address this, we propose a game-theoretic approach that derives a mixed strategy for the defender, leveraging randomness to enhance unpredictability and bolster robustness against adversarial exploitation.
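
As a point of reference, the standard DRO problem can be written in generic notation; the symbols are illustrative assumptions rather than this paper's own. With decision $x \in \mathcal{X}$, random parameter $\xi$, loss $\ell(x, \xi)$, and ambiguity set $\mathcal{U}$ of candidate distributions, it reads
\[
\min_{x \in \mathcal{X}} \; \sup_{P \in \mathcal{U}} \; \mathbb{E}_{\xi \sim P}\left[ \ell(x, \xi) \right].
\]
The minimizer $x$ is a single decision, which corresponds to a pure strategy in game-theoretic terms; a mixed strategy would instead optimize over distributions on $\mathcal{X}$.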

% yield deterministic strategies, which adversaries can anticipate and exploit. \ak{To a pure DRO person who is not familiar with game theory, this statement could be slightly confusing. How about something like, ``However, existing DRO methods typically yield a single robust decision, which in our setting translates to a deterministic strategy for the ranger rather than a mixed strategy. This predictability allows adversaries to anticipate and exploit the ranger’s actions''} To mitigate this, we propose a game-theoretic approach that derives a mixed strategy for the defender, introducing randomness to enhance unpredictability and robustness against adversarial exploitation.

\textbf{Green Security Games}
Green Security Games (GSGs) use game-theoretic frameworks to safeguard valuable environmental resources from illegal activities such as poaching and illegal fishing~\citep{IJCAI15-fei, hasan2022evaluation}. In these settings, a resource-limited defender protects expansive, spatially distributed areas against attackers with bounded rationality. Prior work has focused on forecasting poaching behaviors~\citep{gurumurthy2018exploiting, moore2018ranger}, learning attacker behavior models from data~\citep{nguyen2016capture, gholami2018adversary, xu2020stay}, designing patrol strategies~\citep{IJCAI15-fei, GameSec17-haifeng}, and balancing data collection with poaching detection~\citep{Xu2020DualMandatePM}.

Among existing studies, \citet{pmlr-v161-xu21a} is most closely related to ours, as it also employs a double oracle method to design robust patrolling strategies. However, our approach differs in two key ways. First, we are the first to use diffusion models to predict poaching behavior, addressing the limited expressiveness of the linear approach in \citet{pmlr-v161-xu21a}. Second, while \citet{pmlr-v161-xu21a} focuses on minimax regret with interval-shaped uncertainty sets, our work adopts a distributionally robust optimization objective.

% In the typical setting, one player 



% , and double oracle is one of the common framework used to 
% There have been many literatures taking a game-theoretical review to robust optimization, and double oracle is a popular framework to address that robust optimization. 




% a line of literature which takes a game-theoretical view to robust optimization, and double oracle is one of the popular framework used in those literatures. 

% Double oracle (DO) is an algorithm that is used to solve continuous games with both finite strategy space and continuous strategy space. By iteratively generating strategies for both players, the algorithm is guaranteed to converge to the equilibrium \cite{adam2021double}. Besides robust planning in green security \cite{pmlr-v161-xu21a}, DO has also been used as a framework for robust optimization \cite{gilbert2017double}, \cite{wilder2018equilibrium} and designing robust policies for restless bandit problem \cite{killian2022restless}.




