\section{Method}

\subsection{Creation of Synthetic Data using Blender}

\begin{figure*}[t]
  \centering

  \begin{subfigure}{0.19\linewidth}
    \includegraphics[width=\linewidth]{figures/Render_0004_5_gt.png}
    
    \label{fig:short-a}
  \end{subfigure}
  \hfill
  \begin{subfigure}{0.19\linewidth}
    \includegraphics[width=\linewidth]{figures/Render_0004_FOG_09.png}
    
    \label{fig:short-b}
  \end{subfigure}
  \hfill
  \begin{subfigure}{0.19\linewidth}
    \includegraphics[width=\linewidth]{figures/Render_0004_FOG_08.png}
    
    \label{fig:short-c}
  \end{subfigure}
  \hfill
  \begin{subfigure}{0.19\linewidth}
    \includegraphics[width=\linewidth]{figures/Render_0004_FOG_07.png}
    
    \label{fig:short-d}
  \end{subfigure}
  \hfill
  \begin{subfigure}{0.19\linewidth}
    \includegraphics[width=\linewidth]{figures/Render_0004_FOG_06.png}
    
    \label{fig:short-e}
  \end{subfigure}

  \caption{Example of the sequential fogging procedure. Fog is added to a clean image by gradually increasing the density parameter.}
  \label{fig:fogging}
\end{figure*}

\par We developed a fully randomized open-sea maritime scene generator in Blender. A pinhole camera with a resolution of \(640 \times 512\) pixels, a fixed sensor size, and a \(25\,\mathrm{mm}\) focal length is placed at a randomly sampled height and assigned a random tilt angle. For each scene, a random subset of vessels is drawn from a predefined library and positioned on the ocean surface so that they lie within the camera's field of view and do not intersect one another.
\par The ocean state, including wave height, wind direction, and additional surface parameters, is randomized to produce diverse water appearances and foam patterns.
The background sky is selected at random from an extensive collection of HDR sky textures ranging from clear to heavily overcast conditions. Both the background illumination strength and the azimuthal rotation about the global Z-axis are randomized, which lets the apparent sun position vary freely over \(360^{\circ}\) around the vessels and introduces day/night effects.
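\par For reference, the field of view that constrains vessel placement follows from the standard pinhole relation. The sensor width \(w_s\) and height \(h_s\) are symbols introduced here for illustration; their values are not specified in this section. With focal length \(f = 25\,\mathrm{mm}\), the horizontal and vertical fields of view are
\begin{equation}
  \theta_h = 2 \arctan\!\left(\frac{w_s}{2f}\right), \qquad
  \theta_v = 2 \arctan\!\left(\frac{h_s}{2f}\right).
\end{equation}
Assuming square pixels, the \(640 \times 512\) resolution implies \(h_s / w_s = 0.8\).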
\par To simulate fog, we enclose the scene in a large volumetric container and employ Blender's Mie scattering model. For each ground-truth (clear-weather) scene, we sample an aerosol particle diameter from the log-normal distribution shown in Fig.~\ref{fig:lognormal}, centered around \(5\,\mu\mathrm{m}\) with a range of \(1\)--\(25\,\mu\mathrm{m}\). For this fixed aerosol size, we then generate \(10\) foggy variants using stratified random sampling of the fog density, where density values are drawn from a distribution proportional to \(1/\mathrm{visibility}\). This ensures a controlled progression from light to dense fog while maintaining physical consistency.
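\par As a sketch of this sampling step, under assumptions that go beyond the description above, let \(F\) denote the cumulative distribution function of the fog density \(\rho\) and let \(K = 10\) be the number of foggy variants. Stratified sampling draws one value per equal-probability stratum,
\begin{equation}
  u_k \sim \mathcal{U}\!\left(\frac{k-1}{K}, \frac{k}{K}\right), \qquad
  \rho_k = F^{-1}(u_k), \qquad k = 1, \dots, K,
\end{equation}
so that \(\rho_1 \le \rho_2 \le \dots \le \rho_K\) because \(F^{-1}\) is non-decreasing. If the density is further assumed to be proportional to the extinction coefficient \(\beta\), Koschmieder's relation links it to the visibility \(V\) through
\begin{equation}
  \beta = \frac{-\ln \varepsilon}{V},
\end{equation}
where \(\varepsilon\) is a contrast threshold; the common choices \(\varepsilon = 0.05\) and \(\varepsilon = 0.02\) give \(\beta \approx 3.0 / V\) and \(\beta \approx 3.912 / V\), respectively. The symbols \(F\), \(u_k\), \(\beta\), and \(\varepsilon\) are introduced for illustration only.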

\begin{figure}[t]
  \centering
   \includegraphics[width=1.0\linewidth]{figures/Lognormal_fog_diameter.png}

   \caption{The log-normal distribution used for sampling the aerosol particle diameter. It is centered around \(5\,\mu\mathrm{m}\), spans \(1\)--\(25\,\mu\mathrm{m}\), and has \(\sigma = 0.5\).}
   \label{fig:lognormal}
\end{figure}
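\par Written out, a log-normal density with location parameter \(\mu\) and shape parameter \(\sigma\) for the particle diameter \(d\) (in micrometers), restricted to the sampled range, is
\begin{equation}
  p(d) \propto \frac{1}{d\,\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(\ln d - \mu)^2}{2\sigma^2}\right), \qquad 1 \le d \le 25 .
\end{equation}
With \(\sigma = 0.5\), and assuming that ``centered'' refers to the median, \(\mu = \ln 5 \approx 1.61\). Under this reading the range limits are symmetric in log space, since \(\ln 25 - \ln 5 = \ln 5 - \ln 1 = \ln 5 \approx 3.2\,\sigma\). If the mode were meant instead, the corresponding value would be \(\mu = \ln 5 + \sigma^2 \approx 1.86\).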

\par In contrast to most existing synthetic fog datasets \cite{reside,dehamer,dcp_kaiming_he}, which rely on the classical atmospheric scattering model and therefore simulate only single-scattering events along each ray, our dataset incorporates full volumetric Mie scattering with multiple scattering events per ray. Light thus undergoes repeated scattering interactions as it travels through the fog volume, which causes significantly stronger attenuation and yields darker, denser, and more realistic fog for the same aerosol concentration. This behavior closely reflects real-world light transport in maritime environments, where multiple scattering dominates visibility loss. Consequently, our dataset provides a more physically faithful representation of fog, improving the reliability of supervised learning for defogging and visibility-restoration tasks. Absorption effects are currently neglected.
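\par For reference, the classical atmospheric scattering model mentioned above expresses the observed hazy image \(I\) at pixel \(x\) as
\begin{equation}
  I(x) = J(x)\, t(x) + A \left(1 - t(x)\right), \qquad t(x) = e^{-\beta\, z(x)},
\end{equation}
where \(J\) is the haze-free scene radiance, \(A\) the global atmospheric light, \(t\) the transmission, \(\beta\) the scattering coefficient, and \(z(x)\) the scene depth. The notation follows common usage in the dehazing literature and is given for comparison only; our renderings are not produced with this closed-form expression.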
\par For every scene instance, we additionally output a depth map, a segmentation map, and a JSON file containing all metadata required for exact scene reconstruction. All ground-truth and foggy images are saved as \(8\)-bit RGBA PNG files, while depth and segmentation maps are stored as \(32\)-bit EXR files.
Each iteration therefore yields one clear ground-truth scene together with \(10\) physically accurate fog realizations of identical geometry, lighting, ocean conditions, and camera parameters. This process enables the large-scale generation of paired (GT, foggy) image sets, as shown in Fig.~\ref{fig:fogging}.
\par Ultimately, the goal is to produce thousands of such samples to form a comprehensive maritime dataset suitable for training and evaluating defogging and visibility-restoration models.


\subsection{Curriculum Training}

\begin{figure*}[t]
  \centering
   \includegraphics[width=1.0\linewidth]{figures/curriculum_figure.png}

   \caption{The proposed curriculum scheme. It consists of five stages, with each stage introducing more synthetic samples of increasing haze density.}
   \label{fig:curriculum}
\end{figure*}

\par We adopt a state-of-the-art dehazing architecture \cite{dehazeformer} as the backbone of our framework and introduce a five-stage curriculum learning strategy that exploits the monotonic fog-density structure of our synthetic dataset. In the current experiments, one new fog density level is introduced at each stage; the end goal is to introduce two per stage, for a total of \(10\) fog density levels. The model architecture remains unchanged; the curriculum acts solely at the data level, enabling staged exposure to increasingly severe fog conditions.
\par Training directly on heavily fogged images can lead to unstable gradients and suboptimal convergence, as dense haze severely obscures scene structure and attenuates high-frequency details. For these experiments, we use five fog levels per ground-truth scene, ranging from light to extremely dense haze. This provides a natural ordering of difficulty, allowing the model to first learn basic restoration behavior under mild degradations before progressively addressing more complex scattering effects.
\par The curriculum consists of five stages, as shown in Fig.~\ref{fig:curriculum}. Stage 1 uses only the lightest fog level. Subsequent stages incrementally add fog levels while retaining all previous ones, culminating in Stage 5, in which all five fog levels are included. Each training iteration draws uniformly from the available levels, and all samples remain in the form of independent (foggy image, ground-truth) pairs. No multi-input fusion or joint conditioning on multiple fog levels is used.
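\par The stage-wise sampling can be summarized as follows, using notation introduced here for illustration. Let the fog levels be indexed \(\ell = 1, \dots, 5\) from lightest to densest. Stage \(s\) trains on the level set \(\mathcal{L}_s\), and each iteration draws a level uniformly from it:
\begin{equation}
  \mathcal{L}_s = \{1, \dots, s\}, \qquad
  P(\ell \mid s) = \frac{1}{s} \quad \mathrm{for}\ \ell \in \mathcal{L}_s, \qquad s = 1, \dots, 5 .
\end{equation}
The newly introduced level therefore accounts for an expected fraction \(1/s\) of the samples in stage \(s\), decreasing from \(1\) in Stage 1 to \(1/5\) in Stage 5. With two new levels per stage, the analogous set would be \(\mathcal{L}_s = \{1, \dots, 2s\}\).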
\par During Stage 1, the model learns fundamental contrast enhancement and low-level feature extraction. Stage 2 introduces moderate haze, prompting the network to generalize beyond lightly degraded inputs. In Stage 3, medium fog levels introduce stronger scattering and color distortions, requiring the model to infer missing details. Stage 4 further increases difficulty by adding heavy fog, while Stage 5 exposes the network to the full distribution of degradations. This progression maintains stability by anchoring the training signal with easier examples even as more challenging ones are introduced.
\par Training hyperparameters, including the optimizer and learning-rate schedule, remain fixed to ensure that improvements stem from the curriculum itself rather than modified optimization dynamics.