\section{Experimental Setup}
\label{appendix:setup}
As part of our experiments, we explored several design variations for generating semi-synthetic chest X-rays. We varied the following:
\begin{itemize}
    \item Text prompts: Class labels directly (e.g., ``Atelectasis'') or sampled class-related phrases (see \sectionref{subsec:3_text} and \appendixref{appendix:prompts}).
    \item Mask blurring: No blur and six blurring alternatives using generalized Gaussian filters with shape parameter $\beta$ and a standard deviation determined by the size of the mask's bounding box and a factor $r$, i.e., \[\bm{\sigma} = \lfloor 0.5 \times \mathrm{size}(\mathrm{bounding\ box}) \rfloor / r.\] Specifically, we tested $\beta = 2.0$ with $r \in \{0.5, 1.0, 2.0\}$, and $r = 0.1$ with $\beta \in \{4.0, 6.0, 8.0\}$.
    \item Mask conditioning extent: $85\%$, $90\%$, $95\%$, and $100\%$ of the total number of steps of the reverse diffusion process.
    \item Hyperparameters of the diffusion process: 
    \begin{itemize}
        \item Number of steps: $1.0$, $1.33$, and $2.0$ times the default number of steps.
        \item Classifier-free guidance scale: RadEdit with default, $12.0$, $15.0$, and $19.0$; RoentGen with default, $6.0$, $7.5$, and $10.0$.
        \item Strength: Default, $0.9$, and $1.0$.
        \item Negative conditioning: $\emptyset$ (no negative prompt), and ``No acute cardiopulmonary process.''
    \end{itemize}
    Default settings were used for all other underlying model hyperparameters.
\end{itemize}
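\vspace{5pt}
\noindent\textbf{Worked example of the blur scale.} To make the mask-blurring setting concrete, the following sketch evaluates the formula for $\bm{\sigma}$ on a hypothetical bounding box. It assumes that $\mathrm{size}(\cdot)$ is measured per axis in pixels and that the filter follows one common generalized Gaussian parametrization; neither assumption is confirmed by the setup above, and the box width $w = 101$ is chosen purely for illustration.
\[
    k(x) \propto \exp\!\left(-\frac{1}{2}\left|\frac{x}{\sigma_x}\right|^{\beta}\right),
    \qquad
    \sigma_x = \frac{\lfloor 0.5 \times w \rfloor}{r}.
\]
For $w = 101$, $\lfloor 0.5 \times 101 \rfloor = 50$, so
\[
    \sigma_x = 100 \ (r = 0.5), \qquad \sigma_x = 50 \ (r = 1.0), \qquad \sigma_x = 25 \ (r = 2.0).
\]
Under this parametrization, a larger $r$ yields a narrower kernel and hence a sharper mask boundary. Setting $\beta = 2.0$ recovers the ordinary Gaussian $\exp(-x^2 / (2\sigma_x^2))$, while larger $\beta$ flattens the top of the kernel and steepens its falloff; in that case $\sigma_x$ acts as a scale parameter rather than the exact standard deviation.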

\vspace{5pt}
\noindent\textbf{Implementation details.} \emph{SemiSynCXR}'s editing component is built upon the Stable Diffusion Inpainting pipeline from the HuggingFace \texttt{Diffusers} library~\cite{diffusers}. For the underlying models, RoentGen weights were provided by the authors (version dated December 31, 2023), while RadEdit weights were sourced directly from the HuggingFace Hub. To mitigate bottlenecks during semi-synthetic image generation, we set the numerical precision to \texttt{bfloat16}, which provides a favorable balance between memory efficiency and numerical stability. Experiments were conducted on an NVIDIA RTX A6000 GPU (48\,GB VRAM) and on the compute cluster of the Chair of AI in Healthcare and Medicine, specifically on its nodes equipped with NVIDIA A40 GPUs.