\section{Existing Chest X-Ray Datasets for Radiological Finding Localization and Segmentation}
\label{appendix:datasets}

\tableref{tab:91_datasets} provides an overview of publicly available chest X-ray (CXR) datasets for the localization and segmentation of radiological findings. Datasets that focus on a single finding, such as those dedicated exclusively to pneumothorax or pulmonary nodules, have been excluded. Examples of single-finding datasets omitted from this list include \href{https://bimcv.cipf.es/bimcv-projects/bimcv-covid19/}{BIMCV-COVID19+}, \href{https://github.com/lindawangg/COVID-Net}{COVIDx CXR Dataset (COVID-Net)}, \href{https://pubmed.ncbi.nlm.nih.gov/31257384}{GRAZ+}, \href{https://www.rsna.org/rsnai/ai-image-challenge/rsna-pneumonia-detection-challenge-2018}{RSNA Pneumonia Detection Challenge}, \href{https://github.com/asheshjain399/Pneumothorax-segmentation-CANDID-PTX}{CANDID-PTX}, \href{https://wiki.cancerimagingarchive.net/display/Public/PLCO}{PLCO Dataset (LIDC extension)}, and \href{https://www.kaggle.com/c/siim-acr-pneumothorax-segmentation}{SIIM-ACR Pneumothorax Segmentation}.

\begin{sidewaystable}
    \centering
    \caption{Overview of publicly available chest X-ray localization and segmentation datasets.}
    \vspace{2mm}
    \label{tab:91_datasets}
    \small
    \begin{tabular}{p{0.10\textwidth} p{0.17\textwidth}p{0.16\textwidth}p{0.12\textwidth}p{0.14\textwidth}p{0.21\textwidth}}
        \toprule
        \textbf{Dataset} & \textbf{Targets} & \textbf{Size (No. CXRs)} & \textbf{Origin (Country/Dataset)} & \textbf{Availability} & \textbf{Notes} \\ \midrule
        \href{https://www.kaggle.com/datasets/nih-chest-xrays/data}{NIH ChestX-ray14} & Disease labels (14), bounding boxes (8; subset) & $112\,120$ frontal views; $880$ with bounding boxes & United States & Kaggle, Google Cloud, and NIH download site & NLP-mined disease labels from radiology reports.\\ \midrule
        \href{https://physionet.org/content/vindr-cxr/1.0.0/}{VinDr-CXR} & Disease labels (6), bounding boxes (22; subset) & $>100\,000$; $18\,000$ with bounding boxes ($15$k train and $3$k test) & Vietnam & PhysioNet & Training set labeled by three radiologists; test set labeled by consensus of five radiologists.\\ \midrule
        \href{https://physionet.org/content/vindr-pcxr/1.0.0/}{VinDr-PCXR} & Disease labels (15), bounding boxes (36) & $9\,125$ ($7\,728$ train and $1\,397$ test) & Vietnam & PhysioNet & Labeled by experienced radiologists. Pediatric dataset.\\ \midrule
        \href{https://physionet.org/content/ms-cxr/0.1/}{MS-CXR} & Bounding boxes (8) with descriptions & $1\,162$ & MIMIC-CXR & PhysioNet & Labels verified by board-certified radiologists.\\ \midrule
        \href{https://ai.nejm.org/doi/full/10.1056/AIdbp2401120#ap2}{PadChest-GR} & Bounding boxes (24) with descriptions & $4\,555$ frontal views & PadChest & \href{https://bimcv.cipf.es/bimcv-projects/padchest-gr/}{BIMCV} &  Labeled by a team of 14 radiologists. Multilingual dataset (English and Spanish).\\ \midrule
        \href{https://www.nature.com/articles/s41467-024-45599-z}{CXR-AL14} & Bounding boxes (14) & $165\,988$ & China & Dong Zhang\footnote{hszhangd@tmmu.edu.cn} or \url{cxr-al14.top} & Human-in-the-loop labeling.\\ \midrule
        \href{https://physionet.org/content/reflacx-xray-localization/1.0.0/}{REFLACX + LATTE-CXR} & Eye-tracking with dictated reports, ellipses (localization), anatomical bounding boxes & $3\,032$ frontal views & MIMIC-CXR & PhysioNet & Eye-tracking and dictated reports from five radiologists.\\ \midrule
        \href{https://github.com/Deepwise-AILab/ChestX-Det-Dataset}{ChestX-Det Dataset} & Bounding boxes (13), segmentation masks (13) & $3\,578$ & NIH ChestX-ray14 & GitHub & Labeled by board-certified radiologists.\\ \midrule
        \href{https://stanfordaimi.azurewebsites.net/datasets/23c56a0d-15de-405b-87c8-99c30138950c}{CheXlocalize} & Segmentation masks (10), keypoints & $902$ & CheXpert & Stanford AIMI & Labeled by board-certified radiologists.\\ \bottomrule
    \end{tabular}
\end{sidewaystable}