\section{Datasets}\label{sec:datasets}
To test the framework presented in \sectionref{sec:proposed-framework}, two datasets were created.
The first is a collection of ankle radiographs stored as DICOM images with associated metadata. The second, which to our knowledge did not previously exist in this or a similar form, contains radiographs labeled by radiologists according to their diagnostic quality based on anatomical features.
Both datasets contain radiographs from five different X-ray machines.

\begin{figure}[htbp]
	\begin{minipage}[t]{0.57\textwidth}
		% Caption and label go in the first arguments and the figure contents
		% go in the last argument
		\floatconts{fig:datasets:diagnostic-quality-dataset:relevant-structures}{\caption{
			In \protect\subfigref{fig:datasets:diagnostic-quality-dataset:relevant-structures:ap}, the most relevant anatomical structures in the \textit{AP} radiographic view are highlighted. These are the joint gaps between the medial malleolus and the talus and between the lateral malleolus and the talus. In \protect\subfigref{fig:datasets:diagnostic-quality-dataset:relevant-structures:lat}, the joint space between the distal tibia and the talus is highlighted as the most relevant structure for the \textit{LAT} view.
		}}
		{
			\subfigure[]{\label{fig:datasets:diagnostic-quality-dataset:relevant-structures:ap}\includegraphics[height=4.5cm]{images/annotation-example-ap.png}}
			\subfigure[]{\label{fig:datasets:diagnostic-quality-dataset:relevant-structures:lat}\includegraphics[height=4.5cm]{images/annotation-example-lat.png}}
		}
	\end{minipage}
	\hfill
	\begin{minipage}[t]{0.41\textwidth}
		% Caption and label go in the first arguments and the figure contents
		% go in the last argument
		\floatconts{fig:datasets:diagnostic-quality-dataset:quality-example}{\caption{
			\protect\subfigref{fig:datasets:diagnostic-quality-dataset:quality-example:ap} shows an example ROI of a radiograph in the \textit{AP} view with perfect alignment in the upper row and strong misalignment in the lower row. \protect\subfigref{fig:datasets:diagnostic-quality-dataset:quality-example:lat} shows the same for the \textit{LAT} view.
		}}
		{
			\subfigure[]{\label{fig:datasets:diagnostic-quality-dataset:quality-example:ap}\includegraphics[height=4.5cm]{images/example-quality-ap-1-3-v.png}}
			\subfigure[]{\label{fig:datasets:diagnostic-quality-dataset:quality-example:lat}\includegraphics[height=4.5cm]{images/example-quality-lat-1-3-v.png}}
		}
	\end{minipage}
\end{figure}

\subsection{Weakly Labeled Dataset for Recognition of the Radiographic View}\label{sec:datasets:weakly-labeled-dataset}
We used a dataset of 26542 ankle radiographs provided by the University Hospital Schleswig-Holstein, Campus Lübeck.
From these radiographs, we extracted labels for the radiographic view (\textit{LAT} or \textit{AP}) by keyword matching on the metadata.
The resulting dataset contains roughly 12000 radiographs for each view.
Since the metadata is mostly entered manually and its content is not standardized, we assume that not all labels are accurate.
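
The keyword matching described above could look roughly like the following sketch. It is not the procedure actually used to build the dataset: the keyword lists in \texttt{VIEW\_KEYWORDS}, the function name \texttt{extract\_view\_label}, and the representation of the metadata as a plain dictionary of free-text fields are all assumptions made for illustration. A radiograph receives a label only if the keywords of exactly one view are found.

```python
import re

# Hypothetical keyword lists. The keywords actually used are not given in
# the text and would depend on the metadata conventions of the hospital.
VIEW_KEYWORDS = {
    "AP": {"ap", "anteroposterior"},
    "LAT": {"lat", "lateral"},
}


def extract_view_label(metadata):
    """Return "AP", "LAT", or None for a dict of free-text metadata fields."""
    # Join all field values, lowercase them, and drop dots so that
    # abbreviations such as "a.p." and "lat." become "ap" and "lat".
    text = " ".join(str(value) for value in metadata.values()).lower()
    tokens = set(re.findall(r"[a-z]+", text.replace(".", "")))

    # Collect every view for which at least one keyword occurs as a whole token.
    matches = [view for view, keywords in VIEW_KEYWORDS.items() if tokens & keywords]

    # Label only unambiguous cases; no match or both views give no label.
    return matches[0] if len(matches) == 1 else None
```

For example, a record whose (assumed) description field reads \texttt{Ankle a.p.} would be labeled \textit{AP}, while a record mentioning both views or neither would stay unlabeled. Matching on whole tokens rather than substrings avoids some false positives, but labels obtained this way remain weak for the reasons given above.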


\subsection{Diagnostic Quality Dataset}
\label{sec:datasets:diagnostic-quality-dataset}
To learn the relationship between radiographs and their diagnostic quality, an annotated dataset is needed. To create such a dataset, four radiologists labeled 950 ankle radiographs, 475 each for the \textit{LAT} and \textit{AP} views.

The radiologists determined which objective criteria a radiograph of an ankle has to fulfill to be of high diagnostic quality. One important criterion, for instance, is the complete visibility of the joint gap between the medial malleolus and the talus. High diagnostic quality is a prerequisite for the radiologist to make a correct diagnosis.
According to these criteria, each radiograph was labeled by each radiologist as \textit{1} if it fulfilled the criteria perfectly, \textit{2} if it fulfilled them partly, and \textit{3} if the criteria were not met and a new radiograph would have to be taken. To determine whether a radiograph can be used for a diagnosis, classes \textit{1} and \textit{2} were grouped under the label \textit{diagnostic}, and class \textit{3} was labeled \textit{not diagnostic}. If the labels assigned to a radiograph differed greatly, the radiologists held a consensus meeting.
Of the \(475 \cdot 4\) labels assigned for the \textit{AP} radiographs, 37\% are \textit{1}s, 53\% \textit{2}s and 10\% \textit{3}s.
For the \textit{LAT} view 17\% of the assigned labels are \textit{1}s, 55\% \textit{2}s and 28\% \textit{3}s.
Examples of the three classes are shown in the Appendix in \figureref{fig:appendix:example:ap} (a--c) for the \textit{AP} view and \figureref{fig:appendix:example:lat} (a--c) for the \textit{LAT} view.
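
The grouping of the three classes into two labels can be written down as a short sketch. The function names \texttt{to\_binary\_label} and \texttt{needs\_consensus} are hypothetical. In particular, the text does not state when labels count as differing greatly, so the threshold \texttt{max\_spread} (here: more than one class apart) is an assumption and not the rule the radiologists applied.

```python
def to_binary_label(grade):
    """Map a quality class (1, 2, or 3) to the grouped label."""
    if grade in (1, 2):
        return "diagnostic"
    if grade == 3:
        return "not diagnostic"
    raise ValueError(f"unknown quality class: {grade!r}")


def needs_consensus(grades, max_spread=1):
    """Flag a radiograph whose labels differ greatly.

    Assumption: "greatly" means that the highest and lowest class differ
    by more than max_spread. The text does not define the actual rule.
    """
    return max(grades) - min(grades) > max_spread
```

Under this assumed threshold, the four labels \texttt{[1, 2, 2, 1]} would not trigger a consensus meeting, whereas \texttt{[1, 3, 2, 2]} would, because one radiologist considered the radiograph perfect and another considered it unusable.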

Additionally, each of the 950 radiographs was labeled with a ROI. As described in \sectionref{subsec:proposed-framework:extraction-of-the-roi}, only a fraction of the radiograph is relevant for the diagnostic quality. Therefore, the ROI was labeled as a square containing only the most relevant information, as shown in \figureref{fig:datasets:diagnostic-quality-dataset:quality-example}. \figureref{fig:experiments-and-results:roi-segmentation:example-roi}, which shows examples of ground-truth ROI labels, illustrates that the size of each ROI depends strongly on the image content.
