% abstract

%problem domain generalization
%Medical images vary in appearance across acquisition scenarios, which can degrade the performance of machine learning models when the data seen at test time differ from the data used for training. Finding ways to improve model generalization under these shifts is known as the domain generalization problem.
% SOTA in OD/OC
%The problem of domain generalization in fundus image analysis is a topic of wide study, particularly in the segmentation of the optic disc (OD) and optic cup (OC). However, it has been observed that most models focus solely on the OD area when training the model. This results in reduced image variability and the need to always crop the fundus image for OD and OC segmentation, which has been a significant issue in clinical practice.
%OUR WORK
%In this article, we have conducted a detailed analysis of the Noisy Teacher-Student semi-supervised approach to enhance domain generalization in OD and OC segmentation. To accomplish this, we initially used a set of labeled datasets to train a baseline Teacher model. We then utilized a large unlabeled dataset, which was pseudo-labeled by the Teacher model, to train a robust Student model capable of generalization.
% HOW OUR WORK IMPROVES ON THE SOTA
%We improve the generalization over multiple datasets from different scenarios like different acquisition devices, changes in patient demographics, or the presence of unseen lesions. Our proposed models improve the generalization compared to those present in the literature. For a fairer comparison, we train those models on our same labeled datasets. %ALSO BEEN CROPPED WE HAD SIMILAR RESULTS 
%HOW WE EVALUATE
%We evaluated our approach on a total of eight datasets, two known and six unknown. Our findings suggest that using a large unlabeled dataset can significantly improve generalization to unknown datasets. Furthermore, the overall improvement in OD and OC segmentation also leads to an improvement in glaucoma classification.



% ================

INTRO


%To address this problem, domain generalization techniques have been studied in the literature \cite{yoon2023domain_generalization_review}. The goal is to train a model that can perform well on unseen domains, without requiring domain-specific information or adaptation. Domain generalization is challenging because different domains may have different distributions, features, or noise levels, and the model needs to learn a robust and invariant representation that allows generalization across all of them. Current strategies for this goal include simple, straightforward approaches such as data augmentation \cite{lyu_AADG}, or more complex algorithms based on domain alignment \cite{chen_SFDA-DPL} or meta-learning \cite{hu2023map}.

% In this paper we propose a simple semi-supervised learning approach to achieve domain generalization in uncropped retinal images based on the Noisy Student framework.
%In particular, our approach uses a Student model trained on a massive dataset of images pseudo-labelled using a supervised Teacher counterpart.
% This allows the network to capture additional sources of variability during training, while retaining the original cues and patterns used by the Teacher through the weak annotations.




%--------------OLD
%Glaucoma
%Glaucoma stands as the second most prevalent cause of avoidable blindness globally \cite{veena2020glaucoma}. In its most common form, it is characterized by an optic neuropathy produced by pressure-induced damage to the optic nerve. This results in the retrograde degeneration of ganglion cells in the retina, and a progressive loss of vision \cite{weber2008effects}. 
%Being asymptomatic, its early detection and treatment becomes crucial to avoid further damage and prevent blindness. Although currently there are no cost-effective techniques for diagnosing glaucoma, features describing the shape and structure of the optic disc (OD) and optic cup (OC) such as the vertical cup-to-disc ratio (vCDR) or the Disc Damage Likelihood Scale (DDLS) are frequently used as disease predictors \cite{joshua2019segmentation}. 

%problem and existing methods to address it
%Characterizing the optic nerve head (ONH) usually requires segmenting both the OD and the OC in advance. A significant effort has been made to automate this task using deep learning \cite{alawad2022machine_review,yoon2023domain_generalization_review,moris2023assessing}. However, current models suffer from drops in performance when applied to samples that differ from the ones used for training and design \cite{srivastava2021comparative}.
%when the performance of the model depends on the use cases they are applied in. 
%GO DIRECTLY TO DOMAIN GENERALIZATION??
%This limitation arises from expected clinical variations, such as using different acquisition devices, changes in patient demographics or scan quality, or due to the presence of unseen lesions \cite{yoon2023domain_generalization_review}. Several techniques have been introduced to mitigate this issue \cite{yoon2023domain_generalization_review}. The most common and straightforward approach is to retrain (or fine-tune) the model with new data samples from the unseen domains \cite{wang2022generalizing_review}. However, this involves manually annotating every possible test set for fine-tuning, which is clinically impractical, costly, and time-consuming \cite{fort2012cost}. Alternatively, other authors have proposed to avoid this by applying domain adaptation techniques \cite{wang_BEAL,chen_SFDA-DPL,yang2021adversarial}. These aim to reduce the gap between the source training set and the unseen target domains. % e.g. by aligning their feature spaces \cite{zhao2022dual}, or generating synthetic data \cite{thambawita2022singan}. 
%However, this approach requires knowing in advance all possible targets and applying domain adaptation to each of them. Furthermore, this is usually achieved through generative models, which can hallucinate features that can degrade performance during testing \cite{liu2022cada,zhang2022unsupervised}. 

%\begin{figure}[t!]
%     \centering
%     \includegraphics[width=\textwidth]{Images/esquematico_v3.png}
%     \label{fig: esquematic}
%     \caption{Optic Disc (OD) and Cup (OC) segmentation results obtained with a baseline and our proposed approach on known and unknown domains. Each example includes the corresponding Dice values. %Optic Disc (OD) segmentation is denoted in blue, while Optic Cup (OC) segmentation is highlighted in green. The ground truth is represented by the dotted line. Additionally, the image includes the Dice metric, providing a quantitative assessment of segmentation accuracy.}
%     }
%    \label{fig:teaser}
%\end{figure}

% Domain generalization SOTA
%The problem of retaining performance on new unseen datasets is referred to as domain generalization, being an active area of research for many machine learning-based applications \cite{yoon2023domain_generalization_review}. The goal is to train a model that can perform well on unseen domains, without requiring domain-specific information or adaptation. Domain generalization is challenging because different domains may have different distributions, features, or noise levels, and the model needs to learn a robust and invariant representation that allows generalization across all of them. Current strategies for this goal include simple, straightforward approaches such as data augmentation \cite{lyu_AADG}, or more complex algorithms based on domain alignment \cite{chen_SFDA-DPL} or meta-learning \cite{hu2023map}.

%methodology to use
%Semi-supervised learning (SSL) combines large unlabeled datasets with labeled samples to improve results without extra annotation effort \cite{jiao2023semisupervised_review}. One popular SSL technique is the Noisy Student approach \cite{xie2020noisyStudent}, which applies a supervised Teacher model to annotate unlabeled samples, and then trains a new model (the Student) using a combination of manually and pseudo-labeled images, with increased noise (e.g. using data augmentation, dropout or stochastic depth) \cite{fredriksen2022teacher,koehler2022noisy}. This idea has been previously explored to improve domain generalization in computer vision problems such as image classification \cite{lin2021semi,zhang2022semi,sharifi2020domain}, based on the assumption that exploiting a sufficiently variable set of unlabeled samples might help the model to capture enough variability to improve performance in unseen domains. To the best of our knowledge, however, this simple technique has not been extensively explored as a potential tool for reaching better domain generalization in medical image analysis \cite{yoon2023domain_generalization_review}.% Most of the studies focus on feature alignment which uses more complex methodologies.

%In this paper, we explore the feasibility of using a straightforward Noisy Student approach to achieve domain generalization in OD/OC segmentation from fundus images. We experimentally observed that this simple technique improves segmentation performance on multiple and diverse unseen datasets, achieving better results than other state-of-the-art approaches in the field. Furthermore, we empirically demonstrate that the resulting segmentations are more consistent with the expected anatomical shape of the ONH, yielding more accurate vCDR values for glaucoma detection.

% In this paper.....
%In this paper, we propose a semi-supervised learning method based on a teacher-student framework to achieve domain generalization for optic disc and cup segmentation in fundus images. Our method consists of training a student model with a large unlabeled dataset, guided by a teacher model that is trained with labeled datasets. Our goal is to enhance the student model’s robustness and adaptability to new unseen domains by exploiting the abundant unlabeled data. We demonstrate that our method can effectively improve the segmentation performance on unseen datasets that have different characteristics from the training datasets. Moreover, we compare our method with other state-of-the-art methods that are tailored for this task and show that our method surpasses them. Additionally, we show that our method can also improve glaucoma diagnosis by providing more accurate vertical cup-to-disc ratio (vCDR) values on unseen datasets.



% =====================

METHODS


%dataset
%
%\subsection{Problem formulation}




%Since the student model lacks knowledge infusion during its training, it is subsequently fine-tuned with the dataset originally used to train the baseline model. This refined model, denoted Teacher 2, is then used to relabel the unlabeled data and train a new student, and repeating this process yields additional iterations of both teacher and student models.
%
%This iterative process aims to leverage the strengths of each model, progressively enhancing performance and generalization capabilities through multiple teacher-student interactions.