\section{Introduction}
\label{sec:introduction}

% CRC + broad intro
\ac{CRC} is one of the most common cancers worldwide, and its understanding through computational pathology techniques can significantly improve the chances of effective treatment~\cite{smit2020role} by refining disease prognosis and assisting pathologists in their daily routine.
The data used in computational pathology most often consist of \ac{HE}-stained \acp{WSI}~\cite{hegde2019similar,lu2020data} and \acp{TMA}~\cite{nguyen2021classification}.

% Self-supervised
Although fully supervised deep learning models have been widely used for a variety of tasks, including tissue classification~\cite{kather2019predicting}
and semantic segmentation~\cite{qaiser2019fast}, in practice it is time-consuming and expensive to obtain fully labeled data because annotation requires expert pathologists.
This hinders the applicability of supervised machine learning models to real-world scenarios. Self-supervised learning was proposed to address these limitations.
It involves a two-step training scheme in which ``\textit{data creates its own supervision}''~\cite{abbeel2020unsup}, so that rich features can be learned from structured unlabeled data.
Applications of this approach in computational pathology include survival analysis~\cite{abbet2020divide} and \ac{WSI} classification~\cite{li2020dual}.

% Domain adaptation

In addition, techniques such as stain normalization algorithms~\cite{macenko2009method} and \ac{UDA} methods have been developed with the aim of improving the classification of heterogeneous \acp{WSI}. \ac{UDA} methods address this heterogeneity by learning from a rich source domain together with a label-free target domain, so that the model performs well on the target domain at inference time.
DANN~\cite{ganin2015unsupervised}, for example, uses gradient reversal layers to learn domain-invariant features. Self-Path~\cite{koohbanani2020self} combines the DANN approach with self-supervised auxiliary tasks, such as hematoxylin prediction, to improve stability.
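As a brief illustration, the gradient reversal layer can be written as follows. The symbols $R_\lambda$, $\mathbf{x}$, and $\lambda$ follow the usual presentation of DANN and are not notation defined in this work. The layer acts as the identity in the forward pass and multiplies the incoming gradient by $-\lambda$ in the backward pass, where $\lambda > 0$ is a trade-off weight:
\[
R_\lambda(\mathbf{x}) = \mathbf{x}, \qquad \frac{\partial R_\lambda}{\partial \mathbf{x}} = -\lambda \mathbf{I}.
\]
Placed between the feature extractor and the domain classifier, this layer drives the feature extractor to maximize the domain classification loss while the domain classifier minimizes it, which encourages features that do not separate the source and target domains.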

Another example is CycleGAN~\cite{zhu2017unpaired}, which takes advantage of adversarial learning to cyclically map images between the source and target domains.
However, adversarial approaches can fall short because they do not consider task-specific decision boundaries and only try to distinguish features as coming from either the source or the target domain~\cite{saito2018maximum}.

A further issue is that most methods treat domain adaptation as a closed-set scenario, which assumes that all target samples belong to a class present in the source domain, even though this is often not the case.
To overcome this, OSDA~\cite{saito2018open} proposes an adversarial open-set domain adaptation approach, where the feature generator has the option to reject mistrusted target samples by assigning them to an additional class.
Another recent work, SSDA~\cite{xu2019self}, uses self-supervised domain adaptation methods that combine auxiliary tasks, such as image rotation or jigsaw puzzle solving, with an adversarial loss and batch normalization calibration across source and target domains.

% Contribution
In this work, we propose a label-efficient framework called \acf{SRA} for tissue type recognition in histological images and attempt to overcome the above-mentioned issues by combining self-supervised learning approaches with \ac{UDA}. 
We present an entropy-based approach that progressively learns domain-invariant features, thus making our model more robust to class definition inconsistencies as well as to the presence of unseen tissue classes when performing domain adaptation.
\ac{SRA} is able to accurately classify and segment tissue types in \ac{HE}-stained images, which is an important step for many downstream tasks.
Our proposed method achieves this by making use of a few labeled open-source datasets as well as unlabeled data, which are abundant in digital pathology, thereby reducing the annotation workload for pathologists.
We show that our method outperforms previous domain adaptation approaches in a few-label setting, and we demonstrate its potential for clinical application in the diagnostics of \ac{CRC}.

% Our contributions are as follows: 
% (i) we present that self-supervised learning techniques are well applicable to this task in a few-label setting, which alleviates tedious manual labeling and allows us to benefit from unlabeled data that are abundant in digital pathology, 
% (ii) we present an easy-to-hard learning procedure based on entropy quantification to measure the similarity prediction uncertainty of samples between domains, 
% (iii) we show that \ac{SRA} can benefit from open-source data, further guaranteeing its effectiveness for clinical application, and 
% (iv) we show that our method outperforms previous domain adaptations approaches in a few-label setting.



